Hello, EmberConf. If we have not already met, I'm Godfrey. You can find me on the internet as chancancode. And as of a few weeks ago, I'm also now officially a YouTuber, so you can find me on chancancode.tv, where you can watch me build completely useless stuff on the computer. If that sounds interesting to you, please do subscribe to the YouTube channel.

And for my internet followers, I just want to point out that I do take your feedback very seriously. Last year, I gave a very, very serious talk with Yehuda here. And that was the feedback I got. So this year, I decided this software thing is not working out for me after all. But then Leah came to me, and she was like, hey, we need someone to entertain the crowd after lunch, but we couldn't afford a comedian. So I was like, sure, why not? I'll take it. And then I was thinking, you know what would be hilarious? What if I write an abstract for a fake talk and put it on the schedule? Then no one would see this coming, right? So here we are. You didn't really think there was going to be a talk after lunch, right?

OK, I do have a talk. I mean, I should probably say I had a talk, because this time I actually prepared for this in advance, and I wrote everything down on paper. Like actual, you know, dead-tree paper. And then I was thinking, oh, all I need to do is make the slides after the speaker dinner yesterday. But then when I got home, Tom messaged me, and was like, ah, I've got all your notes. So, change of plans.

Anyway, if you have been around long enough, you probably know that this is what I do here. I am the Canadian on the team. I represent my country. I am a Canadian ambassador. And that was me representing Canada on national TV, like US national TV, a few years ago. But since last year, I noticed something different.
Like, these days, when I meet someone new, usually before I even bring up anything, they would tell me, oh, actually, I want to move to Canada. So I was like, wow, I probably did such a good job in the last few years that they don't really need me anymore. So I am very proud to say I have promoted myself to a world ambassador this year. And as part of my job, I have traveled to quite a few places in the last year, and I would love to tell you about them. And because I'm a world ambassador, I would like to call it the Hello World Tour. But if you work for a VC-funded startup, you're probably more used to calling it an incredible journey, which is the same thing. So here you go.

First stop, Taiwan. Lots of great food. I really recommend spending time in their beautiful countryside. The only weird thing is their government was on a trial version of Exchange Server, so their Outlook platform has very limited capacity, and you see these pop-up dialogs everywhere.

Next, Singapore. Singapore is basically a tech hub in Asia these days. The government is really investing in technology. The latest tourist attraction is this thing called the Cloud Forest, where you can watch the AWS farmers grow your Heroku dynos on binary trees. So that was pretty interesting for me.

And next up, Spain. I didn't know a lot about the country, so I Googled it before I went. But I feel like if the first Google result for your country is your meal times, I don't really need to say too much to promote this country. And I also want to point out that this is actually not the complete list, because if you click that, there are at least one or two more meals in the article. So that's Spain for you in a nutshell.

And then Australia. There was something I wanted to point out here, but I kind of forgot what it is. But I do remember that toilets flush the other way around. So that's that. And then South Korea.
The most impressive thing for me was that they have free Poké Balls in their subway. And I couldn't read Korean, but they have pictures to show you how to throw the Poké Ball if you're new to this.

And then in Japan, you probably know that Japanese people are very organized and they work very efficiently. The secret is that they have standardized everything. And you can go to these standards bookstores to buy RFCs, ISOs, specs and stuff. I got a copy of ES6 in Japanese when I was there, which I can't really read.

But anyway, last but not least, since I live here in Portland now, I figured I should give Portland a shout-out as well. One of the best things about Portland is we're able to support a lot of local businesses. So while you're here, it would be great if you can help us out by, hashtag, shopping local. For example, my favorite clothing store is a local boutique called Nike. If you're craving authentic Mexican cuisine, there's a little shop called Taco Time. And if you're visiting from another country, you really can't miss this traditional American restaurant in the Northwest industrial area. And yesterday I also learned they have the best coffee in town at 4 a.m. And finally, if you're looking to buy some artisanal, handcrafted local software to take home with you, you can't go wrong with this tiny shop in downtown Portland. They're also hiring.

Anyway, that's a wrap for our incredible journey. And if you happen to run a conference in your country and you are looking for someone to promote your city, please do get in touch.

And with all the serious stuff out of the way, we can finally get into the comedy section. What I really want to talk about today is a thing I invented, which I call the hierarchy of speed. You might have heard of a similar concept in psychology or from today's keynote. That is where the psychologists, and Tom here, took their inspiration from.
At the bottom of my hierarchy, you have the laws of physics, and then hardware, various levels of kernel code, and then various levels of userland code. And on top of that, you have human factors.

The reason this is a hierarchy is because everything kind of builds on top of the layer below, right? And each layer adds a little bit of overhead. So every time you improve a layer, everything on top of it automatically benefits. For example, if you make the network faster, everything would just download quicker. And if you make the CPU faster, your app would automatically run faster as well, without you having to do anything.

On the other hand, the bottom layer also sets an upper limit on how fast you can make the layers on top of it, right? For example, the speed of light is constant; you can't change how physics works. So if you make a network round trip from here to Asia and back, then even in the ideal world it's going to cost you at least 70 milliseconds. Probably a lot more than that in practice. And it also turns out that we're getting pretty good at building these CPUs, so the amount of time it takes for a signal to travel from one corner of the CPU to the other actually turns out to be placing an upper bound on how fast we can make them.

That being said, as a JavaScript developer, it's quite unlikely that you have to go all the way to the bottom of this pyramid. And if your Ember app is slow, it's quite unlikely that you can get away with blaming physics. So instead of looking at the whole thing, we'll focus on a small part of the triangle, which I call the hierarchy of JavaScript performance. At the bottom, you have JavaScript engines like V8. On top of that, you have the libraries and frameworks you use, like Ember. And on top of that sits your application code.

Unfortunately, I only have 30 minutes, and I already blew half of it on jokes.
So I can't go very deep today. Instead, this is going to be a whirlwind tour where you get a little taste of everything, but we can't get into a lot of depth on each topic. My job here is primarily to give you a mental framework to think about performance, and we will try to take that framework and apply it to each of the layers here. And finally, we'll pop back out and talk about the big picture and how these things interact with each other.

So let's start with the mental framework. In my opinion, there are two ways to analyze this. The first way is to look at it from the perspective of your time budget. Let's say you have decided it's really important for your app to render in under a second, or a thousand milliseconds. A question you can ask yourself is: how many operations can you do at each of the layers before you blow half of your budget?

It's not very scientific, just a rule of thumb, and I do apologize for using math right here. The things you do in the JavaScript engine tend to be measured in nanoseconds, the things you do in your library tend to be measured in microseconds, and the high-level operations you do in your application code tend to be measured in milliseconds. So if you do the math on that: let's say you have an operation that takes 50 milliseconds in your application. Then you only have to do it 10 times before you blow half of your budget. On the other hand, if it's a 50-microsecond operation inside Ember, say, you would have to do it 10,000 times. And likewise, if you have a 50-nanosecond operation in the engine, you would basically have to run it 10 million times to have the same effect.

To help you visualize it, let's say you have a pretty complicated component that takes 50 milliseconds to render. Maybe it's a news feed item that uses a few other components, does some string localization, formatting, and stuff like that.
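A back-of-the-envelope sketch of the budget math above, in code (the 50 ms / 50 µs / 50 ns costs are the talk's hypothetical numbers):

```javascript
// How many times can an operation run before it eats half of a
// one-second render budget? Costs are in nanoseconds so the
// division stays exact.
const HALF_BUDGET_NS = 500_000_000; // 500 ms

const hypotheticalCosts = {
  "application code (50 ms)": 50_000_000,
  "library code (50 µs)": 50_000,
  "engine operation (50 ns)": 50,
};

for (const [layer, costNs] of Object.entries(hypotheticalCosts)) {
  console.log(`${layer}: ${HALF_BUDGET_NS / costNs} times`);
}
// application: 10 times, library: 10,000 times, engine: 10,000,000 times
```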
So if you put 10 of those on your screen, that would be half of your one-second budget right there. Doing this has a very significant effect on your bottom line. On the other hand, let's say you measure the overhead of reading a computed property in Ember, and it's, say, 50 microseconds. In this case, you would have to do that 10,000 times to have the same effect on your app. And finally, let's say you're worried about the cost of allocating an object in JavaScript. You measure that, and it turns out to be 50 nanoseconds or so. Which means you have to do it 10 million times to have the same effect. And to put that in perspective, this is what it looks like. I literally ran out of pixels on the screen to scale everything up; it's roughly five times the pixels that were on the previous slide. So that's how it all compares.

Now, there are some caveats here. For example, rendering a component the first time is always much more expensive, because you have to load the code, you have to parse the code, you have to warm the caches, and stuff like that. So you can't really just multiply the numbers like that. But nevertheless, it's still a pretty good rule of thumb. And the moral of the story is that you can always make the most impact at the upper layers of the pyramid.

So that's the time-budget-based analysis. The other way to look at it is the micro versus macro split. In my opinion, macro means do fewer things overall, and micro means do the same thing faster. So let's say you have a loop that looks like this. The micro approach would be to ask, wow, holy crap, why is the sleep taking a millisecond? That seems like a really long time. The macro approach is like, wait a minute, why are we sleeping 10,000 times? Or why are we sleeping in our app at all? Now, in the real world, this is not going to be so straightforward. We'll look at some examples later, but this is the rough idea of what I'm getting at.
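The loop on the slide isn't captured in the transcript, so here is a sketch of the same idea, with `doWork` as a stand-in for the slide's one-millisecond sleep:

```javascript
let calls = 0;
function doWork() { calls++; } // stand-in for any ~1 ms operation

// The original hot loop: 10,000 calls at ~1 ms each is ~10 seconds.
function original() {
  for (let i = 0; i < 10000; i++) doWork();
}

// Micro question: "why does a single doWork take a whole millisecond?"
// Same 10,000 calls, each one made cheaper.

// Macro question: "why are we calling it 10,000 times at all?"
// Do fewer calls overall, for example:
function macroVersion() {
  for (let i = 0; i < 10; i++) doWork(); // ~10 ms total
}

original();
console.log(calls); // 10000
```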
And the other way to understand the micro versus macro split is to base it on your persona. Sometimes you're an application developer, and sometimes you are working on your open source libraries; you're putting on different hats, or working at different layers, in those cases. So the way I look at this: macro basically means improving your algorithmic properties at the layer you're currently working in (basically, how can I do fewer things overall in my app), versus micro, which would be: how can I be smarter in the ways that I use the layer below?

So that's the framework part. Now we can try applying this to the layers we have on the left. Let's start with the application layer. If you recall, we said that things are generally measured in milliseconds here. The hypothetical example we had was a component that takes 50 milliseconds to render. However, the lowest-hanging fruit is usually things like a network request, which is a very easy way to blow tens or hundreds of milliseconds of your budget. So if you can find ways to eliminate that, do it. If you can use the Shoebox, do that. If you can use a service worker, great, do that. Those problems are the lowest-hanging fruit, but they're also pretty well understood. So today I want to focus on a lesser-known part of the problem, which I call hidden loops. In particular, I would like to call out two specific cases of this problem: the big data problem and the backtracking problem.

So let's see. What I mean by big data is that when you are developing a feature, you are probably working with a limited set of mock data on your local machine, or a mock account on the staging server. So everything is pretty fast and snappy. However, as soon as you deploy that to production, you might be surprised that some of your customers actually have a lot more data than you expect, and now suddenly your app cannot keep up. Or perhaps you're building a new product.
So at the beginning, you don't really have any data in there, so things, of course, are very fast. But over time, your customers add a lot of data to your database, and your app gradually loses the ability to keep up with that. So that's the big data problem. Having a lot of data is intrinsically expensive, because you need to download that data and parse it. So that's one of the problems.

But even if those times look acceptable to you, there is still a pretty dangerous aspect of this, which is the way it compounds. This is the time you wish you'd paid attention in your math class. Let's say you're rendering 100 models, which is perhaps a little bit too much, but doesn't sound catastrophic yet. However, imagine that to render each of those models, you need to compute a computed property that compares that single model to every other model. Then suddenly, without noticing it, we are back to the 10,000 magic number, which, if you recall, is the number of times you have to call a 50-microsecond operation to have the same effect as your 10 components. So that's one of the ways this could creep up on you.

The other flavor of the problem, which I will cover briefly, is the backtracking problem, which is invalidating already-flushed content. Let's say you have a form. You have a date that you're rendering, and then you render a date picker to change that date. However, in the didReceiveAttrs hook on the date picker component, you have some validation logic: oh, this is February 30th, and that date doesn't exist, so I need to do something about it, and you set it back to a default. However, you will notice that on the left, in the parent component, we have already rendered the date field, and now you're setting it again. This particular case is a little bit tricky, because it's bubbling up via two-way bindings. But there are also other ways for this to manifest.
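To make the backtracking problem concrete, here is a toy model. This is not Ember's real render loop; the function names and the default date are made up for illustration:

```javascript
// Toy model of the backtracking problem (NOT Ember's real internals):
// if a value changes after it has already been flushed to the screen,
// the only safe fix is to re-run the whole render, doubling the work.
const isValid = (d) => d !== "2024-02-30"; // February 30th doesn't exist

function render(state) {
  let passes = 0;
  let dirty = true;
  while (dirty) {
    passes++;
    dirty = false;
    const flushedDate = state.date; // the parent renders the date field first...
    // ...then the date picker's validation runs and may set it back
    if (!isValid(state.date)) {
      state.date = "2024-02-28"; // reset to a default
      if (flushedDate !== state.date) dirty = true; // flushed content invalidated
    }
  }
  return passes;
}

console.log(render({ date: "2024-02-30" })); // 2 passes: the work doubled
console.log(render({ date: "2024-02-28" })); // 1 pass: no backtracking
```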
So the general theme is: you already rendered something on the screen, and now you're setting it again in the same render cycle. The only thing we can do at that point is to complete the current render and then reschedule the entire render, top to bottom, again, which basically doubles your work. So that's the backtracking problem. In my opinion, these are the kinds of problems that the framework should catch for you. So since 2.10, this is now an error, and we got much better at catching these cases, and we have a much better error message for them, thanks to Gavin from Intercom. So that's that.

So let's recap. I think the macro approach to handling this is pretty much the same everywhere. If you can avoid doing the work in the first place, don't do it. If you can reuse the work, by either hoisting up the common operations or by adding caches, do that. Another way to deal with it: if you don't immediately need the work, because it's below the fold, or you don't need it until the user tries to interact with it, then try to defer it until you actually need it.

Then, if you run out of macro things to do, the micro things you can do here are basically, as I said, about how you can use the layer below smarter. So in this case, how can we help Ember out to be more performant? Generally, you shouldn't do these things, but in hot paths, when you have no other way, you can. If it turns out the problem is that rendering a component is too expensive for some reason, then one of the workarounds is to use a helper that returns a DOM node instead of using a component. Again, usually this is not necessary, but if it comes down to it, like if you have a very hot loop, then this is something you can consider. Likewise, there's the unbound helper, which is basically a hint to Glimmer to be smarter about how it handles the content there.
And the third one is not really available yet, but I have an RFC that allows you to create custom components that are more tailored to your use case, which might be helpful in cases where performance is super critical.

So before we wrap up this layer, let's talk about some of the tools you have at your disposal to look at problems at this layer. The first one is the network tab in Chrome. The second one is the timeline... actually, the network tab is basically in every browser; the timeline is more specific to Chrome. And the third one is the HTML5 user timing API, which is, as far as I know, implemented in most browsers. So, very quickly: this is the network tab in the Chrome inspector, which basically shows you the network requests you're making, which is, again, usually the lowest-hanging fruit. And this is the timeline view, which is a little bit more complicated, so it takes some time to understand it. But it basically tells you how many frames you are rendering per second and how much CPU you're using; you can do JavaScript profiling, and things like that. And finally, with the user timing API, you can call window.performance.mark and things like that to insert custom markers at points you're interested in in your app, and calculate the time you spend on particular things. The nice thing is that if you use the user timing API, it will actually show the markers in the Chrome timeline view, which is pretty nice and pretty helpful. So, like I said, we don't have a lot of time to get into details, but if you have specific questions, we can talk about them after the talk.

So the second layer is the libraries layer. The hypothetical example I gave, and this is not a real number, was, let's say we're worried about operations like Ember.get or Ember.meta or Ember.set, those kinds of low-level operations in Ember. The thing about libraries is they're written in JavaScript, right?
So it's not that different from your code. The main difference is that these things tend to be called a lot more times than your code. Your components, if you're rendering 10 of them, 100 of them, that's a lot, but it would not be surprising at all if Ember.get is called tens or hundreds of thousands of times in a single render. And this code also tends to be quite generic, in the sense that it's doing a very general-purpose thing, so sometimes it's very difficult to think about ways you can optimize it, because the API is so generic.

So again, the macro strategy at this layer is basically the same thing: don't do the work, reuse the work, or defer the work. And I have some recent examples that, again, I cannot go into in a lot of detail. One of the things that came to mind is that, up until recently, instantiating the first instance of a class in Ember was somewhat expensive, because we had to extend your class two times to add the factory injections. That's known as the double extend problem. It turns out the only reason we had to do that was to support a private API called _lookupFactory. So what we did is we basically deprecated that private API and exposed a new public API called factoryFor that doesn't have the same problem. That happened recently; the new API landed in 2.12, basically. And because we landed that in 2.12, in 2.13 and above we were able to eliminate the cost of the double extend, which is pretty nice.

Another one, by Robert, is to basically make some of the lesser-used features pay-as-you-go. And finally, the Ember Data team has been doing a lot of work on these kinds of issues. As you can see, there are a lot of PRs that were recently merged, and some parts of Ember Data got roughly twice as fast in the last release or so.

Micro-optimization at this layer requires understanding the engine, which is what we're getting to next.
So I will skip that for now. The tools for this layer are basically CPU profilers, or sampling profilers, flame graphs, and a library that the Ember Data team was using to find all of those issues, called Heimdall. This is the CPU profiler in Chrome, which is basically the same thing as the JS profiling in the timeline view, but it gives you a more focused view to work with. And this is the same data presented using a flame graph. I don't think I can explain that now, but we can talk about it more if you're curious later. And finally, this is one of the formatters for Heimdall, which is similar to the user timing API in the sense that it lets you mark different phases in your code, but it just has a better API for tracking especially nasty things, and also for grouping them nicely. You can look at the pull requests from Ember Data for more details.

So let's get to the final layer: JavaScript engines. Here is a quick origin story for modern JavaScript engines, or how they made JavaScript fast. Very briefly, let's look at how C works. I apologize for doing that, but it will be quick. Let's say you have a function in C that takes a point, I guess a point vector, and tries to calculate the length of the vector, which is the square root of x squared plus y squared, I think. The thing about C is the compiler can see, oh, this function is taking a thing called a point, and you're doing p.x and p.y here, but no problem, because it knows a point is a struct and it has two fields. Therefore, it knows that if the point is located here, then p.x would be located in the first slot and p.y would be located in the second slot, and that's how you get access to the parts of the point. Seems good. However, in JavaScript, everything is a dictionary, or at least that's the programming model, right? Here you can see that there's no type annotation. So when you need to execute this function, you don't know what p is, right?
It could be a string, it could be a number, who knows? So when you need to access p.x, basically the runtime is like, not sure, do you have a thing called x? Can you find it for me? I'll go get it. So, as you can imagine, implementing everything as a dictionary would be very slow. So at some point, the V8 team was like... I believe this is actually an optimization that was borrowed from Smalltalk or something. But basically, as you are creating the instances, you can track what properties are in those instances and create a hidden class for these things, also known as shapes or maps; they're all basically talking about the same thing. So what you can do is, when you call this function, you can look at: p, what is the map, or what is the hidden class, for this type? And, oh, okay, it's basically a struct with two fields. Seems good. And what you can do now is say, if this thing is a point, then I can basically hard-code the offset at the zeroth position or the first position.

What this enables is just-in-time optimization, or, as its friends call it, JIT. What you can do now is compile and optimize. This is not real notation, but it will do for our purposes. You can compile a special optimized version of getLength that hard-codes p to be the point type, basically, and everything else that follows just follows that assumption. And you can also do more inlining. So instead of actually calling Math.pow or Math.sqrt, you can just copy the implementation and move it in here. The nice thing is, because you already know at this point that p is a point, you can also eliminate a lot of the extra checks in Math.pow or Math.sqrt that are basically checking the same conditions.

The tools here are the natives syntax, Chrome tracing, and IRHydra2.
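The getLength example translates to JavaScript like this; the comments about hidden classes describe V8's behavior as sketched above, and the exact details are engine internals:

```javascript
// The talk's getLength example in JavaScript. The engine can compile
// a fast specialized version as long as every `p` it sees has the
// same hidden class (same properties, added in the same order).
function getLength(p) {
  return Math.sqrt(Math.pow(p.x, 2) + Math.pow(p.y, 2));
}

// These two objects share one hidden class: x added first, then y.
console.log(getLength({ x: 3, y: 4 })); // 5
console.log(getLength({ x: 6, y: 8 })); // 10

// This object gets a *different* hidden class (y added before x), so
// the call site sees a second shape and the specialized code has to
// fall back to a slower path.
const q = {};
q.y = 8;
q.x = 6;
console.log(getLength(q)); // still 10, just potentially slower
```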
So the easiest way to play with the natives syntax to explore these kinds of things is probably to use Node. You can launch Node with --allow-natives-syntax, and you can see that if you create different objects with different shapes, you can use the percent syntax to check what type the engine thinks they are. And this is Chrome tracing, which is a thing you can do in Chrome: you can go to chrome://tracing, which actually shows you all the low-level operations that the browser is doing. And finally, IRHydra2 is a little bit of dark magic, but it's basically a tool that allows you to look at the compilation that V8 is doing. It will also show you some information about the deopts, which, unfortunately, we don't have time to get to today. Maybe it will show up in a blog post in the future.

However, I would like to jump back to the big picture point very quickly, since we don't have time to go through the deopt stuff. Basically, when you combine all the layers together, they can interact in ways that you do not expect. So at the end of the day, even if you're very confident, you still have to rely on very macro, end-to-end benchmarks to feel confident that you actually achieved a positive change. So the Ember team basically wrote a tool called Ember Bench, which allowed us to script building custom builds of Ember and custom builds of an app, and it uses a tool that Chris wrote, called chrome-tracing, that basically launches Chrome many, many times for you: it goes and visits a URL, records how long the initial render took, does that maybe a thousand times, and gives you error bounds and stuff. That was how we were able to make a lot of these micro-optimizations when we were working on Glimmer and things like that, even though they are, in isolation, very difficult to measure with confidence.
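Ember Bench itself isn't shown here, but the error-bounds arithmetic you would apply to repeated measurements can be sketched like this (the sample numbers are made up, and the 1.96 factor assumes a normal approximation for a ~95% confidence interval):

```javascript
// Given many repeated measurements of, say, initial render time,
// report the mean with an error bound instead of a single number.
function stats(samplesMs) {
  const n = samplesMs.length;
  const mean = samplesMs.reduce((a, b) => a + b, 0) / n;
  const variance =
    samplesMs.reduce((acc, x) => acc + (x - mean) ** 2, 0) / (n - 1);
  const stddev = Math.sqrt(variance);
  // ~95% confidence interval for the mean (normal approximation)
  const margin = (1.96 * stddev) / Math.sqrt(n);
  return { mean, stddev, margin };
}

const runs = [412, 398, 405, 420, 401, 395, 417, 408]; // hypothetical ms
const { mean, margin } = stats(runs);
console.log(`initial render: ${mean.toFixed(1)} ms ± ${margin.toFixed(1)} ms`);
```

The point of the error bound is exactly the talk's: a change that is smaller than the noise of a single run can still be detected with confidence if you repeat the measurement enough times.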
So finally, one last thing. Another big picture problem is the thousand paper cuts problem. A lot of the time, no single party is doing a lot of work, but because you're using different libraries, or different teams are responsible for different parts of your UI, you end up doing a lot of repeated work for no good reason, and no single party is responsible for it. I think this is where a framework like Ember really shines, because we have a lot of opportunities to notice this duplicated work, and we can do some macro optimizations that are not otherwise possible, by coordinating things better and eliminating some of this repeated work. I think we're probably not doing as much as we could, but we already do plenty of this, and I also think we're in a good position to do more of it in the future.

So anyway, that is my very quick tour of JavaScript performance for you. Thank you very much.