We had presentations on performance. Then we had a presentation on the future. And I'm going to talk to you a little bit about the future of performance. Admire the transition. So this presentation is about the JavaScript binary AST. I'm going to give details about what that means. But what is it? It's a proposal for the future of JavaScript. It's a new technology that we are currently developing at Mozilla, Bloomberg, and Facebook to speed up loading web pages in the not so distant future. So, it's great: the web is fast. We keep improving the web, we keep making it faster. But of course, there is no such thing as fast enough. We have pretty fast JavaScript engines these days. JavaScript is fast enough to run video games, which, a few years ago, who would have said that would be possible? However, sometimes the performance of JavaScript is not sufficient, for several reasons. One of them: look at how much data you're loading when you open Google Sheets or Google Docs or Yahoo or LinkedIn or Facebook. All of them, just before they can do anything useful, load at least three megabytes of compressed JavaScript code. Much more if you decompress it. We have noticed several of them that load something like seven megabytes, and the uncompressed size can go up to 40 megabytes. Imagine loading 40 megabytes of code in your browser whenever you start your application. You don't do this with your desktop applications, your native applications. And to make things worse, these websites are updated very often. It's usually not at the scale of every five minutes, but sometimes it happens. That wreaks havoc on the whole infrastructure of your browser: your caching, all of that. We just need to find a way to cope with this amount of JavaScript. Because one of the consequences is that it's very slow to start. If you recall what Florent was saying earlier, was it 100 milliseconds, 200 milliseconds before people start thinking it's broken? 500.
Okay, so just parsing the JavaScript for Facebook already takes at least 500 milliseconds on both Chrome and Firefox. That's parsing. That's once you have received the JavaScript, once you have received the HTML, once you have received the images, the CSS; at some point, you start parsing the JavaScript. And after you've finished parsing the JavaScript, you have a few more steps, which I'm going to detail soon. And only then can you start having an interactive page. And that's just for Facebook's chat. So this means that for the first few seconds of Facebook's chat, you cannot do anything. So what's Facebook doing about this? They're doing what we are doing: they're trying to optimize stuff, and they're trying to use tricks to make it look faster. Unfortunately, once you have too many of these tricks, they actually make things go slower, just to make them look faster. And the problem is that things are only getting worse from here. A few years ago, when you were writing JavaScript code, or a web application, you were hand-rolling it, and you had a few hundred, worst case a few thousand, lines of code. Nowadays, you're using webpack to bundle how many frameworks, how many modules? 40 modules, 70 modules. Insane amounts of data. And there is no reason to stop that, because modules are good, frameworks are good, or at least when they're providing something useful, yes, use them. But the browser needs to be able to cope with that. Push this a little bit further, and we have exactly the problem that was described earlier: it looks broken. So, I'm saying that loading JavaScript is becoming too slow because of the amount of JavaScript. Let's take a look at why loading JavaScript is slow. Loading JavaScript is basically the same thing as loading other programming languages. It's a bit different because you're loading it in your browser, but it's generally the same outline.
You grab the source code, you decompress it, because you typically receive it compressed. You typically need to deal with encodings, then you tokenize, I'll come back to that, then you parse, then you generate bytecode, then you finally start executing. All of these steps are typically relatively simple in a programming language. Would that it were so simple for JavaScript. Because JavaScript is actually a pretty complicated language. So, let's start at the beginning. You are opening a web page. Let's assume you have already loaded the HTML, and that HTML is going to load your JavaScript. Okay, now you can download the source code, your JavaScript source code. You can decompress it. This is the web, so any encoding is accepted, except you're not going to write an interpreter for every single encoding, so you typically need to convert the encoding to something that your JavaScript virtual machine can understand. That's already going to take some time and allocate some memory, for those of you who are performance-sensitive. Then you tokenize. I don't know how many of you are familiar with tokenization, but tokenization is one of the first steps you perform when you're trying to interpret or compile a programming language. You take your source code above and you convert it to a sequence of tokens. So, this source code is function foo something; the function here is translated to this token, foo is translated to a token that says it's an identifier, then we have a left parenthesis, another identifier, a right parenthesis, et cetera. That's how every single interpreter or compiler on Earth works. The problem is that not all of these interpreters or compilers are dealing with JavaScript, because JavaScript is not such a simple programming language. For instance, when the tokenizer sees a word, is it a keyword or an identifier? Sometimes it can be both: you cannot have a variable with a keyword's name, but you can have a field with that name. So the tokenizer doesn't know what token to use here.
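The tokenization step just described can be sketched as follows. This is a hypothetical, drastically simplified tokenizer for illustration only; real engine tokenizers are nothing like this simple, and the token names are my own:

```javascript
// Hypothetical, drastically simplified tokenizer sketch: it splits a tiny
// subset of JavaScript into the kind of tokens described above.
const KEYWORDS = new Set(["function", "return", "var", "if", "else"]);

function tokenize(src) {
  const tokens = [];
  // Words, numbers, or single punctuation characters; whitespace is skipped.
  for (const match of src.matchAll(/[A-Za-z_$][\w$]*|\d+|[(){};,+=]/g)) {
    const text = match[0];
    if (KEYWORDS.has(text)) tokens.push({ type: "Keyword", value: text });
    else if (/^[A-Za-z_$]/.test(text)) tokens.push({ type: "Identifier", value: text });
    else if (/^\d/.test(text)) tokens.push({ type: "Number", value: text });
    else tokens.push({ type: "Punctuator", value: text });
  }
  return tokens;
}

// "function foo(x)" becomes: Keyword, Identifier, Punctuator, Identifier, Punctuator.
console.log(tokenize("function foo(x)").map(t => `${t.type}(${t.value})`).join(" "));
```

Note how this toy version classifies a word by a fixed keyword list; as the rest of this section explains, real JavaScript does not let you decide that locally.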
If it encounters a slash, the slash is a nightmare. What's a slash? Is it a division? Is it the start of a single-line comment? Is it the start of a multi-line comment? Or is it a regexp? Good luck knowing that. The tokenizer has no way of knowing. Oh, here's a nice one, too: "use strict". If it were any other string, it would be a string, but it's "use strict", so it's actually not exactly a string. It's a directive. And a directive is going to change how the parser works. And there are more complicated things: strings themselves are pretty hard to tokenize, because we want to be able to deal intelligently with every single language on Earth, and not all languages use the same bytes to represent their strings. And if we want to save memory, we want to be smart about this. Being smart about it means that we need to analyze every single string, every time we encounter one, and this includes every time we encounter an identifier, which makes things slower. So, the answer to all of these questions is: it depends. In other words, yeah, tokenization is hard. You're going to see that guy again. The next step, once you have tokenized, is to parse. So, that was what we produced in the previous step: we have something that says it's a function, and then there is an identifier, the name foo, and more stuff. And then you want to produce something that basically looks like this. This is what the interpreter or the compiler is going to have in memory to do anything: to do safety checks, security checks, to generate bytecode, to optimize your code, to reject the code because features are not available, et cetera. So, from this, it's going to deduce that we are declaring a function. It's not asynchronous. It's not a generator. Let's not talk about scope. It has a name, foo. It has arguments; a single argument called x. And, okay, the body... I don't have enough space to put the body on that slide.
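Two of the ambiguities just mentioned, in runnable form. This is a minimal illustration of my own, not the actual examples from the slides:

```javascript
// The same byte `/` plays several roles; the tokenizer cannot tell which
// without help from the parser:
let x = 4, y = 2;
const quotient = x / y;   // `/` as the division operator
const pattern = /y;/;     // `/` opening a regular expression literal
// `//` starts a line comment, `/* ... */` a block comment: same first byte.
console.log(quotient, pattern.test("y;")); // 2 true

// And "use strict" looks like a string literal, but it is a directive that
// changes the semantics of everything after it:
function strictThis() {
  "use strict";
  // In strict mode, `this` in a plain function call is undefined;
  // without the directive it would be the global object in sloppy code.
  return (function () { return this; })();
}
console.log(strictThis() === undefined); // true
```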
Nor on the other slides, be reassured. You will not have to read this data structure too often. Is it easy? I'm pretty sure you already guessed the answer: no, it's not going to be easy. So, let's look at this simple example. Can anyone tell me what the return value of this function is going to be if I pass true for this argument? So, I set x to 10, and I return x plus 10. And the answer is, of course, it depends. Case 1: the one that you probably assumed when you looked at the source code, because I had hidden the annoying part. Yeah, that's 20. 10 plus 10 equals 20. So far, so good. Second case: I've added something in my comments here. I declared variable x after having returned it, but who cares? And suddenly, this function returns... oh, that's NaN, not a number. Yes, exactly, because of hoisting. We could detail all the reasons why this happens, but the summary is: parsing is not that easy. So, handling variables is one of the many reasons why parsing is not easy, and parsing is slowed down a lot by these kinds of things. If you're having fun with this, handling `this` in JavaScript... don't worry, the compiler or the interpreter is also having lots of fun trying to understand what the heck `this` is. eval. eval is even more evil if you know how it works. So, fun fact: there are actually four different definitions of eval in the specification of the JavaScript language, depending on the context. And some of them change the meaning of variables. Okay, same thing with `with`. And if you use "use strict", the syntax rules are not the same ones. Fun. And just to make things a little bit slower, by specification, the parser is not permitted to skip anything, because if there is a syntax error, you need to know about it immediately. We're talking about performance; skipping things would have been pretty cool. Okay, once you have your nice data structure, which is called an AST, we'll come back to that.
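The hoisting trap above was hidden behind the "annoying part" of the slide; here is a minimal reconstruction of the same kind of trap (not the exact code from the talk), plus a taste of eval's context sensitivity:

```javascript
// `var` declarations are hoisted to the top of the function, even when
// they appear in dead code after a return.
function looksFine() {
  var x = 10;
  return x + 10;   // 20, as you'd expect
}
function surprise() {
  return x + 10;   // NaN: `x` is hoisted, so it exists here, but is undefined
  var x = 10;      // never executed, yet it changes the line above
}
console.log(looksFine()); // 20
console.log(surprise());  // NaN

// And eval: a direct call and an indirect call resolve variables
// differently, one of eval's several context-dependent behaviors.
globalThis.v = "global";
function scopes() {
  var v = "local";
  return [eval("v"), (0, eval)("v")]; // direct sees local, indirect sees global
}
console.log(scopes()); // [ 'local', 'global' ]
```

A parser cannot know what `surprise` returns, or what `eval` touches, without fully analyzing the whole function body, which is part of why parsing JavaScript is slow.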
Well, you still have a few steps before you can execute. You need to perform some safety checks. Then you generate your bytecode; every single browser has a bytecode format. And then the bytecode is something that you can execute. Yes! Okay, we have spent those 500 or 900 milliseconds. That was all that was happening during those 500 or 900 milliseconds. We'd like to make this faster. So we have all these steps; it would be nice if we could move, or parallelize, or speed up some of them. Let's see how we can do that. People have tried things. Oh, I didn't put the obvious thing: people have tried to optimize the browser. Yes, we've been doing this for the past 20 years, and we have done a pretty good job. I mean, we as a community have done a pretty good job. But again, sometimes it's not sufficient. So what can we do to make things faster? We can make lazy parsers: parsers that are going to try to be smart enough to sometimes skip things. Every single browser actually does that. In practice, it's actually not that useful; plus, it actually decreases the total performance. It makes starting a little bit faster, and then it makes the total execution a bit slower. We have recently landed in Firefox, and I think Chrome has landed something similar recently, something called bytecode caching, which is extremely great: the second time you load the same JavaScript, you skip most of the steps I mentioned above. That's insanely cool. There are two cases in which it doesn't work. The first case is the first time you connect to a website, because you haven't loaded that JS yet. The second is five minutes later, when Facebook has updated their JavaScript code, and you need to restart the whole process. So, great technology; we're trying to optimize the other cases. People on the JS development side have tried to optimize things too, of course.
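The lazy-parsing idea can be faked in userland with the `Function` constructor. This is a hedged sketch of the general principle only; engines implement laziness internally, on partially-skipped source, not like this:

```javascript
// Sketch of the lazy-parsing principle: keep the source of a function
// around as text and only pay for full compilation on the first call.
// `lazyCompile` is a made-up helper name for this illustration.
function lazyCompile(paramSrc, bodySrc) {
  let compiled = null;
  return function (...args) {
    if (compiled === null) {
      compiled = new Function(paramSrc, bodySrc); // full parse happens here
    }
    return compiled.apply(this, args);
  };
}

const add = lazyCompile("a, b", "return a + b;");
// The body has not been compiled yet; that cost is paid on this first call:
console.log(add(2, 3)); // 5
```

This also shows the trade-off mentioned above: start-up gets cheaper, but the deferred work still happens later, during execution.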
Typically, with minifiers, which try to make the source code shorter, or with lazy module loaders. The first ones, minifiers, are pretty good at reducing the total size of the file that is being sent. But typically, they actually make parsing slower, in our experience. So... useful in some cases, but not a solution, unfortunately; plus, they also make the code unreadable. Lazy loaders require refactoring your code. Sometimes they work, and generally they actually make your performance worse too. Plus, once you reach a large enough amount of JavaScript code, lazy loaders are going to be insanely complicated to introduce if you have not built your code for that from the start. People have also tried to improve browsers by providing new technologies in the browser that developers can use to make things faster. Recently, Wasm, which is great, and works for several use cases, many use cases, but not all of them, because you basically need to write your code in C++, or Rust if you're lucky. And there are ideas about using service workers to improve the loading time of your applications, the loading time of your JavaScript. Of course, if you're trying to use this for a website that's updated every five minutes, you're going to spend your time downloading updates and compiling updates and preparing updates for execution, which is going to be pretty much as good for the planet as Bitcoin. Not the best solution. Oh, and people are, of course, building native apps, because sometimes, if you want to make things start faster, a native app is the solution. Our objective here is to provide something that doesn't require you to refactor, doesn't require a native app, and just works. It's called the JavaScript binary AST. It's a proposal for the JavaScript language. Basically, it's a new file format. Oh, yes, we've been working on this: it's Mozilla, Bloomberg, and Facebook. So it's a new file format.
Instead of .js files, you're going to send .binjs files, which are much faster to parse, and which are pretty much equivalent. They're not uglified. It's not a bytecode. It's not a new version of WebAssembly. It's not a competitor to WebAssembly. It's your usual JavaScript, just in a different format. So if you recall, I told you that you would not have to read this many times; I never said that you would never read it again. This is our function declaration. This data structure is what we call an AST, an abstract syntax tree: it's a tree, it represents the syntax, and it's kind of abstract. We can store it efficiently. We can store it just as this sequence of numbers, plus the definition of the string foo above; I skipped a few things. Plus, we can compress it afterwards. The whole point of that, well, there are several points. Remember the long list we had earlier? Well, we can make it shorter. We still need to download, but then we can tokenize, parse, and check things in a much more efficient way. If we have a good binary format, and with a few changes to the specification, we can skip things: only read what we need, only tokenize what we need, only parse what we need, only check what we need. Then we generate the bytecode and we execute. Many things are faster. Hopefully the file is smaller; we haven't reached that state yet. We can start all the operations much faster. We don't need to perform as many operations before we start. Tokenization, which, you know, had to guess whether a given word was a keyword or an identifier or a name, becomes absolutely trivial. And the format is what we call a proof-carrying format, which basically means that checking that things are basically safe is much, much faster than it used to be. Again, we only parse and check and tokenize the code we execute, which makes startup much faster.
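To make the "sequence of numbers plus a string definition" idea concrete, here is a hypothetical toy encoder of my own. The real binary AST format is considerably more involved (grammar tables, compression, the proofs mentioned above), and none of these names come from the actual proposal:

```javascript
// Toy preorder encoding of an AST: each node type becomes a small integer
// tag, and every string is interned once into a shared string table.
// Purely illustrative; not the actual .binjs format.
const TAGS = { FunctionDeclaration: 0, Identifier: 1, Parameter: 2 };

function makeEncoder() {
  const stringTable = [];
  const intern = (s) => {
    let i = stringTable.indexOf(s);
    if (i === -1) { i = stringTable.length; stringTable.push(s); }
    return i;
  };
  const encode = (node, out = []) => {
    out.push(TAGS[node.type]);                             // node tag
    if (node.name !== undefined) out.push(intern(node.name)); // string index
    for (const child of node.children ?? []) encode(child, out);
    return out;
  };
  return { encode, stringTable };
}

const { encode, stringTable } = makeEncoder();
// function foo(x) { } as a bare-bones tree:
const tree = {
  type: "FunctionDeclaration",
  children: [
    { type: "Identifier", name: "foo" },
    { type: "Parameter", children: [{ type: "Identifier", name: "x" }] },
  ],
};
console.log(encode(tree), stringTable); // [ 0, 1, 0, 2, 1, 1 ] [ 'foo', 'x' ]
```

Because every name like foo arrives pre-classified as an identifier with an index into the string table, the guessing games from the tokenization section simply disappear.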
And we do this without loss of performance, as opposed to what happened before. We also only need to parse all names, all strings, all variables, et cetera, once. And we're pretty sure that we can parallelize this much better for modern architectures. So I believe we have a little bit of time for a demo. Thank you. So let's... I've taken the source code of jQuery. So this is jQuery. You don't care, you don't need to read it; it's just jQuery. We compress it above. I don't know if you can read the size, but we divided the size by about six, more or less. We can do better, but that's what we have for the moment. Then we can decompress it back, which is actually a bit slower than it should be, but we've not optimized that part yet. And this is the result, which should be more readable in my editor. So, things to remember: okay, we have lost comments, and that's basically the only thing we have lost. Variable names are still here. The layout is a bit different, but it's still understandable. So we have a process that we can reverse. It's a compression format. And that process, that version, the binary version, is much, much faster to load. Last year, we wrote a full proof of concept, full minus security; we had not implemented security for that test. It was just a proof of concept running on my computer. I didn't care about security because I was the only one using that format anyway. I'm not going to give hard numbers on the speedup, but we had insane speedups with that version. And the file format was really much smaller than minified plus gzipped. So we put that onto the standardization track. And right now, we are writing a third prototype, which has the security. The security, again, is easier and faster to check than with text JavaScript. The source code of this parser is easier to check than the existing parser that is part of Firefox or any competitor. So it's actually harder to make security errors in this parser.
It's not impossible, but it's harder. Okay, we're not finished yet, but it's in the process of being finished and standardized. Our hope is that we'll be able to ship a version for opt-in testers during this summer and try it with a number of large vendors, if you want to help. And thank you for listening. So the question is: I mentioned that during parsing we can skip the parts that we're not using yet, and the question is, do we need to parse them later? Yeah, of course, the answer is yes, we parse them later, but we only need to reparse the part that we're using. We can parse one function at a time. We can say, hey, let's start at this offset and parse until that offset, and just parse that part, which is something that's not possible with text JavaScript. You can try to shout your question and see if somebody hears. Okay, does anyone closer have a question? Actually, I went a bit over time, so if you have questions, do not hesitate to ask them afterwards. Thank you for listening.