Hey everybody, my name is Omer, and today I'm going to talk to you about our journey adding support for serverless apps to our debugger agent. But first, a little bit about me. I am a software engineer at Thrucout, with a background in developing low-level C applications for Linux-based IoT environments. One day, I hope to keep a French streak going on Duolingo for a full month.

But let's not get ahead of ourselves. Let me start with a disclaimer. This talk is about the journey we had as a company adding support for running in a serverless environment to our Node.js debugger. This is not, and I repeat, not a fully comprehensive guide or to-do list that will walk you through making the transition for your own application. However, I do hope that understanding the pain points we suffered from will help you identify and solve your own. So let's get to it.

How do you do it? From the perspective of a package, there is not a lot to be done: your users add you as a dependency of their serverless handler and call you at the start of the endpoint. Let's look at a simple example. Say your package is called Logic. Previously, your users called `Logic.start` at the start of their main function. To make the transition to a serverless app, the only thing your clients have to do is call `Logic.start` at the start of their endpoint handler instead. Pretty simple, right?

You might also want to publish some helper functions to make the transition easier and simplify integrating your package into serverless environments. In our case, we published a helper function that wraps your endpoint and runs our debugger agent. After that, you can pretty much enjoy your new life supporting serverless. What went through our heads at the time was: easy peasy, lemon squeezy.
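To make that concrete, here is a minimal sketch of what such a wrapper helper can look like. This is illustrative only: the package name `logic`, the `wrapHandler` helper, and the Lambda-style handler signature are assumptions for the example, not our actual API.

```javascript
// Hypothetical helper: wraps a serverless endpoint so the debugger
// agent is started before the user's handler code runs.
const logic = require('logic'); // illustrative package name from the example

function wrapHandler(handler) {
  return async (event, context) => {
    logic.start(); // start (or reuse) the agent on this worker
    return handler(event, context);
  };
}

// Usage in a Lambda-style endpoint:
exports.handler = wrapHandler(async (event, context) => {
  return { statusCode: 200, body: 'hello from a debuggable endpoint' };
});
```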
But let's be realistic for a second. It isn't going to be that simple, right? Otherwise, we wouldn't be having this talk at all. A few days later, we got a notice from a client: "I can't debug my application." For us, a debugger, that is really bad; it pretty much means we are not fulfilling our most basic requirement. Our immediate reaction went something like this. After the initial shock, we went on to investigate, and it turned out that it takes about 20 seconds to start our agent. For most of our clients, that's fine. It usually happens only once, at the first initialization of the app, when there is usually less traffic anyway, and it is not something they would notice. For serverless apps, however, that same first initialization happens far more often.

So let's go over the issues users most commonly face with serverless apps. You've got cold starts. You've got the transition to stateless functions. It's very easy to fall into vendor lock-in when first migrating. And you're also limited by the size of your package. These are the big pain points for most applications. We are not going to go over every single one of them in this talk, since we don't have the time, but we will cover the major pain points we faced when transitioning our debugger. It is also worth noting that not all pain points are equal: some are easier to deal with than others. Package size, for example: if you deploy your Lambda as a Docker image, the size limit on your app jumps from roughly 250 megabytes to 10 gigabytes, simply by switching to a Docker image.

Our biggest challenge was big cold start times. For those of you not familiar with the term, a cold start is how long it takes to initialize a new serverless worker with your app until it is ready to handle an incoming request. That includes the time to set up the worker itself, set up your runtime, and load the different modules that comprise your application. And that is the thing that hurt us the most.

Obviously, our first step was to understand what takes us so long to boot, and we did that by benchmarking our application. We found our first culprit, and for a Node.js debugger it was quite an obvious one: we spend a lot of time parsing source maps.

Before we begin our journey of improving our agent to support serverless architectures, a quick note. The main goal of this talk is to show newer developers how different concepts and techniques can be utilized to support a serverless architecture. If you're a more senior developer, you might find some of the concepts obvious, but I hope you'll still learn something new.

So before we actually begin, let's go, very briefly, over source maps themselves. What are they exactly? In a nutshell, the JavaScript world is a mess. You need to support many different runtime environments: you've got Node.js and you've got browsers. And even when running in browsers, you need to support all kinds of browsers, many of which are not compatible with each other, and some of which are very old and outdated; if anyone here remembers Internet Explorer 6, good old days. So the question remains: how do you support all these environments? That's why we have transpilers. The role of a transpiler is to take our source code, which may or may not be written in JavaScript at all, and convert it to JavaScript that is supported by almost all browsers and runtime environments.

Let's look at a simple example. Say I have my source code written in TypeScript, with type annotations and all the other aspects of TypeScript code, and I want to run it on a Node.js server. Node.js knows how to run vanilla JavaScript code, so the job of the transpiler is to take our source file and convert it to vanilla JavaScript that can run on our server. It is worth noting that the final source code is usually a lot harder for a human to understand, since a lot of different hacks are inserted into your code to make supporting different environments possible.

This raises a question: how do we actually debug our source code? We now have two source files, the one we wrote and the one actually being run on the server. We know how our code looks, but it isn't what finally runs on the server. The solution to this problem is source maps. They let us debug the code on the server through the view we have of the source code we wrote: a source map contains a mapping from every position in the final source code back to a position in our original source code. Let's take a look at a simple example. Here we can see an example source map file. I am not going to go over each and every field.
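To make this concrete, here is a simplified, hedged example rather than the exact file from the slide: a small TypeScript source, roughly what a transpiler emits for it, and a trimmed source map whose `mappings` value is a placeholder for the real encoded data.

```typescript
// greet.ts -- the code we wrote, with type annotations
export function greet(name: string): string {
  return `Hello, ${name}!`;
}
```

```javascript
// greet.js -- roughly what the transpiler emits and Node.js actually runs
"use strict";
Object.defineProperty(exports, "__esModule", { value: true });
function greet(name) {
    return "Hello, " + name + "!";
}
exports.greet = greet;
//# sourceMappingURL=greet.js.map
```

```json
{
  "version": 3,
  "file": "greet.js",
  "sources": ["greet.ts"],
  "names": ["greet", "name"],
  "mappings": "<compact base64-VLQ data, omitted here>"
}
```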
Let's take a look at several of them. We've got a version field, which tells us which source map version we are using; in this case, 3. We've got the different sources that the file was compiled from. We've got the names of different elements. But the most important one here is the mappings field. I know that to most of you it looks like absolute rubbish, but it is actually a compact encoding of the mapping from every position in the final source code, the vanilla JavaScript the transpiler output, back to our original source code; in our example, our TypeScript code. When you parse this encoding, you get the line and column numbers matching each position in the code.

With all that background on source maps, we finally have enough to tackle our problem. We, as a debugger, must load your source maps in order to debug your application, and the way we did it was by parsing your source files. When you compile your code with source maps enabled, the compiler adds a comment containing a URL to the source map itself. We, the debugger, then read the source map and parse it, allowing you to debug.

But when do we actually load the source maps? The answer is quite simple and obvious: when a new source file is loaded. That just leads to a different question: how do we know when a source file is loaded? And the answer, once again, is simple: V8 tells us. So what is V8, exactly? V8 is the engine that runs JavaScript code for Node.js. It is built around an event loop, and it is very fast and great to work with. Our main way of communicating with V8 is by subscribing to an event and passing a callback that will handle it. For loaded scripts, we use the Debugger.scriptParsed event to get notified about them. Pretty simple, right?

When we first started out a couple of years ago, whenever a script was loaded we went over its source code looking for URLs pointing to the related source maps, and then loaded and parsed each source map URL we found. That is a lot of work, both loading the source file and scanning it for the source map URLs. Think how much IO work we were doing, loading entire source files only to look for source map URLs. Source files can be huge, and we were spending a lot of time going over every line in them. A short while later, we found out that one of the parameters passed to the scriptParsed callback is actually the source map URL. That saves us from loading the source file and scanning it for URLs at all. We saw massive, and I mean massive, performance improvements by using it instead.

We learned a simple lesson here: read the docs. When we initially wrote that code, we were trying to get a POC running and didn't bother diving too deep into every bit of V8, which is a huge project with tons of documentation. But for us, reading the docs and diving deeper into our environment was an investment well spent, and one you should definitely make when you are trying to optimize your code. Here is roughly what that pattern looks like.
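This is a minimal sketch using Node's built-in inspector module; it isn't our agent's actual code, but it shows the Debugger.scriptParsed event and the sourceMapURL parameter that saved us all that scanning.

```javascript
// Minimal sketch: getting notified about loaded scripts, and reading
// the source map URL straight from the event instead of scanning files.
const inspector = require('node:inspector');

const session = new inspector.Session();
session.connect();

session.on('Debugger.scriptParsed', ({ params }) => {
  // params.sourceMapURL is already provided by V8 -- no need to load
  // the script source and search it for a sourceMappingURL comment.
  if (params.sourceMapURL) {
    console.log(`${params.url} -> source map at ${params.sourceMapURL}`);
  }
});

// Enabling the Debugger domain starts the stream of scriptParsed events.
session.post('Debugger.enable');
```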
Let's continue our hunt for culprits, and this time, let's get lazy. Although we got a nice performance boost, it still wasn't enough, so it was time to look at the next culprit. And it is every programmer's worst nightmare: working too hard. At first, when we only supported normal server apps, we loaded every script at the start of the program, which means we also parsed the source maps for every script at the start of the program. Think, for example, about a huge app with tons of different source files: that's a lot of source maps to parse.

That was a trade-off we made at the time, between having one big spike of latency at the start of the app, or having smaller, more spread-out latencies that we couldn't entirely predict. After all, how do you know, just from reading your source code, exactly when a given file will be loaded? That is a lot of mental overhead that you probably won't take on. At the time, we thought it was better to have one big spike and a much smoother run for the rest of the app's lifetime. However, when you're running a serverless app, it is entirely possible that many source files won't be needed at all for a given request, and the trade-off was no longer worth it for us. So we decided to change how we load scripts. Thinking back to our example: if our endpoint has different code paths depending on the parameters of the request, and one of those code paths only needs to load, say, five source files instead of hundreds, we don't need to do all that work for just five files. Now we only do the necessary amount of work and no more. This was yet another massive improvement to the loading time of the debugger.

The lesson learned here is to identify and address the pain points of the environment you are in. In the case of a normal app, we could compromise on startup time in exchange for lower latency later on. That wasn't the case for serverless applications: there, we would much rather have small latencies that customers probably won't even notice, but that let them shrink their cold start times by a whole lot.

Well, we are at our final point now, so let's dive in. We were seeing massive improvements in cold start times, but we could push even further. We believed that source map parsing was still our biggest time consumer and that we could still improve there. It makes a lot of sense: we are a debugger, after all, and source map parsing is a big part of our job. In the end, the parsing itself is done by an external dependency, and since most of our loading code was already optimized, we decided we needed to take a deeper dive into the source map parsing library as well.

Initially, we thought about how we could optimize the original library we used (it was a different source map library at the time), and if we could contribute the improvements back, that would be awesome; I'm pretty sure I don't need to explain the importance of contributing to open source projects. But after some investigation, we stumbled upon a different library called trace-mapping. The library was written by Justin Ridgewell, who, whether you know the name or not, is also one of the people behind the hugely famous JavaScript tool Babel. And as you can see right here, it even came with benchmarks. Looking at the graph, we can see a comparison between trace-mapping, Justin's library, which does about 32 million ops per second, versus the default source map parsing library at the time, which at best does 23 million operations per second. As I'm sure you can see, that is a big boost in performance just from switching. For us, it was an easy decision to make; we just needed to port our code base to the new library. Thanks to that, we got even more performance improvements in our debugger, and more importantly, even more improvements for our customers who load us at the start of their handler. Putting it together with the lazy loading from before looks roughly like the sketch below.
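Here is a hedged sketch of lazy, on-demand source map parsing with @jridgewell/trace-mapping. The cache layout and the helper names are illustrative, not our agent's internals.

```javascript
const { TraceMap, originalPositionFor } = require('@jridgewell/trace-mapping');

const rawMaps = new Map();    // scriptId -> raw source map JSON (cheap to store)
const parsedMaps = new Map(); // scriptId -> TraceMap (built only when needed)

// Called from the scriptParsed handler: just remember the map for later.
function rememberSourceMap(scriptId, sourceMapJson) {
  rawMaps.set(scriptId, sourceMapJson);
}

// Called only when a breakpoint actually needs this script: the expensive
// parse happens here, once, instead of at cold start.
function resolveOriginalPosition(scriptId, line, column) {
  let tracer = parsedMaps.get(scriptId);
  if (!tracer) {
    const raw = rawMaps.get(scriptId);
    if (!raw) return null;
    tracer = new TraceMap(raw);
    parsedMaps.set(scriptId, tracer);
  }
  // trace-mapping expects a 1-based line and a 0-based column.
  return originalPositionFor(tracer, { line, column });
}
```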
But you probably already saw this coming: the main idea here is applicable not only to serverless apps. I'm going to look at something entirely different now, to show you that it isn't JavaScript-specific or serverless-specific, because the idea itself is very easy to apply in your own code and might result in huge performance gains, mostly in your serverless apps, but in every other area you might stumble upon as well.

So let's dive into hash maps in Rust; something completely different. Take the default hash map implementation. It is good for most use cases, but the Rust standard library developers had a decision to make, between resistance to denial-of-service attacks and performance, and they chose resistance to denial-of-service attacks. That suits almost all use cases: most people can give up a little performance in exchange for resistance to attacks, and they are okay with that. But that doesn't mean you have to make the same decision yourself. If you know you need speed, just change the defaults. It's okay. Let's look at the very simple Rust code from the slide; reconstructed here, it looks roughly like this:

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;

fn main() {
    // The default hasher: randomized SipHash, resistant to HashDoS attacks.
    let mut scores: HashMap<u32, &str> = HashMap::new();

    // Choosing the hasher explicitly. Here it is RandomState, the default,
    // but any BuildHasher implementation (from crates.io, say) works too.
    let state = RandomState::new();
    let mut chosen: HashMap<u32, &str> = HashMap::with_hasher(state);

    scores.insert(1, "default hasher");
    chosen.insert(2, "explicit hasher");
}
```

On the first line, we create a new hash map using the default hasher by calling `HashMap::new()`. On the following lines, we see how to choose the hasher that will be used for your hash map: we call `RandomState::new()` and then initialize the map with `HashMap::with_hasher`, passing it our hasher. As a side note, in this example we are explicitly passing the default hasher, `RandomState`, but you can hop onto crates.io and pick a different hasher for your code as well.

I think the lesson here is to not be afraid to tweak the defaults into something that better suits your needs. For us, the default at the time was the standard source-map JS library, but we shouldn't have been afraid to switch to a different library that wasn't the standard back then; as a side note, trace-mapping is the standard now, but at the time it wasn't, and we were right to use it anyway.

So let us conclude. We started with up to 20 seconds of cold start time in edge cases, and we ended up at around 300 milliseconds. That's a huge improvement, and that's for huge Lambdas; as a side note, most Lambda functions are small, so for them it takes far less. What did we learn on our journey? There is a lot more to consider when developing for a serverless architecture than for a standard application. Looking back at the previous slide, we went over the different pain points: cold start times, package size limits, vendor lock-in, and all sorts of things like that. We also saw just how much cold starts matter. And it is worth noting that you should always keep an eye out for optimization opportunities. They are always out there; sometimes they are hard to do, but sometimes they are just waiting for you.

And that's it for me. I hope you had a great time joining us on our journey, and that you have learned some skills and ideas you can utilize when making the transition yourself.