Hi, my name is Kenny Nguyen. Today we're going to be talking about testing gRPC services in Node.js.

Before we journey into this presentation, I just want to establish what drove us, meaning the team, to create this tool. It's a tool conceived through pragmatic exploration and analysis of the gRPC protocol. We used it as our sandbox for studying and probing the nuances of gRPC. So while we gained knowledge through this journey, the overarching objective was to explore and understand the challenges that other devs may face around gRPC and Node.

This is the team that created it: Miri's on the left, then me, Patrick, and Johnny. Miri's actually right there, hiding in the back. And this is the tool's new mascot, my dog Charlotte.

Today's agenda: why load test your servers, then I'll introduce the load-testing tool that we created, then some insights from development, followed by a quick demo. After the demo, I'll talk about some performance metrics, and at the end we'll do a Q&A.

So why load testing? Load testing helps devs identify issues such as system lag, slow page-load times, or crashes under different levels of traffic, during development rather than post-launch. It can help you identify bottlenecks in your system, and knowing how your system performs under various levels of load helps you scale your services effectively. Also, if you've made any promises in your service-level agreements, these load tests can help you meet those guarantees as well.

So what is the tool? Like I said, it's a load-testing tool. There are various aspects of performance testing, such as stress testing, network modeling, and so on; this tool just focuses on load testing, and some key metrics we'll be looking at are CPU usage and latency. It does more than just hammer a server with requests, though: it also gives you a window into what's happening in your system, with some visuals and charts to make things a little more digestible.

So the first insight, the first challenge we ran into, was: how do we dynamically generate or obtain the client stub? There are a bunch of options, to name a few: code generation, server reflection, or giving us a proto file path. For code generation, we'd just ask the user to generate the client and server code from their proto files and share it with us. The pro is that you get compile-time checks, which can catch errors early; the con is that it adds an extra step and can lead to versioning challenges. The second option is reflection, but the server would have to have reflection enabled, which is not always the case. The last method, which is what we went with, is a proto file path. It's pretty simple: you just put your file path in a config file for us, and we use that to create the client stub. Why is this nice? It's super simple, it's easier to manage different versions of your proto files, and you get dynamic updates, switching proto files without recompiling the tool.

This is an example of a simple config. There are a couple of things here: the duration, which is just the length of the load test, then the path to your proto file, your service name, the package name, and the method name. The service, package, and method names here are pulled straight from the official gRPC GitHub repo, so this might look familiar. (There's a small sketch of this approach below.)
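To make the proto-file-path approach concrete, here's a minimal sketch, not the tool's actual code, of building a client stub dynamically at runtime with @grpc/proto-loader and @grpc/grpc-js. The config object stands in for values that would come from the YAML file; its field names are assumptions, and the package, service, and method names are just the standard gRPC Greeter example.

```js
// Hypothetical sketch: create a client stub from a proto file path at runtime.
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Stand-in for values read from the YAML config (field names are assumptions).
const config = {
  protoPath: './protos/helloworld.proto',
  packageName: 'helloworld',
  serviceName: 'Greeter',
  methodName: 'SayHello',
  target: 'localhost:50051',
};

// Load and compile the proto definition at runtime; no codegen step required.
const packageDefinition = protoLoader.loadSync(config.protoPath, {
  keepCase: true,
  longs: String,
  enums: String,
  defaults: true,
  oneofs: true,
});
const pkg = grpc.loadPackageDefinition(packageDefinition)[config.packageName];

// Instantiate the stub and call the configured unary method by name.
const client = new pkg[config.serviceName](
  config.target,
  grpc.credentials.createInsecure()
);
client[config.methodName]({ name: 'world' }, (err, response) => {
  if (err) console.error(err);
  else console.log(response);
});
```

Switching proto files is then just a matter of pointing the path at a different file, which is the "dynamic updates without recompiling" benefit mentioned above.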
These are just the options, the flags, available in the CLI tool. Anything you don't define in the YAML file you can pass as a flag, but if you do define it in the YAML file, the CLI invocation is a lot less verbose.

The second insight we had was observability. Like the previous presentation said, there are three pillars that observability stands on. The first is logs: the detailed descriptions of what the system is doing in an environment. For gRPC, these would be things like client/server communication, request and response payloads, and any errors or exceptions. Metrics are the quantitative data you can aggregate to get a sense of how your system is performing. Traces let you track a request's journey through various services and systems. To get this observability, we just use gRPC interceptors (there's a sketch of one a little further down).

Obviously there are a lot of observability and monitoring tools out there, so why not just use Prometheus or Grafana and all those other tools? It's not that they aren't great; it's that they're overkill for something that could be a really simple process. If you just want to run a simple load test, you don't really need to stand up a whole Prometheus instance and then create a Grafana dashboard just for that.

The third and final insight we had was concurrency. Given that Node has a single-threaded event loop, we had to figure out how to load test a server and simulate multiple concurrent requests. It's not brand new, but Node has fairly recently shipped worker threads that you can use, so we did: we spin up a bunch of clusters, each cluster spins up worker threads, and we use those to drive the load against the server. For data collection, each worker thread uses inter-thread communication to pass messages back to the main thread, which aggregates all that data: things like CPU usage, latency, and throughput (there's a rough sketch of this below as well).

Another interesting metric we used was event loop utilization. Trevor Norris actually talks about this and can explain it way better than me, but to summarize his article: CPU is no longer enough of a measurement to scale applications. There are other factors, such as garbage collection, crypto, and other tasks placed in libuv's thread pool, that can increase CPU usage in a way that is not indicative of the app's overall health. Event loop utilization is basically a timer of how long the event loop is idle versus how long it's active, expressed as a ratio. The nice thing is that you can use it per thread because it's a thread-safe method, so you don't have to worry about memory leaks or anything like that. And because each worker has its own V8 instance and event loop, you can track each instance's event loop utilization ratio.
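To illustrate the interceptor idea, here's a minimal sketch, an assumption about the shape rather than the tool's actual implementation, of a @grpc/grpc-js client interceptor that records per-call latency and status codes.

```js
const grpc = require('@grpc/grpc-js');

const samples = []; // collected per call; aggregate later into p50/p95, error rate, etc.

// Client interceptor: wrap each outgoing call and time it from start to final status.
const metricsInterceptor = (options, nextCall) =>
  new grpc.InterceptingCall(nextCall(options), {
    start(metadata, listener, next) {
      const startedAt = process.hrtime.bigint();
      next(metadata, {
        onReceiveStatus(status, nextStatus) {
          const latencyMs = Number(process.hrtime.bigint() - startedAt) / 1e6;
          samples.push({ latencyMs, code: status.code });
          nextStatus(status);
        },
      });
    },
  });

// Attach it when constructing the client so every request is measured, e.g.:
// new pkg.Greeter(target, grpc.credentials.createInsecure(),
//                 { interceptors: [metricsInterceptor] });
```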
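And here's a rough sketch of the worker-thread fan-out and message passing described above. The numbers, the client setup, and the runLoad helper are placeholders I'm assuming for illustration; only the worker_threads and perf_hooks APIs themselves are real.

```js
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
const { performance } = require('perf_hooks');

if (isMainThread) {
  // Main thread: spawn one worker per core (10 here) and aggregate their stats.
  const workerCount = 10;
  const results = [];
  for (let i = 0; i < workerCount; i++) {
    const worker = new Worker(__filename, { workerData: { calls: 500 } });
    worker.on('message', (stats) => {
      results.push(stats); // per-worker latencies, throughput, event loop utilization
      if (results.length === workerCount) {
        console.log(`collected stats from ${results.length} workers for the report`);
      }
    });
  }
} else {
  // Worker thread: fire the configured number of requests, then report back.
  const baseline = performance.eventLoopUtilization();
  runLoad(workerData.calls).then((latenciesMs) => {
    const elu = performance.eventLoopUtilization(baseline); // this worker's own loop
    parentPort.postMessage({ latenciesMs, elu: elu.utilization });
  });
}

// Placeholder: issue `calls` gRPC requests with the dynamically created stub
// (see the earlier sketch) and return the latency samples.
async function runLoad(calls) {
  return [];
}
```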
All right, time for a demo. I'm already in the project directory, so I'm just going to start my server with a super simple command, npm run gipc server, and it starts on port 50051. Then I have a script ready to run my load test. It says this is the gipc load tester, and it asks how many clusters you want, based on the number of cores in your computer. Mine has ten, so I'll put ten.

How many worker threads per cluster? The recommendation is one, because you don't want to overload the core, or else it's not really optimal. Then you put in how many calls you want per thread; I put 500, so that's 5,000 calls against the server. It then creates an HTML report for you. It creates it right here; I called it dash HTML, but you can call it whatever you want. This is kind of hard to read, but it's pretty basic stuff like your CPU usage, which is obviously going to be really high since we just used all the cores in my computer to run the load test. What we're really looking for is this part of the chart right here: the y-axis on the right is the percentage of time the event loop was utilized over that window. So if you're testing your service and your CPU usage is high but your event loop utilization ratio isn't, that doesn't necessarily mean you need to scale up, because the extra CPU time is probably coming from work outside the event loop rather than from request handling itself. You can't really see it here because this data doesn't show that scenario, but that's the idea. (There's a short event loop utilization sketch after the Q&A.)

Let me see, let's go back to the... well, yeah, that's pretty much it. Oh, so what's next? That was only unary testing, and we're working on supporting the other streaming types as well. Always open for feedback and contributions, of course. Does anyone have any questions?

Audience: The metrics that you showed, are those showing the utilization on the client that's performing the test, or on the server that's receiving the load?

Kenny: Yeah, that's a good question. I was actually trying to figure that part out, because it's all running on my computer, so I'm not completely sure. I believe it's the server, the way I set it up, but I'd have to look into that more, because I actually have metrics being recorded on both sides.

Audience: If they were separate machines, which would it be?

Kenny: It would be the server side, yeah.

Audience: I had a question. Could you use this for really big stress testing? Have you thought of that? Like utilizing ten machines that would all connect, or is that outside of your scope?

Kenny: I've thought about it, but I haven't tried it, yeah.

Audience: This is for npm, right?

Kenny: Yeah.

Audience: So do you plan to add similar tools for Golang?

Kenny: Sorry, what was that?

Audience: Any similar tool, or do you plan to support Golang in the future?

Kenny: Go? Oh, I mean, it should be language agnostic, right? It should be able to test servers written in other languages. I haven't tried that yet, to be honest, but yeah, it's definitely on the roadmap.
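For reference, this is roughly how the event loop utilization numbers in the report can be sampled with Node's perf_hooks; it's a sketch of the idea, not the tool's actual reporting code.

```js
const { performance } = require('perf_hooks');

// Take a baseline, run the load, then compute the utilization over that window.
const baseline = performance.eventLoopUtilization();
// ... run or await the load test here ...
const delta = performance.eventLoopUtilization(baseline);
console.log(`event loop utilization: ${(delta.utilization * 100).toFixed(1)}%`);

// If CPU usage is high while this ratio stays low, the extra CPU time is likely
// coming from work outside the event loop (GC, libuv thread pool tasks), so
// scaling out on CPU alone can be misleading.
```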