Hello and welcome, everybody. The topic of this talk originally started with a long-winded title, something about leveraging Wasm to improve quality. As I was building out the presentation, it became clear that we were actually supercharging our APIs, so I went with the punchier title. You can take your pick of whichever title you prefer for this session. My name is Rohan Deshpande. I'm a managing director and senior engineer at Goldman Sachs, and I've been at Goldman for three years. My teams run several platform areas. We own and operate a mobile platform that's used by multiple mobile applications at Goldman Sachs. Our focus is really on developer productivity: anything to do with improving engineering efficiency, writing code faster, or reducing build cycles is part of my teams' charter. Along with that, we're playing around with generative AI, like everyone else, and I think it's a very promising technology. We also own and operate an API platform for Goldman Sachs, and that is the focus of this talk and the crux of the discussion here. The Goldman Sachs API platform has been around for quite a few years, and it's the foundation of Goldman being an API-first organization. Any time someone's building a service at the firm, they think in terms of what it would look like with an API in front of it, and they start with an API construct. We offer these services either externally or internally. If they're available externally, they're available through our developer portal at developer.gs.com. So if you want to leverage functionality like money transfers or various financial data feeds, you can take advantage of these external offerings, integrate with them, and build your own application on top.
My team provides the infrastructure that powers these API-based services at Goldman Sachs, and we serve millions of requests per day, so it's pretty significant in scale. Each of these APIs has different reliability requirements, so it's a multi-tenant, fairly complex piece of infrastructure. The API platform has been around for quite a few years, so we have the dreaded legacy to support: multiple runtimes as part of this gateway infrastructure. If you step all the way back to the 10,000-foot level, when an API request comes in, it follows a fairly sequential flow. The request goes through a series of steps — call them plugins or filters — which hold the logic used to perform some action on that request. It then reaches your actual business logic, which does some work and sends back a response, and the response follows a similar path back out to your caller. For these steps — filters, plugins — we actually have three different categories to support. The first are what the vendors provide. If you're running Envoy or Nginx, you get a bunch of plugins out of the box which you can use to extend those engines. The vendor or the project itself provides them and does all the testing; you can validate them as you need to, but generally they come packaged in the box, implemented in C++ or a scripting language like Lua. More interesting are the custom plugins that we own and operate, and there are two categories there, too. There's a set of plugins that my teams build and operate on behalf of the firm — shared functionality that's applicable to everyone.
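The sequential request/response flow described above can be sketched as a simple filter chain. The interfaces and names here are illustrative, not the platform's actual code — just a minimal model of plugins running in order on the way in and in reverse on the way out.

```go
package main

import "fmt"

// Plugin is one step in the request/response pipeline.
// These names are illustrative, not the platform's real interfaces.
type Plugin interface {
	OnRequest(req string) string
	OnResponse(resp string) string
}

// PassThroughPlugin is a trivial example plugin that changes nothing.
type PassThroughPlugin struct{}

func (PassThroughPlugin) OnRequest(req string) string   { return req }
func (PassThroughPlugin) OnResponse(resp string) string { return resp }

// Chain runs the request through every plugin in order, calls the
// business logic, then runs the response back out in reverse order.
func Chain(plugins []Plugin, handler func(string) string, req string) string {
	for _, p := range plugins {
		req = p.OnRequest(req)
	}
	resp := handler(req)
	for i := len(plugins) - 1; i >= 0; i-- {
		resp = plugins[i].OnResponse(resp)
	}
	return resp
}

func main() {
	resp := Chain([]Plugin{PassThroughPlugin{}},
		func(r string) string { return "echo:" + r }, "hello")
	fmt.Println(resp) // echo:hello
}
```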
For example, say you want OpenID, or you might want to do some kind of metering on your APIs — that would be a common plugin that my team owns, operates, and provides. But we also have plugins built by customer teams — internal teams that use our platform, we call them customers — primarily because they might have functionality that's unique to their use case. It doesn't make sense to turn it into a firm-wide offering, so it becomes something that team owns and operates and then plugs into this multi-tenant system we have. The other option is they don't want to build it themselves, so they come to my team and say, hey, we need you to build this plugin for us. And we go back and say, you know what, we're so backlogged it's going to be 18 months before we even look at your request. That doesn't scale, because everything needs to be done at the speed of business, a.k.a. yesterday. So what we would love to do is enable each of these teams to build this functionality themselves and provide it as a plugin that we can then run within this shared infrastructure. The challenge with building these plugins, especially as they run in this environment, starts with the choice of language. Typically, as I said, these runtimes support C++ or a scripting language like Lua, but that's not the most common skill set. Developers are generally more familiar with higher-level languages like Go, Java, or JavaScript/TypeScript, and they would prefer to write these plugins in one of those languages. We also have a fairly mature SDLC — software development lifecycle.
We want to be able to take advantage of our investments in CI/CD and in best practices like code reviews, IP scans, and dependency management — things that are getting more and more important as secure SDLCs become a requirement. Ideally you go with these mainstream languages because they're used by thousands of engineers at the firm and have tooling support across the board. Running in a multi-tenant environment, you want security; you want some kind of isolation, both from a runtime perspective and from an operational perspective, because you don't want one plugin causing issues for every other plugin running in that flow. From a compilation and build-system perspective, you want to leverage everything we already have. And finally, a bonus would be code reuse: we have multiple engines, and instead of writing the same business logic multiple times, specific to each engine, ideally we could write it once as a library and integrate it into each of these plugins. So we had a bunch of requirements, and we went looking for something that could solve these use cases. Very quickly we came across WebAssembly — I think it started because Envoy was adding WebAssembly support. We looked at the technologies and decided WebAssembly seemed the most promising of the options out there. The binary format gives efficient execution semantics. It runs in a sandbox. It's portable, so whether you're running on different flavors of Linux, you're still using the same WebAssembly module. Multiple languages can target compilation to WebAssembly. And it's a very extensible system as well. There are quite a few other benefits, but these were the ones that stood out to us and made us say, you know what, let's take a look at WebAssembly.
So when we said, let's take a look at WebAssembly — how do we know it actually works? There's a bunch of material out there, and the ecosystem is moving pretty quickly. Is this something we could actually leverage and end up with a fairly safe but also reliable plugin model for our API gateway system? We decided to do a proof of concept. We had an existing plugin doing account number redaction. Account numbers are fairly standard, well-defined structures — a nine-character string format. So we decided to run a bit of an A/B test and a proof of concept where we would build a plugin using WebAssembly, plug it into our runtimes, and see how it worked from a redaction perspective. The requirements were pretty straightforward: identify the nine-character account number coming back in the response, and redact the first five characters with asterisks, leaving just the last four. Then we wanted to try a few more things. We wanted to implement this in Go as the language, and to get to at least 70% test coverage — because the existing test coverage was, I think, abysmal is giving it too much credit; it was probably in the 10–15% range, and of course that never covered all the use cases you'd like. So we were trying to get to a much higher number here. We wanted to take advantage of all the build and deploy tooling we had and all the investments we're making from an SDLC and CI/CD perspective. And as a bonus, we wanted to see if we could share this code across multiple runtimes — really, whether we could share code between Envoy and Nginx, and how much of that code we could share. So to get started: if you build a plugin, where does it fit in that typical API architecture?
You can see the redaction plugin would sit on the response side. It's a fairly simple plugin: it takes some input and returns the JSON response with the redacted account number. The development workflow is basically: write some code, compile it into a WebAssembly module, and integrate it into each of these runtimes. From a code perspective, the code is actually fairly straightforward Go. I'm not sure if you can see it, but there's a masking function here — it takes a string input, and so on. Nothing magic there. Then there's a handler for the response itself. You can see we're using proxy-wasm — the proxy-wasm Go SDK — to interface with the proxy, grab the response, manipulate it, and swap the response on the stream. But if you look at the code, there's nothing unusual about it. It looks just like any other Go code you would typically write, which is very different from using a scripting language to implement these plugins. And the nice thing we got along with it was the ability to write unit tests, again using these high-level languages. For Go, for example, we were able to write fairly comprehensive unit tests — in fact, we got to about 80-plus percent test coverage out of the box, writing not just positive tests but also a bunch of negative tests to ensure the plugin worked as expected and handled a bunch of edge cases. Once we had the code and the unit tests, next up was compiling to WebAssembly. The question was how. You can use a bunch of standard compilers, but the one we ended up going with was TinyGo. TinyGo is a compiler with WebAssembly as a target, and it creates small, efficient modules.
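The masking function described earlier might look roughly like this in Go. This is a sketch, not the actual plugin code, and it assumes for illustration that account numbers are nine digits — the real format is firm-specific.

```go
package main

import (
	"fmt"
	"regexp"
)

// accountRe matches a standalone nine-digit account number. Treat this
// pattern as a placeholder; the real account format is firm-specific.
var accountRe = regexp.MustCompile(`\b\d{9}\b`)

// maskAccounts redacts the first five characters of each matched
// account number, leaving only the last four visible.
func maskAccounts(body string) string {
	return accountRe.ReplaceAllStringFunc(body, func(acct string) string {
		return "*****" + acct[5:]
	})
}

func main() {
	fmt.Println(maskAccounts(`{"account":"123456789"}`)) // {"account":"*****6789"}
}
```

Because the logic is plain Go with no proxy types involved, negative cases (too few digits, digits embedded in longer runs) are easy to cover with ordinary unit tests.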
TinyGo is actually intended for embedded systems, but we were able to leverage it to build these plugins that we could then run on the server as part of our API pipeline. The command itself took a little figuring out, but it's all documented, so there wasn't much to it. There were a couple of gotchas for some of the other plugins as we moved forward — around garbage collection specifically — so there may be a few other parameters you need to tweak depending on the plugin you're building. But it's pretty straightforward: with TinyGo, you can take Go code and turn it into a WebAssembly module. Once we had the module, the next step was integrating it into the runtimes. This is actually, I think, the most difficult part, because while there is documentation for integrating a WebAssembly module into an existing engine like Envoy or Nginx, the documentation and the systems are moving so fast that it took a little research. Ultimately it came down to an integration file: very simply put, a configuration that you figure out and deploy as part of your package along with the WebAssembly module, and boom, you're ready to go — you have your plugins. Once we had that, we were able to leverage a CI/CD pipeline as well. Among its many steps, these are the most important ones we took advantage of. We could do dependency scanning on the plugin itself to ensure we were using the correct dependencies — nothing with a CVE associated with it. We were able to run the unit tests and enforce code coverage: we set a threshold of 78%, which was greater than zero, the previous threshold. So we got proper code coverage and could treat this as a proper software artifact.
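For illustration, the build command and the Envoy integration file might look roughly like this. File names, paths, and the plugin name are placeholders, and the exact flags and config shape depend on your TinyGo, SDK, and Envoy versions — so treat both fragments as a sketch, not our production configuration.

```shell
# Compile the Go plugin to a Wasm module. -scheduler=none and the wasi
# target are what the proxy-wasm Go SDK documentation recommends.
tinygo build -o redaction.wasm -scheduler=none -target=wasi ./main.go
```

```yaml
# Illustrative Envoy HTTP filter entry loading the module from disk.
http_filters:
- name: envoy.filters.http.wasm
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
    config:
      name: account_redaction        # placeholder plugin name
      vm_config:
        runtime: envoy.wasm.runtime.v8
        code:
          local:
            filename: /etc/envoy/redaction.wasm   # placeholder path
```

The module and this configuration get deployed together as one package, which is what makes the "boom, you're ready to go" step possible.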
We run IP scans as part of all the code that gets checked in and built. The script you saw formed the core of the compilation step — it took the Go code and compiled it into a WebAssembly module — and the follow-up is that we build it into an image, upload it to our container repository, and consumers can pull from there and deploy to these runtimes as they want. So the point is, by using a high-level language, we ended up taking advantage of all our investments in CI/CD and our SDLC. One little issue we ran into: the compilation step turned out to be much slower than compiling standard Go code. We haven't figured that out yet, but what we ended up doing was sharding the compilation — for every five plugins, we run the build process on a separate host, parallelizing the compilation step. So we ended up with similar total end-to-end times on our pipeline. As part of this process — building this one-off plugin as a proof of concept — here are the lessons we learned. Yes, we were able to develop using a familiar, well-supported language like Go, which has proper support within our ecosystem as well as outside. With this language, we were able to take advantage of all our firm-wide tooling — code scanning, package management, dependency management, containers, whatever we had, we could plug right into it — and build out a proper gateway solution with the right plugins baked in, built with custom code. We actually achieved 83% code coverage, and we could probably go higher, but at about 83% we started seeing diminishing returns. Generally, this is no different from any other Go code, so you could go as high as you want.
One nice benefit: out of the box, we were able to share about 35% of the code we wrote between the various runtimes we had — with essentially zero effort and minimal design changes. The core logic, that masking function itself, along with some decoration around it for error handling and whatnot, became the core of the plugin, and we were able to use it across the board. I'm pretty sure we can get higher than 50% with a little more investment in structuring the code, but that's a challenge for another day. The downside — there are lots of positives, but the downside is there was some performance impact. We saw about a 14% reduction, give or take, in throughput per host with this gateway. We were expecting some reduction in throughput, given how the WebAssembly runtime operates within these engines, and 14% was well within the bounds of what we expected. Again, this was out of the box, with basically no performance tuning or tweaking at all. Because we handle hundreds of millions of requests a day, it's in our interest to reduce that hit as much as possible, so we'll keep investing in the performance side and try to get the hit down to single digits, at least, if we can. What we were able to prove here is: you can write a plugin using WebAssembly, and it works almost as well as a plugin written in C++ or Lua running within one of these runtimes. So, lots of very positive benefits. Now, outcomes. We have a ton of plugins — either first-party ones we've implemented or ones different teams have implemented. By the way, the experiments we ran took about nine months from start to finish to get to the point where we said, this evaluation works, this is great, we should proceed with it.
Having decided to proceed, we've been converting our plugins to this WebAssembly-based model, with Go as the implementation language of choice. We've converted 55% of our plugins so far and will keep converting them through the rest of the year. Our first WebAssembly-based plugin is actually going into production in Q4. We're obviously being cautious and careful, because it's a new technology to run in production. We're looking at a blue-green rollout, or a bit of an A/B rollout, where requests go through different pipelines — traffic going through one versus the other — so we can compare as we scale up the traffic going through the WebAssembly-based system. Looking forward, I think we'll see a lot of benefits overall. Now, if you're planning to go the WebAssembly route, a few things to keep in mind. The biggest debate we had was trading runtime performance against development convenience. As a developer, yes, you can write in Go or whatever language you choose, but the downside is: are you willing to accept the performance hit? That's the biggest trade-off, and you need to make that decision very consciously. If you're trying to reuse code across different runtimes, it's essential to carefully separate the core of your plugin from the wrapper code — the shim code you write to integrate with each runtime. As we've been converting these plugins, we've been learning more and more about the right design principles to follow: what goes inside the core and what stays out, keeping the context clear, and converting the context into a reusable entity. Those are some of the patterns we're learning and figuring out.
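The core-versus-shim split might look roughly like this in Go. The package layout, interface, and names are hypothetical; the point is only that the engine-specific shim is the single layer that touches proxy APIs, while the core stays pure and shareable.

```go
package main

import "fmt"

// RedactBody is the engine-agnostic core: pure functions over byte
// slices, with no proxy or engine types anywhere. In a real layout it
// would live in a shared package imported by every runtime shim.
func RedactBody(body []byte) []byte {
	// ...masking and error handling go here; identity for illustration.
	return body
}

// BodyFilter is the thin contract each engine shim adapts to.
type BodyFilter interface {
	FilterBody(body []byte) []byte
}

// envoyShim stands in for the proxy-wasm wrapper: in a real plugin it
// would pull the buffered response from the host, call the shared
// core, and write the result back. Here it just delegates.
type envoyShim struct{}

func (envoyShim) FilterBody(body []byte) []byte { return RedactBody(body) }

func main() {
	var f BodyFilter = envoyShim{}
	fmt.Println(string(f.FilterBody([]byte(`{"account":"123456789"}`))))
}
```

With this shape, an Nginx shim is a second small adapter over the same core, which is where the shared-code percentage comes from.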
As we learn more, we'll probably start publishing these lessons on our developer blog as well. One of the learnings: definitely invest in integration tests. Unit tests are great; integration tests are even better, because you can test the end-to-end functionality. They'll help you find edge cases, especially in the integration with the runtimes — and there will be quite a few, because things like memory usage or ABI incompatibilities are issues you might discover at runtime, not at build time. So it's essential to write a good suite of integration tests and run them consistently. The last thing: the ecosystem is very active. The WebAssembly spec is still evolving, the implementations are evolving, the tools are evolving. It's really important to stay on top of all these changes — something that works today may not work two weeks from now. Taking full advantage of CI/CD, continuous builds, and the tooling around them becomes super, super essential; automation is the key here. Okay, with that I'll wrap up. We've had a very fun journey — we've been at it since the start of the year — we're seeing a lot of benefits, and we're looking forward to rolling out WebAssembly in production. Feel free to check out the GS developer site and our blog. I'm happy to take questions if there are any. Yeah — I'll just repeat your question. The question is: how are we doing telemetry from within the plugins? Depending on the environment we run in, we track things like 400s, 500s, and latencies — those are the three key metrics.
What we wanted to measure, especially with the WebAssembly plugins, was what you'd call the dwell time within the plugin itself: for a request coming through your pipeline, how much time is it spending inside the plugin, and how much time is it spending in the interface from the runtime into the plugin? We wanted both metrics, so we're capturing both of those latencies — those dwell times — to get a sense of how much faster or slower it is compared to a native plugin. We have existing tools we use to publish these metrics — the standard ones you can assume — so they all end up in the same metrics stream, and we have dashboards for them. Next question: what does the deployment infrastructure look like? We're pretty big on containers — it's a Kubernetes implementation. As part of the build process we end up with a container, and then we deploy that container to whichever Kubernetes cluster we want. The complexity is that because we're rolling out new tech, we don't want an all-or-nothing deployment. So we're doing a variation of blue-green, where we have the old stuff running and the new stuff running, a load balancer routing some proportion of requests to fleet A or fleet B, and we keep an eye on metrics, with alarms and whatnot. Fairly standard deployment methodology, I would say. If things work as well as we expect over a period of time, we'll start ramping up traffic. Next question: why do we see performance degradation? The Wasm runtime is actually an out-of-process call.
What's happening is that a request is in your runtime, but the runtime is making an out-of-process call to the Wasm runtime. There's also a cold-start problem the first time through: given a large number of requests, it averages out over time, but if you don't have sufficient traffic, you'll see the cold start fairly frequently. It's an optimization we have to make; we just haven't gotten to it yet. The assumption we made initially was that we'd see about a 25% hit, so we were ready to take that, and we were pleasantly surprised to see it in the teens. I think we can get it to single digits — at 5% to 8% we'd be pretty happy. Why didn't the native plugins have this cost? Because it was all running in process: Lua runs in process, and C++ runs in process here — we're just building it as a monolith. There's a question about which proxies we have: some of our stuff runs on a variation of Nginx, some of it is Apache, some of it is Envoy — you name it, it's there. That's the benefit of running a system that's been in production for decades, I guess. Not all of them support this, though — we picked a couple that are capable of doing it. Any more questions? All right, thank you for coming this afternoon. It was good seeing everyone. And go WebAssembly, I guess.
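The dwell-time measurement discussed in the Q&A could be sketched in Go as a timing wrapper around a plugin step. The names and the print-based "metric" are illustrative — the real system publishes these latencies to its metrics stream rather than stdout.

```go
package main

import (
	"fmt"
	"time"
)

// timePlugin wraps a plugin step and records how long the request
// dwells inside it. In a real gateway, the elapsed time would be
// emitted as a latency metric tagged with the plugin name.
func timePlugin(name string, step func(string) string) func(string) string {
	return func(body string) string {
		start := time.Now()
		out := step(body)
		// Stand-in for publishing to the metrics stream.
		fmt.Printf("plugin=%s dwell=%s\n", name, time.Since(start))
		return out
	}
}

func main() {
	redact := timePlugin("redaction", func(b string) string {
		return b // placeholder for the actual masking step
	})
	redact(`{"account":"123456789"}`)
}
```

Measuring at the wrapper boundary captures only time inside the plugin; the second metric mentioned above — time spent crossing from the runtime into the plugin — would need instrumentation on the host side as well.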