Hello, everyone. Welcome back again. I think I have the honor of doing the same talk twice in the same day, so that's something new for me. If you missed it earlier: essentially what happened was we started a little early, so I'm doing a rerun, and I'll introduce myself and start from the beginning. My name is Sohan Maheshwar. I work in a startup called Fermyon, and we are in the WebAssembly space. I've been working in cloud and tech for a while now; I used to be with AWS before. And I'm super happy to be here because, A, even though I live in the Netherlands, I'm originally from Bangalore, so it's always nice to come back to Bangalore. And also, I'm glad we're talking about sustainability in tech, because tech is at the heart of everything that we do. The latest estimates say that the software industry has roughly the same carbon emissions as the airline industry, which really is alarming. And as decision makers in tech, it's time we did something about sustainability. So I have to start with this formula, which is defined by the Green Software Foundation. They define the software carbon intensity (SCI) of a workload in the cloud as C per R: carbon per unit of rate. Again, it's a rate, so it can be per user, per API call, per machine learning run, or whatever. The C expands to O plus M, where O is the operational emissions, the carbon emitted by your software while it runs on a piece of hardware in the cloud. And M is the embodied emissions: for that hardware to exist and to run many pieces of software over a long period of time, there are carbon emissions, and your software accounts for a fraction of them. Those two combined, per rate, so SCI = (O + M) per R, is the software carbon intensity. And essentially, I want us today to look at everything through two lenses: how serverless WebAssembly can make your software more energy efficient, where you use less electricity to perform the same function, or more hardware efficient, where you use fewer physical resources to perform the same function. Those are the two lenses I want us to think about today. So let's dive into it, and let's actually talk about WebAssembly itself. Anyone here uses WebAssembly or has tried WebAssembly? A few of y'all. Awesome. So just to tell you what it is: essentially, it's another bytecode format. That's the boring answer. WebAssembly is just another bytecode format. It was designed as a portable compilation target, which means you can write your code once in any language, like Python, Go, Rust, JavaScript, whatever, compile it to the WebAssembly format, and run that code anywhere, as long as that anywhere has a WebAssembly runtime. This compile-once, run-anywhere is possible because of runtimes, which are meant to be very portable. And if I say the word Wasm, it's just short for WebAssembly; they're one and the same thing. It was invented for the browser, and hence the name WebAssembly. But sometime around 2018 or 2019, people thought: hey, WebAssembly is fast, it's portable, it's secure, and all of these things might make it good to run on the server side as well. So originally it was only for the browser, but now you can also run it on the server side. And that was made possible by something called WASI, the WebAssembly System Interface, which came about in 2019. Now, for anything to run on a server, you need things like system clocks.
You need access to files. You need random number generators. You need all of this for anything to run on a server, and WebAssembly didn't have it until WASI was invented. So this is a very new piece of technology that's actually available server side. And the good thing is it works independently of a browser, so all of the security sandboxing of WebAssembly in the browser is now available to us on the server side as well, and that sandboxing covers things like input and output too. Just to visualize how it works: you have your code, written in any language, TypeScript, JavaScript, Python, Go, Rust, whatever. You compile it to the Wasm format, and it can run in any place that has a Wasm runtime. So it's independent of architecture, Intel or ARM, and independent of operating system, Linux, Windows, macOS, whatever. You can run it on Kubernetes, on your Raspberry Pi, in the cloud. You can run it anywhere, as long as there is a Wasm runtime. Now, all of these things I've mentioned, we're going to look at through the lens of an open source framework called Spin. It's completely open source, check it out on GitHub. About 4,500 stars right now, and it supports 15-plus languages. It's a framework for composing serverless WebAssembly apps. And the benchmarks I'll show today, comparing serverless WebAssembly against other things that exist in the market right now, were done using Spin. So there are four things that make WebAssembly pretty cool to run on the server side, and again, look at them in terms of being energy efficient and hardware efficient. When you're looking at scale, a metric like binary size actually makes a big difference. Think of all the deploys we do on a day-to-day basis. Think of the number of deployments and builds we run just for testing, every single day. Just to give you an example, a simple Rust Hello World compiled natively is about 2 MB, and if you do an ahead-of-time compilation, where you optimize it for the architecture and the OS, you can bring that down to about 300 KB. Similarly, a Spin app, an HTTP API, which I'll show a live demo of, is about 2.3 MB with just-in-time compilation, and you can optimize it ahead of time down to about 1.1 MB. These are really small binary sizes compared to the apps we run right now. So let me just show you a small piece of code. This is the CLI. If I can bring it up, there we go. I've installed Spin, of course. I'm just going to do a simple spin new. I'm going to choose Rust, because Rust is low level, and I'm going to call it kubeday-india. This is a simple Hello World application, nothing extremely fancy. So I'll go into that directory and open it in my Visual Studio Code right here. All right. So, anyone here familiar with the concept of serverless in general, or has used either Lambda or Azure Functions? A few of you, I'm sure. Yes, perfect. I'd be surprised if people weren't. Oh, did I pick Swift? I'm so sorry. I'm just going to do this again; I want to show it to you in Rust. My bad, folks. I'm just going to do a spin new again. It's hard to look at that screen there while writing code. OK, there we go. That's Rust, finally. OK. I'm going to call it kubeday this time. All right. And I'm going to open it in my browser. Sorry, in my VS Code. Yeah.
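Just so you know what to expect, the handler that the template generates looks roughly like this. This is a minimal sketch assuming a recent version of the Spin Rust SDK; exact type and function names vary between SDK versions:

```rust
use spin_sdk::http::{IntoResponse, Request, Response};
use spin_sdk::http_component;

// A Spin HTTP component: one function takes the incoming request
// and returns a response. No server daemon to write or maintain.
#[http_component]
fn handle_kubeday(req: Request) -> anyhow::Result<impl IntoResponse> {
    println!("Handling request to {:?}", req.header("spin-full-url"));
    Ok(Response::builder()
        .status(200)
        .header("content-type", "text/plain")
        .body("Hello, KubeDay")
        .build())
}
```

That one function is the entire app.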
So you don't have to be familiar with Rust and things like that. It's fairly simple, and the code is straightforward. If you're familiar with the concept of serverless, it's a simple request in, response out. There's a request coming in, you can see it here, and a response going out. I'll just change this to KubeDay for now. I don't know if that spelling's correct. Yeah, it is correct. I think my eyesight's going bad and I might need glasses. OK. I'm just going to do a spin build, which just builds this code, and then I can run it locally. And we'll actually look at the binary size, just so you know I'm not just putting a metric out there. There we go. It's building the entire app, and this is using Spin, the framework I was telling you about. The build is done now, and I'm going to do something called spin up, which basically runs this app on localhost. And I can just do a simple curl to that localhost endpoint. There we go. You can see "Hello KubeDay". So it's actually running right now. And if I go deeper into the directory and do an ls -l, you can actually see the .wasm file. The size is in bytes, so it's about 2.3 MB. That's what I was telling you about: the binary size of an HTTP serverless app is only 2.3 MB when it's built as WebAssembly using Spin. So let me go back to the slides. Now, typically when you're making decisions around metrics like sustainability, there are trade-offs you have to make, and the trade-off in this case is the startup time itself. A natively compiled binary starts a little faster than the Spin WebAssembly version; the Wasm build is about 2.3x slower to start than the native one. The next advantage is portability, where you build this once and run the same piece of code anywhere. The same 2.3 MB file I showed you will run across platforms, across OSes, and so much more. And the fourth thing is security, because WebAssembly is sandboxed by default. Nothing can be reached from that module I just showed you unless you allow it. Even if it needs access to a file, you have to explicitly grant it access to that file. If it has to make an outbound HTTP call, you have to explicitly allow that call. Similarly for inbound HTTP calls. Again, this provides a great deal of security too. Just to compare it with Docker, because I'm guessing that's a piece of software most of us are familiar with: on the left, you see a Spin WebAssembly app in Python, a simple app. On the right, you see a Dockerfile with Flask. If this is the file size in Docker, the same thing in Spin looks kind of like this. And in our benchmarks, we actually saw a difference of 23 MB versus 550 KB. Again, look at it in terms of being more energy efficient. Imagine making builds and running production workloads at that scale using WebAssembly, where the size of your binary itself is reduced so much, thereby reducing your carbon emissions. So we spoke about WebAssembly. Let's talk a little bit about the serverless aspect. I'm assuming most people here are familiar with the concept of serverless. I'm old enough to remember using data centers back in the day. You had this physical machine in your office that people guarded very closely, and you had to keep the AC running so that it stayed cool.
But we have gone from that to things like EC2 and virtual machines, to containers with Kubernetes and Docker, to the concept of serverless. And we really see the next evolution of serverless being in the WebAssembly space, for the reasons I just spoke about. To understand serverless, you have to understand how the underlying system is actually being used. When I talk about serverless, I could say it's a type of application which is short-lived and completely event-driven: when an event happens, a piece of code runs in the cloud, gives you a response back, and that's it. You can also think about it as a development model where you, as a developer, don't have to write any server code. You don't write server daemons or maintain that server. All you have to do is focus on your business logic. A lot of people refer to this as FaaS, or functions as a service. It's become popular of late because you're focusing only on your business logic, the scaling is faster, and you're utilizing your hardware better. That's the key. And because of this, you're greener, and you save money. So how do you utilize hardware better? To understand this, you need to understand the concept of multi-tenancy. And this, I think, is the key to understanding why serverless is actually great. With multi-tenancy, you're saying: you have a piece of hardware in the cloud, and you want to run as many applications on it as possible to extract value from that piece of hardware. Imagine you had this huge server running somewhere, but just one small piece of code that ran there all day. You're not extracting enough value from that hardware, and you're not being hardware efficient. With multi-tenancy, you're running many tenants on that same piece of hardware, and each of those tenants is independent of the others. Their workloads, their usage patterns, all independent of each other. The more independent they are, the better you're utilizing that hardware, and over a long period of time you've actually extracted the full value of that piece of hardware. In fact, we ran some benchmarks. On your left, so my right, you'll see two serverless functions running in the cloud. The idea is that the gap between your peak and your average should be as small as possible. Because fundamentally, as people designing for the cloud, you don't want to provision for your peaks; that's when things get expensive, and your carbon emissions are probably higher because you have idling machines. So on the left, you can see function A and function B running. Your average is about 10, and your peak is at about 17, a difference of seven. But on the right, the minute we added two more functions, so more tenants on the same piece of hardware, the peak and the average of course increase, but the gap between the two reduces. And that's really where the sweet spot is: reducing the peak-to-average gap to extract value from a multi-tenant system. And this multi-tenancy has actually increased with each wave of cloud computing. In the early days of the data center, you had just the one tenant on one piece of hardware. With things like EC2 and virtual machines, you could run a few apps on one piece of hardware.
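Just to give some intuition for why that gap shrinks, and this is a back-of-the-envelope sketch rather than one of our benchmarks: if you model each tenant as an independent workload with average load mu and a peak about k standard deviations sigma above that average, then packing n such tenants onto one box gives roughly

\[
\text{average} \approx n\mu, \qquad \text{peak} \approx n\mu + k\sigma\sqrt{n}, \qquad \frac{\text{peak}}{\text{average}} \approx 1 + \frac{k\sigma}{\mu\sqrt{n}}
\]

So as the number of independent tenants grows, the peak creeps closer to the average, and less capacity has to sit idle just to absorb spikes.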
Now, with things like Wasm runtimes, you can actually run very, very dense workloads, where you have many tenants running at the same time, simply because of how lightweight these programs are. You saw the binary size, only a couple of MB, so you can really pack these runtimes and these applications onto the same piece of hardware using WebAssembly. So the idea of serverless WebAssembly suddenly becomes that much more powerful, because you're saving money and you're becoming greener. Now, I used to work at AWS, and there's a very interesting paper you should read if you're interested in serverless. I've linked it at the bottom. AWS Lambda is built on Firecracker, which is a micro-VM, and that team wrote a paper where they laid out the six ideal characteristics of a serverless unit. So if we were to have the ideal serverless unit, what would its characteristics be? The first one they said was isolation. Like I said, running many tenants on one piece of hardware where they don't interfere with each other, because we wouldn't want that. The second is density, where you can run thousands of apps or thousands of functions on the same machine with minimal waste. Extract value from that hardware. The third, of course, is performance: near-native performance in the cloud, in a multi-tenant system. Nothing like it. The fourth they said was the ability to fast switch. When an event triggers that application, it has to start, do the thing it's supposed to, and then shut down. So the idea of a cold start time. The fifth is soft allocation, which means you can overcommit resources like CPU, memory, disk, and so on. And the sixth is compatibility, because at the end of the day, we all want to use our libraries and other pieces of code, and they have to be compatible with what you're running in the cloud. So we actually ran a benchmark between a micro-VM like Firecracker and WebAssembly, using those same criteria. In terms of isolation, both provide good isolation, because micro-VMs like Firecracker are completely sandboxed, and WebAssembly, by default and by design as a technology, is sandboxed already. Then there are a couple of places where WebAssembly really shines. In terms of overhead, to run thousands per node with the micro-VM, you needed 48 cores, 382 GB of RAM, and 3,360 GB of disk. The same thing with WebAssembly, because it's so lightweight, ran on 8 cores with 32 GB of RAM and 100 GB of disk. And if you're wondering what makes it so lightweight, it's the technology itself. If you look at the innards of how WebAssembly is designed, it has a reduced instruction set, so it's smaller already, and it uses a stack-based VM, so it's a little more efficient in how it's built and eventually compiled. That's a different talk, but that's what makes it so lightweight. In terms of performance, micro-VMs are near-native, and so is WebAssembly, so very similar performance. In terms of the ability to fast switch, or to cold start, WebAssembly really shines again, because things like Lambda and Azure Functions do have cold start times, and the way companies work around that is by keeping instances warm, which is not really serverless, because you're not scaling down to zero. So your cold start times there are typically about 125 ms, whereas with WebAssembly we were able to see cold start times of less than 1 millisecond. So the ability to fast switch is very good.
The thing is, WebAssembly is still largely untested in big production deployments, so over the next couple of years we will see what performance looks like at scale, whereas micro-VMs are, of course, already running in production with oversubscription ratios as high as 10x. We'll probably get better too. And in terms of compatibility, with micro-VMs you have Linux plus KVM only, whereas WebAssembly is platform agnostic, OS agnostic, architecture agnostic, and supports a bunch of languages, as long as they are compatible with WASI, which is what I mentioned earlier as well. As you can see, the performance is very good, and it really does look like the next evolution of serverless computing, which, again, is more energy efficient and hardware efficient when you use it. We have a very quick demo, a prerecorded one, of how we were able to make 10k calls to about 500 apps in 10 seconds using WebAssembly. This was run locally, so I've prerecorded it. This is the architecture of what you're going to see. We're using Nomad as our orchestrator here, and there are 500 Spin apps and about 10k calls to those apps. Essentially, the same Hello World app I showed you earlier, those are the apps that are running here. This is our Nomad dashboard. You can check that it's playing now. As you can see, we have Traefik, we have Bindle, we have our own multi-tenant version of Spin called Spin Multi-tenant, and we have our own garbage collection as well. What you're going to see in this demo is a text file with the URLs of those 500 apps. We're going to run a load test of 10k calls, and we'll watch the CPU and memory usage at the same time. So we're going to start with the deployment of the 500 apps right now. The key thing to watch is the memory usage that you will see right here. This is running on my colleague Kate's MacBook Pro, and these are all the apps: about 500 simple Hello World Rust apps running. As you can see, there's CPU on the left and memory on the right. Right now, it's at about 20% CPU and about 50% memory, a standard local MacBook. And we have a text file here with the 500 app URLs listed. You can take a look; these are the top five, just to show you what's in the file. And yeah, if you do a curl to any of these URLs, you get the simple Hello World response, and that's it. Now, we built our own load testing tool, because engineers like to engineer. It's very similar to Siege or any other load testing tool you may have used. We've said there are going to be 25 concurrent users, and the number of requests per user will be 400, so that's 10,000 in total. We'll also add some jitter to these calls to mimic real life, a delay of about 0.01 seconds. Keep an eye on the CPU and memory usage on the left when this starts. So we're going to start this right now. It's started, and you can see the CPU and memory usage slowly increase. It goes up to about 90% CPU, and memory goes up to about 52%. And it's done. The average latency was about 9 milliseconds, and the whole test took about 5.8 seconds, which I think is quite good. And if you look on the left, you can see the CPU and memory have already come back down to the normal levels they were at before the test. And again, this is because of the performance of WebAssembly.
It's able to bring up all of these apps, serve calls to them, and then scale down to zero in no time, taking up very little memory. Again, being more hardware efficient and energy efficient. If I had to do a two-slide review of where we are today in terms of hardware utilization, this would be it. There are so many reports of how, with things like Kubernetes clusters and cloud deployments, there's so much unused and completely underutilized CPU out there in the cloud. There was an article by Sysdig recently, and they said 69% of CPU is unused in containerized cloud deployments. There's a very interesting point in the second bullet there: an idling computer still draws about 30% to 60% of its power, and power draw is not proportional to utilization, because using the machine more doesn't increase its consumption proportionally. A system that's running is supposed to be used. So the way to get maximum value from a system is by using it, not by keeping it idle in the cloud, because that is not being hardware efficient. That is actually increasing your carbon footprint. So think about getting maximum utilization out of that hardware, and the way to think about it is that the right kind of serverless is the solution. In another report, they said that about 81% of Azure Functions are invoked once per minute or less. So there is a significant cost to keeping these serverless applications warm, only to provide that speed and avoid cold starts, and usually you, as the customer, bear that cost. So again, maybe with WebAssembly, some of these things will slowly start changing. And we know that this is a priority for tech these days, because just a month and a half ago, I think, Meta launched their own serverless service as well, called XFaaS, where they're claiming about 60% utilization of their hardware, which is very good. Of course, by using that, you do have vendor lock-in, because you're locked into Meta's solution, but I think it's worth checking out. The thing I'm really excited about is something that was just launched by the Bytecode Alliance, which is like the CNCF equivalent for the WebAssembly world, called the component model. Now, there's a very interesting thing called the 2,400-hour problem. Take any random task, like, say, parsing URLs. Fairly simple to do: you get a host, a protocol, some data. Yet every language has multiple URL parsers of its own. Rust has one. Python has one. Whatever. What if we could use just one, written once and shared across languages? The component model will actually deliver on that promise very soon. So say you're building a JavaScript app, and you need a YAML parser, and you need a date formatter. You don't have to write either from scratch, nor is there any necessity that you use a JavaScript YAML parser or a JavaScript date formatter. With the component model, you can pull in a Rust YAML parser and a Python date formatter. And that's it. And again, you can be more hardware and energy efficient. This was just launched. How it works, essentially, is that you have these components, or core modules. Any module can export some functionality, and any module can import it. These modules don't even have to be written in the same language. One can be written in Rust, the other in Go, for example, and both can be combined to create a new component which now does something.
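To make that a little more concrete, here's a rough sketch of what those shared interfaces could look like in WIT, the interface description language the component model uses. The package, interface, and world names here are made up for illustration, and WIT is still evolving, so treat this as a sketch rather than the definitive syntax:

```wit
// A hypothetical package of language-agnostic interfaces.
package example:toolkit;

// Could be implemented once, say in Rust, and reused from any language.
interface yaml-parser {
  parse: func(input: string) -> result<string, string>;
}

// Could be implemented in Python and consumed by the same app.
interface date-formatter {
  format: func(epoch-seconds: u64) -> string;
}

// A JavaScript app just declares what it imports; it never needs to know
// which language each implementation was written in.
world app {
  import yaml-parser;
  import date-formatter;
}
```

Each implementation compiles to its own component, and a composition tool links them together against that world, so the consuming app only ever sees the interface.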
And this opens up so many different possibilities for how we write software compared to how we have been writing it. Check it out. It's very new, and of course there's a lot more work to be done by the Bytecode Alliance and others. But it opens up possibilities, because how we build CI/CD today looks like this. On the left, you have a YAML file for something written in Go, a project called OTH, and you can see there are multiple Go versions in the same YAML file, multiple builds. This is how we write our YAML files for CI/CD. On the right, you see Docker; this is straight from the Docker documentation. For each build, for testing, for packaging, you're throwing in your OS, different language versions, different architectures. That's a huge binary with a big footprint. With the component model, you wouldn't need any of that. All you have to do is one build. You have a runtime which already works across architectures, across operating systems, across whatever. And you don't even need to package anything, because components are already packaged. I don't even need to run a benchmark here; you can tell that this is already more energy and hardware efficient just by looking at it. So this is really what the component model unlocks. Keep an eye on it, because I really think it's going to change how we build serverless, event-driven, and microservice architectures very soon. Anyway, for next steps: if you want to build your first WebAssembly app, check out Spin, it's on GitHub. There's a blog post on the 2,400-hour problem, so you can check that out as well. And also check out this blog post about carbon-neutral AI inferencing. We had friends from Civo speak here as well. We work with them and use their hardware to do AI inferencing. And Civo works with Deep Green, and Deep Green does something cool: they take the heat generated in data centers and use it to reheat swimming pools in the UK. So your AI inferencing can actually be carbon-neutral, which I think is a very cool thing. So there are different ways to look at sustainability in tech. Anyway, I hope that opened your mind to the idea of sustainability in tech. Do check out WebAssembly and serverless. I'll be around if you have any questions. It was lovely talking to you. Feel free to connect with me on LinkedIn or wherever, and enjoy the rest of the conference.