OK. Thank you, everybody. This talk is with Angelo, who is not here; he's a colleague at IBM, so I'm speaking on his behalf. So first, this is Angelo, if you want to contact him; he's pretty much the one writing the code. But to be honest, there are also a few other contributors. A lot of people have contributed to this, like Evan and Matt in the past, and also Bimari Belinda, who was actually the lead for this and did a lot of the work. Unfortunately, she's at GitHub now. Maybe fortunately, I guess; things happen. Anyhow, I basically review code these days. Yeah, I know, I know.

So what's the problem statement? I think Salaboy talked about this. The issue is that in a serverless environment, you often want to call a service, but you don't necessarily want to wait for the response. Maybe the computation takes a long time, or some slow network communication needs to happen. So there's a series of use cases, and I'll mention some of them. There's the situation where you don't want request-response; you want fire-and-forget. There are other models for serverless too, so this is just one aspect of it. And for this particular problem, the asynchronous component can help you, and I'll show you that.

So what are the use cases? For instance, in AI, where you have to run a model, or actually train a model, you may need to crunch a lot of data before you get the result. So you make the request, all that computation happens, and you get the response at the end. That's just one use case. And at IBM we deal a lot with existing technology that may be doing all kinds of computation, maybe even spinning off a team with a bunch of people working on it, before you get a response. In that case, you don't want to wait.
So that's the main thing. So let me show you the demo. It's about a minute and 40 seconds, and I will narrate it.

Here's an application. If you notice, it has the networking.knative.dev ingress class set to async. If you have the Knative async component installed on your cluster and you call this application, by default it just waits for 10 seconds. That's what the application does, right? It just waits for 10 seconds. Imagine it's more than 10 seconds, like 10 minutes, 10 hours. And you can see the curl just sitting there waiting. But if you now curl it and pass a header, and this is the new piece that we added, "Prefer: respond-async", you'll see that it returns immediately with a 202, which in HTTP means the request was accepted. The computation will still happen, and you can check for the result later. And of course, you have to set that annotation on your application.

But we've also made it even better, whereby you can configure your cluster to always be async, so that all the services deployed in that cluster will be async. That's what we're showing you here: your application doesn't have to set anything itself, and requests will be async when you pass the header. Now, how do you change your cluster to always-async or not? You have to patch the config-network ConfigMap. This works with all the different network ingresses: Istio, Contour, and so on.

So this is pretty much it for the demo. Let me now switch to slides to give you a big-picture view of the architecture. This diagram is maybe the best one to explain it. The key thing to realize here is that we are making very few changes to the Knative architecture. Everything in green is what we are adding.
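To make the demo concrete without a cluster, here is a minimal sketch of the behavior it shows; this is illustrative Python, not the actual async component, and the handler, port, and timing are all made up for the example. A plain request blocks for the full work duration, while a request carrying the "Prefer: respond-async" header gets an immediate 202 Accepted and the work finishes in the background.

```python
# A sketch (not the real async component) of the demo's behavior: block on
# the request unless the client sends "Prefer: respond-async", in which case
# answer 202 Accepted immediately and finish the work in the background.
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

WORK_SECONDS = 2  # stand-in for the demo app's 10-second wait
results = {}      # where background work records its outcome

def slow_work(request_id):
    time.sleep(WORK_SECONDS)
    results[request_id] = "done"

class AsyncAwareHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("Prefer", "").lower() == "respond-async":
            # Fire and forget: accept now, compute later.
            threading.Thread(target=slow_work, args=(self.path,), daemon=True).start()
            self.send_response(202)  # HTTP 202 Accepted
            self.end_headers()
        else:
            # Classic request-response: the caller waits the full duration.
            slow_work(self.path)
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"done\n")

    def log_message(self, *args):
        pass  # keep the demo quiet

def serve(port=8080):
    server = HTTPServer(("127.0.0.1", port), AsyncAwareHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With this running, `curl -H 'Prefer: respond-async' localhost:8080` returns right away with a 202, while a plain `curl localhost:8080` sits for the full WORK_SECONDS, which is exactly the contrast the demo video shows.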
So the way to think of it is: when you make a request to a service, say you're curling that service, instead of going to the service, the request goes onto a queue. And then there's a producer and a consumer. The producer is putting stuff on the queue, and the consumer is consuming it. Basic Computer Science 101.

Now, of course, how do you make it scale? And does it scale? I'm sure a lot of you are asking that. Well, we did some initial testing, and this is what we found out. If you do it synchronously, meaning without async, you get about 153 requests per minute. With async, we get a 392x improvement. Now, obviously, people like Evan will ask: how did you test it, with which application? This is an initial result; I'd love for you to try it and let us know. But you get a significant improvement because you're not waiting, right? Things are just put on a queue, and the consumer can consume them as fast as it can. We're just decoupling. Again, there's no magic; it's basic computer science. But if you do this to your Knative cluster, you all of a sudden have asynchronous requests to all your services. That's the beauty of it.

So if you have any questions, please go to the async component, try it, ping Angelo and myself, and we'll try to help you. So thank you for your attention.
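The producer/consumer decoupling described above can be sketched in a few lines; this mirrors the green boxes in the architecture diagram in spirit only, since the real component puts a durable queue between the ingress and the service, and all names and timings here are invented for illustration.

```python
# A sketch of the queue-based decoupling: the "producer" side accepts
# requests instantly by enqueueing them, and a separate "consumer" works
# through the queue at whatever pace it can.
import queue
import threading
import time

work_queue = queue.Queue()
completed = []

def producer(n_requests):
    # Accepting a request is just an enqueue, so it returns immediately,
    # like the 202 in the demo.
    for i in range(n_requests):
        work_queue.put(i)

def consumer():
    # Drains the queue at its own pace, independent of request arrival.
    while True:
        item = work_queue.get()
        if item is None:  # sentinel: stop once the queue is drained
            break
        time.sleep(0.01)  # stand-in for the real computation
        completed.append(item)

worker = threading.Thread(target=consumer)
worker.start()

start = time.time()
producer(100)                      # "accepting" 100 requests...
accept_time = time.time() - start  # ...takes almost no time

work_queue.put(None)  # tell the consumer to stop
worker.join()
print(f"accepted 100 requests in {accept_time:.4f}s, processed {len(completed)}")
```

The point of the sketch is the asymmetry: accepting all 100 requests is nearly instantaneous because it's decoupled from the roughly one second the consumer needs to work through them, which is where the throughput gain in the talk's numbers comes from.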