So, hi everyone. I'm super excited to be presenting at my third in-person Kubernetes on Edge Day. Unfortunately, my co-speaker is not able to make it, but I have a recorded video for his portion of the talk. Today's talk is all about orchestrating machine learning on edge devices with Akri and WebAssembly. We are in an era with a huge variety of edge devices, primarily cameras and sensors, and the biggest issue when dealing with these devices is that we cannot really run Kubernetes on them. So we'll be looking at how you can leverage an open-source project called Akri to utilize these edge devices from a standard Kubernetes cluster, and how you can efficiently run more compute-intensive tasks like machine learning on the edge with the help of WebAssembly. A quick introduction: I'm Shoaia, a DevRel engineer at Millicerts, and also a Metz Ambassador. My co-speaker is a student at the University of Toronto. I'd first like to start by giving an introduction to WebAssembly. I've given a couple of WebAssembly-related talks at Kubernetes on Edge Day before; if you haven't watched them, you can look at previous years' sessions to see how you can manage running Kubernetes on the edge with the help of WebAssembly. Here I'll just give a quick gist of what WebAssembly is without spending too much time on it. At its core, WebAssembly is a binary instruction format for a stack-based virtual machine: you take programs or functions written in many different programming languages and compile them into this portable binary format.
So, what I was saying is that WebAssembly is a binary instruction format, and it is designed as a compilation target. What does that mean? You can take functions written in many programming languages, like C, C++, or Python, regardless of whether it's a systems language or an object-oriented language, and compile them down into this binary format, which can then run across a whole host of different devices. It started off as a browser technology, but people very quickly realized that it can also run very efficiently on edge devices and on the server side. So you're seeing a lot of cloud-native WebAssembly these days: in serverless functions, and in Kubernetes on the edge as well. One of the biggest benefits is that WebAssembly binaries are very small. Compared to your standard container images, which can range from a few megabytes to hundreds of megabytes in size, WebAssembly modules are typically a hundredth to a thousandth the size. And since it is a very low-level bytecode, the performance you get is fairly close to what you'd get from a standard native binary generated with Rust or C++. So you get the portability of being able to use it on many different types of devices: the WebAssembly module itself does not care which device architecture you run it on. And we know that in edge architectures, whether you're talking about RISC-V, x64, or x86, these architectures are all very distinct from each other.
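To make the compilation-target idea concrete, here is a minimal sketch in Rust (the function name and threshold are illustrative, not from the talk): the same source builds as a native binary with plain `rustc`, or as a small `.wasm` module once the `wasm32-wasip1` target is installed.

```rust
// Illustrative sensor-threshold function. The exact same source builds
// natively (`rustc main.rs`) or as a WebAssembly module
// (`rustc --target wasm32-wasip1 main.rs`), producing a .wasm file that
// is typically tens of kilobytes, versus megabytes for a container image.
pub fn classify(reading: f32) -> &'static str {
    if reading > 0.8 { "anomaly" } else { "normal" }
}

fn main() {
    println!("{}", classify(0.93)); // anomaly
    println!("{}", classify(0.12)); // normal
}
```

The resulting module carries no OS or libc baggage of its own, which is where the size advantage over container images comes from.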
So, WebAssembly provides a way to have a single compilation target and then deploy it across different device architectures without having to worry too much about the architecture itself. Of course, the WebAssembly module itself cannot really do anything on its own. There is a well-known diagram by Lin Clark that illustrates this: similar to how the kernel makes system calls on behalf of your applications so they can access memory or storage, in WebAssembly we have the WebAssembly System Interface (WASI), which allows your module to make system calls to reach file resources. The way WebAssembly works is that it is enclosed in a sandbox model, a security feature that does not allow your WebAssembly module to directly interact with file resources or even make networking calls. To make that happen, we have to use WASI, which uses WASI system calls to interact with external file resources and the network. And as I've touched on briefly, we know the edge is pretty complicated, right? The biggest issue is the lack of compute. Then there are the heterogeneous architectures, because every type of device architecture you come across is distinct from the others. That means if you have a system dealing with different device architectures, you'll probably have to maintain different binaries or different scripts to run whatever workload you're trying to run on these distinct devices, and that can add up to additional costs as well.
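As a small illustration of that sandbox (the file name and runtime flag below are examples, not from the talk): the Rust code uses an ordinary `std::fs` call, but when compiled to a WASI target the read only succeeds if the host runtime explicitly grants access to the directory, for example with `wasmtime run --dir=. app.wasm`.

```rust
use std::fs;
use std::io;

// An ordinary file read. Compiled for a WASI target, this becomes a WASI
// system call, and the sandboxed module can only reach directories the
// host runtime pre-opened for it (capability-based security). Run
// natively, it behaves like a normal file read.
fn read_config(path: &str) -> Result<String, io::Error> {
    fs::read_to_string(path)
}

fn main() {
    match read_config("config.txt") {
        Ok(text) => println!("read {} bytes", text.len()),
        Err(err) => println!("access denied or file missing: {}", err),
    }
}
```

The point is that nothing in the module itself grants access; the host decides at instantiation time what the module is allowed to touch.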
And that is why you should consider using WebAssembly for the edge. I've already mentioned how much smaller it is in comparison to containers. It's also a lot faster to start. You're probably aware of the boot time, or cold-start time, it takes to start any container running on an edge device: containers typically take anywhere from a few hundred milliseconds to a few seconds to start, whereas a WebAssembly module's cold start is far shorter, typically in the microsecond-to-millisecond range. These are some of the main reasons why you should consider running WebAssembly on the edge in comparison to containers. That doesn't mean WebAssembly completely replaces containers at this point, though, just a note here: containers are very mature at this point in time, while WebAssembly, especially on the edge, is still growing. For example, networking in WebAssembly right now is not as optimized or as well built out as it is with containers. So the idea is to run them together, but these are some of the reasons why you should at least consider it. And now I'd like to quickly introduce Project Akri. The biggest issue we've found with edge devices, especially devices with very little compute, is that they are not capable of running Kubernetes. We know that Kubernetes requires a lot of resources, and these devices are simply not powerful enough to run it. But imagine if you could treat these edge devices as Kubernetes resources and just run workloads on top of them, without having to run Kubernetes itself on them. That is essentially what Akri lets you do.
So Akri will detect these edge nodes, or leaf devices, and these devices can be anything: they could be cameras deployed on the edge, they could be any kind of sensor, right? Essentially, Akri allows you to detect these edge devices, treat them as standard Kubernetes resources, and then schedule jobs or run pods against them, similar to how you would with any standard Kubernetes resource. Here's a quick look at the architecture. What you'll see initially is the standard Kubernetes control plane that you'd normally have. Then, over here, we have the leaf devices: all the different categories of devices, like cameras and sensors. Alongside the main Kubernetes control plane, you also have the pieces dedicated to Akri: the Akri controller that you saw on the previous slide, and the main components, which are the Akri agent, the custom broker, and the discovery handler. Let's talk through each of these one by one. The discovery handler is what continuously tries to detect edge or leaf devices. As soon as it detects a new device in its vicinity, that device is represented as a Kubernetes resource, and a custom broker can be deployed to expose it; the Akri agent is then able to schedule jobs and run pods against these edge devices, right? So essentially your Kubernetes cluster becomes a lot more efficient: you're able to make use of these edge devices as custom resources inside your cluster. I think it makes it a lot easier to work with edge devices.
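As a rough mental model of that flow, here is a toy sketch (deliberately simplified; these are not the real Akri APIs or resource schemas): a discovery handler probes for leaf devices, and each device it finds is surfaced as a named instance that the cluster can then schedule work against.

```rust
// Toy model of Akri's discovery flow -- NOT the real Akri API.
// A discovery handler probes for leaf devices; each discovered device
// becomes a named "instance" that the agent can expose to the cluster
// as a schedulable resource.
#[derive(Debug, PartialEq)]
struct Instance {
    name: String,
}

// Hypothetical probe. In a real deployment this would be a protocol
// scan such as ONVIF (for IP cameras), udev, or OPC UA.
fn discover(configuration: &str, endpoints: &[&str]) -> Vec<Instance> {
    endpoints
        .iter()
        .map(|e| Instance {
            name: format!("{}-{}", configuration, e.replace('.', "-")),
        })
        .collect()
}

fn main() {
    for inst in discover("akri-onvif", &["10.0.0.7", "10.0.0.9"]) {
        println!("exposing resource instance: {}", inst.name);
    }
}
```

The real project does this with Kubernetes custom resources (a Configuration describing what to look for, and Instances for what was found), but the probe-then-expose loop above is the core idea.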
And this is a very efficient way of being able to leverage more and more devices inside your cluster. So that is the basic idea of how we can orchestrate a host of these networked edge devices with the help of Akri. Now, of course, the big question arises: why use WebAssembly with Akri? Because today's talk is all about how you orchestrate these edge devices and run heavy workloads with WebAssembly and Akri together. Akri is a great way to detect these devices and then run Kubernetes jobs against them, right? But then the question arises: how can we run these jobs efficiently? If you were to run containers, it would be very slow, because again, these devices have very little compute. That is where WebAssembly, and its suitability for the edge that I introduced before, comes into the picture, and we are essentially marrying the two together, right? So you run these WebAssembly workloads inside your Kubernetes pods. Because these WebAssembly workloads are much smaller in size, and because you also get a lot of safety: say you're running a privacy-focused workload, like a machine learning model on the edge whose data you don't want to send to a server; you want to run it locally and in a safe environment. The WebAssembly security sandbox keeps your workload isolated, and because the WebAssembly module itself is pretty small, the workload is a lot more efficient. And of course, you can distribute it across a whole range of device types without having to keep each device's specific architecture in mind. These are some of the main reasons why you should use Wasm with Akri. So what can we do with it? Let's take a look at an example.
So here is my standard cluster, pretty much the same architecture diagram that I showed you before. Inside our worker node, the Akri discovery handler will detect the edge devices; here I've taken a camera as an example of a detected edge device. Now, as I mentioned, instead of running standard containers, we are running our WebAssembly workload on these nodes. Once the Akri discovery handler detects the device, the Akri agent runs the workload, and in this case it is a WebAssembly workload. The shim that you see is the WebAssembly shim: similar to how you have standard containerd shims, we have a Wasm shim running under the kubelet, and the kubelet is what runs on the node itself. So this is basically an example of what our architecture looks like. Now I'll play my co-speaker's video so he can give you more details about our demo setup, where we run machine learning workloads on edge devices with the help of Akri and WebAssembly. [Video plays; parts of the audio are inaudible.] ...what we set up for this talk is machine learning orchestration on the edge: these are the different optimizations applied directly to the WebAssembly binary itself. As you saw, our use case is running the NeRF model. NeRF (Neural Radiance Fields) is a very popular machine learning model that reconstructs a 3D-rendered scene from a set of 2D images. What we'll quickly show you now is a demo of how that is set up.
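Before the demo, to give a flavor of how a Wasm shim like the one in the architecture above is wired into Kubernetes (a sketch: the handler name `wasmtime` and the image reference are assumptions based on runwasi-style setups, not taken from the talk), you register a RuntimeClass for the shim and point your pods at it:

```yaml
# Sketch: assumes a containerd Wasm shim (e.g. from the runwasi project)
# is installed on the node and registered under the handler "wasmtime".
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasm
handler: wasmtime
---
apiVersion: v1
kind: Pod
metadata:
  name: wasm-workload
spec:
  runtimeClassName: wasm   # route this pod through the Wasm shim
  containers:
    - name: inference
      image: example.registry.local/wasm-inference:latest  # hypothetical image
```

The scheduler and kubelet treat this like any other pod; only at runtime does the shim instantiate a Wasm module instead of launching a regular container.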
So in this case, we are using a K3s cluster on Azure, setting up an Akri discovery handler to detect a device, and then running our machine learning model inference on top of it, after making certain optimizations to the WebAssembly module itself. I'll quickly walk through that demonstration. [Portions of the demo audio are inaudible.] First we add the edge nodes as agents, and then we can see all the agents and the Akri controller running. You can also see the toolchain we use to compile our processing code into a Wasm binary. To quickly showcase the end result: our discovery handler, whose source code you can see here, was able to discover the device; we simulate the device and run the actual machine learning code against it. And this is the 3D-rendered example of what we were able to generate with the NeRF model, running on an edge device. For us, the simulated edge device was a mobile phone: we ran this on an Android phone, which was discovered by the Akri discovery handler and used from across our Kubernetes cluster, that is, our K3s cluster. So that's a quick demonstration of the main steps, but of course we'd love to connect with you if you're interested in learning more about the demo. As for best practices when it comes to deploying Akri: because you are able to discover these devices, make sure your configurations allow for dynamic resource allocation so that you can detect devices individually, and have a clear discovery strategy for them. With that, I'll conclude my talk. Thank you so much.
And of course, I'll be more than happy to connect with all of you. You can connect with me on Twitter, and I'll be around here if you want to have more conversations about how we approached this entire solution. Thank you.