All right, thank you all for coming. My name is Aryu Ishimoto. I work for a company called Midokura, a subsidiary of Sony Semiconductor Solutions, and today I'm going to talk about modernizing the development of vision-sensing applications on IoT devices with WebAssembly.

A little history about us. We were founded in 2010; I think I was about the fifth person to join. Good old days. We developed MidoNet, an open-source software-defined networking solution that provides network virtualization for various cloud platforms, including OpenStack. In 2016 we made a small pivot: we started applying SDN for security purposes in industrial IoT, and that's where we got our first taste of IoT. In 2019 we were acquired by Sony Semiconductor Solutions, which was planning to build a service for smart cameras, and our responsibility was to provide the virtualization technologies for that platform. And I'm happy to announce that this year, 2023, we released Atreus, the Edge AI platform. I'll explain the details a little later, and you can also come to our booth, where the Atreus folks are. Today I'm going to focus on the WebAssembly part and how it's used in Atreus.

OK, so first a brief explanation of Atreus and the Atreus Vision app SDK, which is where WebAssembly is used. Vision AI is a very challenging field, and here are some of the reasons why, and how Atreus tries to address them. One, we've found that most developers building AI applications in these IoT settings are not very comfortable with their knowledge of AI. So we have to make that easy for them, and we provide pre-trained AI models that are easier to develop against. The models themselves have to be tiny, because we want them to run on resource-constrained IoT devices, but if you shrink the models, you lose accuracy. So there's a game of finding the right balance, and that's hard; Atreus provides the tools and services to optimize that. Tuning: when you deploy these cameras everywhere, every environment is different — the hardware is different, the lighting conditions are different, and things change. So you have to be able to detect potential model drift, retrain, and redeploy, and Atreus has those features as well. So these are the challenges we face, and Atreus is trying to solve them.

One of the most important components of Atreus, of course, is Sony's image sensors. The one shown here is the IMX500, an image sensor that is special because it has an AI chip embedded. When an image comes in, the inference can be done on the chip itself, and metadata comes out, so you don't have to stream video to the cloud to apply AI. The AI models are optimized for the IMX500, so you get very fast, optimized processing with low latency and low power, because you don't have to wake up the CPU to do anything. And of course, you're applying AI at the place closest to where the data is generated, which preserves privacy. Atreus is not just a device, though; it's the entire end-to-end cloud-to-device solution.
So at the top here you have cloud services: you take IMX500 devices and connect them to the cloud, where you can do device management, application management, and ML model management. So you get a basic IoT service as well; Atreus includes all of that. Today I'm going to talk about the red box over there, the development environment. A platform doesn't really mean anything unless somebody builds something interesting and useful on it, and for that to happen you need developers, you need applications, you need models. So today I'm going to focus on the SDKs we provide, which make it easier for developers to come up with something interesting and to customize solutions for their needs using the Atreus APIs. As I mentioned, there's a cloud API, which I'm not going to go into today, but with that SDK you can build your own cloud application. The device SDK has a lot of components; I'm going to skip the AI parts. The edge application development piece is where WebAssembly is used.

OK, so that's Atreus. Now I'm going to talk about Wasm — Wasm in IoT. I think most of you know about WebAssembly by now; it's a big trend. But most likely you've heard about WebAssembly in either cloud services or browser settings. It's rare to use it in IoT; I think we might be one of the first. And there are a lot of challenges. Here I'm talking about the immediate IoT challenges — there are many, but there's one I should mention. IoT solutions will continue to evolve, because devices will get better, you'll be able to do more on them, and there will be more demand to do more on them. So you have to be able to customize, and you have to be able to update functionality easily, safely, quickly, and frequently. And you can't do that right now. Here are some of the reasons why.

First, embedded development is just hard compared to application development. Right now only low-level languages are supported, C/C++, and there aren't many reusable libraries to speak of. Sometimes you have to tailor your application to the device it runs on, which means it's not very portable. Also, even if you could run applications in some isolated, abstracted way, these devices often don't have memory protection units, so you can't isolate applications from each other. If you run multiple applications, they share the same physical memory and step on each other; it's not very safe. And even if you do manage to run multiple applications, any change to an IoT device right now pretty much requires a full OTA update, which is not desirable. You want to be able to replace the functionality that needs updating without impacting anything else. In the cloud world, people use Kubernetes or some other orchestration system for that, but nothing like it exists at the level of these IoT devices, the MCUs. We need that, and containers are way too big to run on these devices. By the way, when I say edge, I'm talking about far-edge, MCU-class devices, not the Jetsons and so on. Those are a very important part of edge computing use cases, so I'm not excluding them, but here I'm talking about MCUs.
So virtualization is required for isolation. What do we do? Here's a quick history. Physical machines — I think we all know that: great isolation, but not very green. Then software virtualization came along — VMware, Xen — good for isolation, but very slow. That performance problem was addressed with hardware virtualization, which was good, but you're still virtualizing an entire computer, and that's too much for a lot of developers. Then, as you know, came containers. Containers are a great tool for developers to quickly build something and run it, but as I mentioned, they're too big for IoT use cases, and they also require an MMU for isolation. Around 2016 is when we heard about WebAssembly for the first time. Why WebAssembly is good for this particular use case is that it's a safe language runtime — I'll go into what WebAssembly is in a moment — and it doesn't require an MMU, so it's very suitable for MCUs.

OK, so Wasm. Some of you already know this, but it's a binary instruction format designed as a portable compilation target for various programming languages. Why is this good for IoT? First of all, it's easy to call WebAssembly routines from various languages. One of our goals with Atreus is to build an ecosystem, and an ecosystem includes developers; we have to make it easy for them to develop these edge AI applications on embedded devices, so supporting different languages is very important. Small footprint — critical, because we're talking about IoT devices. Multiple language support — I'll come back to that. And memory safety, which I mentioned already.

On language support: C and C++, of course — a lot of people in the embedded world are familiar with those, so that's fine. But we want to be more inclusive and bring in developers used to higher-level languages. Rust, which is getting very popular these days. TypeScript. And Python — we don't have full support for it yet, but Python in particular is a very important language for us. I'll come back to why later, but mainly it's because the developers who want to run things on these devices tend to be AI developers, and they love Python, and there are a lot of useful libraries for them. WebAssembly is also backed by strong organizations: WebAssembly.org, which is in charge of the standardization activities, and the Bytecode Alliance, which actually implements those standards. We're a member of the Bytecode Alliance.

So WebAssembly looks good; the next step is choosing the runtime. We looked at a few. This comparison may be a little outdated — there's also WasmEdge, which I wanted to include, but this is an old slide. These are all good options, but we ultimately chose WAMR, the WebAssembly Micro Runtime, for several reasons. It has good support for different architectures, and, the second point, it supports interpreter, JIT, and AOT execution. AOT — ahead-of-time compilation — is very important for us for performance reasons. Without AOT, the application just goes through the interpreter, which is slow; with AOT, the module is compiled into the native binary format for the target architecture, so it's much faster. AOT gives you near-native speed.
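To make the runtime side concrete, here is a minimal sketch of how a host — something like our agent — can embed WAMR and run a module. The calls are from WAMR's wasm_export.h embedding API; exact signatures vary a bit between WAMR releases (for example, older versions of wasm_runtime_lookup_function take a third "signature" argument), so treat this as a sketch rather than copy-paste code.

```c
/* Minimal sketch: embed WAMR and call an exported function of a module.
 * Based on WAMR's wasm_export.h embedding API; signatures may differ
 * slightly between WAMR versions, and error handling is abbreviated. */
#include <stdio.h>
#include "wasm_export.h"

static char wamr_heap[128 * 1024];   /* memory pool WAMR allocates from */

int run_module(uint8_t *app_buf, uint32_t app_size)
{
    char error_buf[128];
    RuntimeInitArgs init_args = { 0 };

    init_args.mem_alloc_type = Alloc_With_Pool;
    init_args.mem_alloc_option.pool.heap_buf = wamr_heap;
    init_args.mem_alloc_option.pool.heap_size = sizeof(wamr_heap);
    if (!wasm_runtime_full_init(&init_args))
        return -1;

    /* app_buf holds the .wasm bytecode or the AOT-compiled image,
     * e.g. read from flash or downloaded by the agent. */
    wasm_module_t module =
        wasm_runtime_load(app_buf, app_size, error_buf, sizeof(error_buf));
    wasm_module_inst_t inst =
        wasm_runtime_instantiate(module, 8 * 1024 /* stack */,
                                 8 * 1024 /* heap */, error_buf,
                                 sizeof(error_buf));

    wasm_function_inst_t entry =
        wasm_runtime_lookup_function(inst, "run");
    wasm_exec_env_t env = wasm_runtime_create_exec_env(inst, 8 * 1024);

    if (!wasm_runtime_call_wasm(env, entry, 0, NULL))
        printf("call failed: %s\n", wasm_runtime_get_exception(inst));

    wasm_runtime_destroy_exec_env(env);
    wasm_runtime_deinstantiate(inst);
    wasm_runtime_unload(module);
    wasm_runtime_destroy();
    return 0;
}
```

The agent does essentially this for each deployed module instance, and when the image is AOT-compiled for the target architecture you get the near-native speed I just mentioned.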
Not quite native, but close. The operating system is also important. Linux is supported, of course, but for MCUs we need RTOS support, and we chose NuttX — admittedly a somewhat biased decision, because Sony is a major contributor to NuttX, and familiarity is a big thing. So with all of that combined, WAMR was the best choice for us.

OK, so now we have the runtime. But we're not just running applications on devices; we're running vision applications — vision AI, to be specific — so there's more to it. We have to provide an SDK that makes it easier to create these vision applications. I was actually at WasmCon a couple of months ago, where we did a three-hour workshop, and I want to show what we did there; I think it gives you a good idea of what our tools and SDK look like, so I'll reuse this slide. Devices: I already mentioned MCUs as the devices we target. We also support Raspberry Pi and Jetson; in fact, the Raspberry Pi has been a very good device for us for development and demos — it's very accessible. The crossover MCUs are also interesting for us, because they're more powerful than most MCUs out there, so you can do more, but they're still small and cheap enough to be good for IoT use cases.

On these devices we have an agent running. You can think of the agent like the kubelet in Kubernetes: it's responsible for lifecycle management of the applications that run on that device, it embeds the WebAssembly Micro Runtime for isolation of those applications, and it handles integration with IoT platforms, because ultimately the data coming out of the device has to go somewhere in the cloud so people can use it, view it, and do further analysis. For the cloud integration I've shown ThingsBoard there — an open-source IoT platform, which we just used for demo purposes — but it could be anything that supports MQTT as the communication channel, and the agent provides that communication.

OK, so that's what runs on the device. What about application development itself? First, we provide an SDK. What kinds of functions do you need for an AI vision-sensing application? Well, you need a sensor API — you have to get images and configure the sensors. You typically run AI-related functions — executing inference, loading a model — so we need an API for that. And in some cases one application needs to communicate with another application, so that communication channel has to be provided by the API as well. The developers shouldn't have to think about these things; they just call APIs that are, hopefully, abstracted at the right level. (I'll show a rough sketch of what calling such APIs could look like in a moment.) So that's the SDK. We also provide a CLI. You could do development by connecting a device to the cloud and iterating cloud, device, cloud, device, but that's very inefficient — that's one of the pieces of feedback we got; that's how it was before — so a lot of people wanted to do local development.
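Before getting to the CLI, here's that rough sketch of what an edge vision app written against these three API groups might look like. To be clear, the function and type names below are illustrative placeholders, not the actual Atreus SDK; they're only meant to show the shape of the sensor, inference, and output calls a developer would make.

```c
/* Hypothetical sketch of an edge vision app. These names are illustrative
 * placeholders, NOT the real Atreus SDK; they only show the shape of the
 * three API groups: sensor access, AI/inference, and sending output. */
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *data;            /* pixel buffer owned by the sensor layer */
    uint32_t width, height;
} frame_t;

/* Placeholder declarations standing in for the SDK surface. */
int sensor_configure(uint32_t width, uint32_t height);    /* sensor API */
int sensor_get_frame(frame_t *out);                       /* sensor API */
int model_load(const char *model_id);                     /* AI API     */
int model_infer(const frame_t *in, float *out, size_t n); /* AI API     */
int output_send(const void *payload, size_t len);         /* data out   */

int app_main(void)
{
    float detections[10];
    frame_t frame;

    if (sensor_configure(300, 300) != 0 || model_load("detector") != 0)
        return -1;

    for (;;) {
        if (sensor_get_frame(&frame) != 0)
            continue;                                 /* no frame yet   */
        if (model_infer(&frame, detections, 10) == 0)
            output_send(detections, sizeof(detections)); /* metadata only */
    }
}
```

The point is simply that the developer sees only calls at this level of abstraction; the native libraries and drivers underneath are hidden behind them.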
So, for local development, we have a CLI that provides all the functionality you need for the communication that would normally happen between the cloud and the device, in both directions. Most of the traffic from the cloud to the device is the control path, and the other direction is the data path, and all of it goes through an MQTT broker. The broker used with the CLI can be anything; I think we used Mosquitto.

OK, here's the device stack. The actual Atreus device stack is much more complex than this, but this is a simplified version that I think is easier to understand. At the top you have modules — these are the applications, and you can have multiple of them running. They run on top of the runtime, which is WAMR. And this is the agent I was just talking about, which orchestrates them. This is an important component: the WebAssembly System Interface, WASI. It provides interfaces like the sensor API, the neural network API, and so on — some of the APIs are standardized, some are our extensions — and it's through WASI that the applications actually invoke those functions. What carries the functions out are the native libraries and device drivers underneath. For the demo at WasmCon, we used SensCord for the sensor functions — which I'm sure you haven't heard of, because it's a sensor library we developed ourselves. Since it's a Raspberry Pi, we could have just used libcamera, and that would have been fine too; we used SensCord because the person who built the demo was familiar with it and it has some extra functions we needed. We also use OpenCV on the native side, because we have to do a lot of image manipulation: resize, crop, and so on. And wasi-nn for inference. These libraries are loaded when the agent starts.

OK, so here's the setup. Actually, the real setup was more complex than this, but we couldn't bring it here. I wanted to do a live demo today, but that requires an approval process from Sony, and unfortunately I couldn't get it in time. We do have a live demo at our booth, if you're curious. We were told we can't do face detection here, which is what we did at WasmCon, but we do have object detection on toy cars, which is more interesting. I did find a short video that shows how to do some development using the CLI, for a telemetry application. Not as exciting — all it does is emulate a typical IoT case where the application generates metadata and sends it to the cloud. As I said, the CLI is the thing that sends the orchestration commands and receives the data, and it also has the compiler built in, so the application you write gets compiled to the right format. So let me show the video. The beginning is a bit promotional, so I skipped that part, and I hope you can see it; if not, I'll do my best to explain what's going on. This gives you an idea of what it would look like to develop vision-sensing applications using our SDK and the CLI. What we do here is start the project, build it, compile it, deploy it, and see the data. Any good CLI should have a help command. And first, we're going to start the agent.
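While the agent comes up in the video, one quick note on how the stack I just described hangs together: a native library like SensCord or OpenCV gets exposed to the Wasm modules by registering native symbols with the runtime when the agent starts. Here's a rough sketch using WAMR's wasm_export.h API; the wrapper function and its signature string are illustrative, not our actual interface.

```c
/* Sketch: exposing a native sensor call to Wasm modules through WAMR.
 * The agent would register this at startup; the wrapper name and its
 * signature string are illustrative, not the actual Atreus interface. */
#include <stdint.h>
#include "wasm_export.h"

/* Native implementation, backed by the sensor stack on the device
 * (SensCord in our demo, but it could be libcamera on a Raspberry Pi).
 * It fills the module-provided buffer with one frame. */
static int32_t
sensor_get_frame_wrapper(wasm_exec_env_t exec_env, uint8_t *buf,
                         uint32_t buf_len)
{
    (void)exec_env;
    /* ... call into the native sensor library and copy a frame into buf ... */
    (void)buf; (void)buf_len;
    return 0;
}

static NativeSymbol sensor_natives[] = {
    /* "(*~)i": a pointer plus its length in, an i32 result out */
    { "sensor_get_frame", sensor_get_frame_wrapper, "(*~)i", NULL },
};

void register_sensor_natives(void)
{
    wasm_runtime_register_natives(
        "env", sensor_natives,
        sizeof(sensor_natives) / sizeof(sensor_natives[0]));
}
```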
OK, so here — the agent has connected through the MQTT broker. This is VS Code, by the way, and we can see the log coming out here: this is the agent connecting over MQTT to the broker container running on the PC. Always run the help command first. OK, deployment status: right now we haven't deployed anything, and the agent periodically reports its deployment status. I don't know if you can see it, but there's an instances field with empty brackets — nothing is deployed, it's empty.

Now it's time to build. The application we use here is source-sink; it's actually two applications, two separate Wasm modules. Source just generates hard-coded text data, and sink sends it out. It's as simple as it gets — basically hello world — but you get the idea. We build it, and now you should see the Wasm files — yeah, there they are: the sink and source Wasm modules have been generated. Now we deploy them to the device. There you go — now we get the status, and where before there were empty brackets, you now see two modules running: sink here and source there, two modules running on the device. And they're connected, by the way: source generates the data and hands it to sink, and sink sends it out of the device.

The last step is to make a small change to the code, redeploy, and see what happens; this gives you an idea of what developers go through. First you make sure the application is running fine. And here — OK, let me stop at the right place — here is a command called telemetry, which lets the user see the output of the application, the telemetry data. You can see "my key, my value" being sent from source-sink. Exciting. Now we change the code so it sends something else — hello world, why not. This works because it's a video, but it really does work: you rebuild, you deploy, and then you see it. We're also going to empty the deployment out first — this is optional, you don't have to do it, but it makes the change easier to see. Emptying it means we stop everything that's running, then redeploy, and there's hello world. So this is about as simple as an application gets, but it does have two applications running that communicate with each other, which is the interesting part to show. You get the idea. Obviously we have much more interesting applications running — object detection is one we show live — so again, if you're interested, please come by; we only have about an hour after this, I guess.

OK, so that's what we have right now. Now I'm going to talk about what we plan to build from here, and these aren't just concepts; we have some implementations done, and I'm going to show that. First, this idea of a sensing pipeline. If you think about the object detection application, it can be broken up into components, and you can think of each one as an individual, separate WebAssembly module. First, you have a source module that takes the frame as input and resizes it to 300 by 300, which is the input size for the model. Then the next module takes that as input and invokes wasi-nn to do the inference.
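As a rough illustration of that inference step: WAMR ships a wasi-nn library that modules can call through a wasi_nn.h header, and the call sequence follows the wasi-nn proposal — load, init_execution_context, set_input, compute, get_output. The struct layouts and enum values below have shifted between wasi-nn revisions, so treat this as a sketch rather than exact code; the output size here is just an assumption for an SSD-style detector.

```c
/* Sketch of the wasi-nn call sequence from inside a Wasm module:
 * load -> init_execution_context -> set_input -> compute -> get_output.
 * Type and enum names follow the wasi-nn proposal as shipped with WAMR,
 * but they have changed between revisions, so this is illustrative. */
#include <stdint.h>
#include "wasi_nn.h"

#define INPUT_W 300
#define INPUT_H 300
#define MAX_OUT (100 * 7)   /* assumption: SSD-style detection output */

int run_inference(uint8_t *model_buf, uint32_t model_len,
                  float *input_rgb, float *out)
{
    graph g;
    graph_execution_context ctx;

    /* 1. Load the model bytes (e.g. a TensorFlow Lite flatbuffer). */
    graph_builder builder = { .buf = model_buf, .size = model_len };
    graph_builder_array builders = { .buf = &builder, .size = 1 };
    if (load(&builders, tensorflowlite, cpu, &g) != success)
        return -1;

    /* 2. Create an execution context for this graph. */
    if (init_execution_context(g, &ctx) != success)
        return -1;

    /* 3. Hand over the 1 x 300 x 300 x 3 float input tensor. */
    uint32_t dims[] = { 1, INPUT_H, INPUT_W, 3 };
    tensor_dimensions td = { .buf = dims, .size = 4 };
    tensor input = { .dimensions = &td, .type = fp32,
                     .data = (uint8_t *)input_rgb };
    if (set_input(ctx, 0, &input) != success)
        return -1;

    /* 4. Run the model and read the output tensor back. */
    if (compute(ctx) != success)
        return -1;

    uint32_t out_len = MAX_OUT * sizeof(float);
    return get_output(ctx, 0, (uint8_t *)out, &out_len) == success ? 0 : -1;
}
```

The nice part is that the module only sees the wasi-nn interface; which backend actually executes the model is the platform's concern.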
Out of that you get the output tensor. You send the output tensor to the next module, which does the object detection post-processing: it extracts the bounding boxes of the detected objects. Then you send that to the next module, which does the drawing — overlaying the bounding boxes on top of the original image. And then you send that to the last one, which we call sink. You could send this anywhere, to the cloud for example, but because we're generating images here, we send it to SensCord, which happens to have RTSP server capability, so you can stream it. Once the images land there, you can go to your browser and actually see the images with the bounding boxes. We showed this at the workshop, and you can see it upstairs at the booth today.

So what's cool about this? With a sensing pipeline, you construct a complex application by putting together a bunch of simple ones, and those individual small tasks can be developed by anyone — any team, different teams. What's good about that is the application becomes composable and reusable. So far I've been talking about the SDK targeting developers who implement these modules, but you can also target developers who have no domain expertise in computer vision, AI, or embedded systems development. If they can reuse existing components, all they have to do is create a pipeline, connect the pieces together, and deploy it.

Another important point is that developers shouldn't have to worry about what their application will look like when it's actually deployed — which devices the modules run on, or how those devices are connected — because every edge environment is different. There may be different devices, multiple devices, and they change; infrastructure gets upgraded. The idea is to decouple the implementation of the applications from their actual manifestation at the edge site, and to make it the platform's responsibility to know the capabilities of these devices and how they're connected, and to place the modules smartly in the right spots. And another good thing about WebAssembly here is its portability. Say you have three Wasm modules linked together: the platform, knowing which devices it's deploying to, compiles them into the proper AOT format, and it might deploy them split up like this, because it determined that this device here isn't capable of running all three — so it puts the first two there and the third one over here, which could be a network edge compute resource, something like that. Once you get to that point, developers can use something like low-code/no-code tools to just wire things together and deploy. That's our vision, our goal, eventually.

OK, I mentioned why Python is so important for us. This is a little data we got from Stack Overflow. Python is very popular in general, and we want to attract more developers; and, as I said before, there are a lot of AI developers who favor Python, so if you want to include them, Python has to be supported. The frameworks data is interesting too.
Two of the top three frameworks are Python libraries typically used by AI engineers, so you can tell — it's a very telling number about how frameworks are being used right now. Hence Python support. So we've been looking at this. You could try to compile Python itself to Wasm; there is some support there, but it's very limited, so we quickly moved away from that. Some people are still working on it, but we can't wait for it. Another thing you can do is take CPython — the reference implementation of Python, written in C — compile that into Wasm, and then run your application on top of it. This works; we did it, and everything ran — except the size of the AOT module was 20 megabytes. Way too big for IoT. Another way is to transpile the Python code into C/C++ and then convert that to Wasm, but that only supports the standard library, and it doesn't have enough support for the useful libraries we want, things like NumPy.

So we took the middle approach and did a little more. First, you take the user code and transpile it into C code using Cython. You could just run that with the CPython-on-Wasm approach — compile CPython to Wasm and run it — and that works. But we also need those libraries; they're the reason we want Python support to begin with, NumPy and so on. So we went to those repositories, generated objects, and statically linked everything together as a single Wasm module. That's all we did, and it works. But it means every Wasm instance has the entire thing linked in as a whole, so we're going to optimize that. At the very least, these libraries are now available to Python code that can be compiled into Wasm and run. This is the tool we have, our Python-to-Wasm converter, and these are the libraries we currently support. But like I said, it's too big for IoT, so it's more of an R&D thing.

However, we did start looking into MicroPython, which is a subset of Python and its standard library targeting microcontrollers. Perfect — look how small it is. It also includes a NumPy-like library, so you may not even need to statically link NumPy if you use it. We tried the exact same thing we did with CPython, and it turned out the AOT module was only 1.5 megabytes. That's very encouraging, and we haven't really gone through optimization yet, so maybe we can do even better.

OK, I'd better hurry up. So this is the final vision, putting everything together. We have a sensing pipeline, and we did this in the demo. What's good about the sensing pipeline, as I mentioned, is that each of these components may not be developed by the same person or the same team. Here, we actually rewrote this one component in Python — we chose that one just because it was the easiest to do and we didn't have much time. So we have a mix of C++ and Python code, linked together, potentially provided by different teams, different engineers. And again, with the sensing pipeline, the platform does the compilation to the proper format and the optimal deployment onto the devices. This is a mix of languages, which is one of the major benefits of WebAssembly.
So now C++ developers and Python developers can each contribute what they want to share, and we can put the pieces together and run them in various environments. OK, I went through that really fast. So yes, we're going to continue trying to revolutionize embedded systems development, and I think WebAssembly is the right tool for that. We're in the Bytecode Alliance, and we're going to keep contributing — for example, we want to propose some standard interfaces for sensors, and we're contributors to WAMR as well.

One more thing we want to do, and the reason why: the Atreus Vision SDK is mostly open source — well, parts of it are — and we want to continue to open source at least the device side, so it's accessible and people can take it, develop against their own devices, and connect to any service they want. Right now, though, if you want to use Atreus, you do need Atreus devices connected to the Atreus service. The data your application generates can be sent anywhere else — to your own application, for example — but you do have to enroll and provision your devices with Atreus. And I mentioned the WASI extensions; that's what we're working on right now.

I think I made it in time. Thank you, everyone. We have about two minutes, I think, or a few more, if you have questions. Any questions? Also, I'll be at the booth for another hour, so if you have questions, just come by and we can talk. All right, thank you.