Okay, hello everyone. My name is Tony. I work at Midokura, which is a Sony group company, and in this session I will talk about secure virtualization for microcontroller units using WebAssembly.

First of all, I will talk about the vision of the company. We want an accessible platform that powers intelligent solutions for vision sensors. Our objective is to bring easy-to-use sensing devices to the market, lower the barrier for solution developers, promote agile development even for these devices, lower the operational cost (that is, you can manage and deploy new applications in an easy way), promote polyglot development (that is, you write the code for your embedded system in the language that you want), and have a marketplace connecting AI developers, who create the models and train them for a specific domain, with solution developers, who can upload pieces of code so that they can be reused in a much more complex scenario. The vertical applications we are focusing on right now are retail stores, smart factories, smart cities and smart home.

Okay, I will introduce and explain a little bit the problems of development on tiny IoT devices. Development is not agile. Right now we are using real-time operating systems, which in some way act like a library: the application is coupled with the RTOS and they have to be tested together. Even though you can separate them, in the end the way to test the application is by using the operating system. This coupling of the OS and the application often leads to waterfall development, where you are doing all the testing at the end of the development cycle. Embedded devices are typically designed for a specific task, so that is not a problem if your focus is to deploy only one application. But since our aim is that you can reuse your embedded device by deploying new applications, or by applying continuous deployment,
this can have some drawbacks.

Next, safety and isolation. Tiny IoT devices are based on MCUs, and most of them don't have a memory management unit, and therefore no virtual memory. If you want to apply continuous deployment there is a higher risk, and with the target I mentioned before of having a marketplace and taking code from other people, it can be too risky.

Applications are difficult to develop. There is a high barrier for application developers because the code is coupled to the OS, so you need to know Zephyr, NuttX or FreeRTOS, and applications are typically written in C, which is not a problem for us, but for AI developers it can be a high barrier. Code reuse is limited because you are coupled to the OS and its specific interfaces and drivers. As I introduced before, there is a technological gap between the tools that AI developers use and the ones being used in embedded systems, and we basically want to break this barrier.

How are we trying to solve that? By using WebAssembly, and here I will do a very brief introduction. What is WebAssembly? It is a low-level bytecode format that runs in a sandboxed environment. It is compatible with multiple languages; the slide says high-level languages, but right now it is basically C, C++ and Rust, though there are others, such as AssemblyScript, which is a variant of TypeScript, and currently we are also supporting Python (in the following slides I will explain how we are doing that). The platforms are anything you can imagine, like browsers, servers, and now even tiny IoT devices; the only constraint is that an interpreter is available for that platform. The size is very compact.
Execution is fast, and security is high by how WebAssembly is specified.

WebAssembly for sandboxing: which protections does WebAssembly give you? First, all that the language level provides; for instance, if you are using Rust you are going to get all of its benefits. But there is also runtime-level protection: WebAssembly interpreters are forced to check that memory accesses are within the bounds of your application, because of the linear memory. There are other security checks as well, for instance type safety at both compile time and runtime, and other protections such as control-flow integrity to prevent hijacking attacks.

As I mentioned at the beginning, we want applications written in multiple languages to run on multiple target architectures and multiple OSes. Basically, in this diagram, we support C, Rust and Python: we compile the code into a WebAssembly module, and then we have an interpreter compatible with the OS and the target architecture.

We also have other layers. Why do we need other layers? WebAssembly is able to run the application, but there are cases where you want this sandboxed environment to access the native side. For instance, if you want to do some neural network computation, you could compile the neural network inference engine to WebAssembly, but if you want to take advantage of an accelerator you will need to go native. The way to access the native part is by using the WebAssembly System Interface (WASI).
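The runtime-level bounds check on linear memory described above can be sketched in miniature (illustrative Python only, not WAMR's actual implementation): every load or store is validated against the current size of the module's linear memory, and an out-of-bounds access traps instead of ever reaching host memory.

```python
class Trap(Exception):
    """Raised when a memory access escapes the sandbox."""

class LinearMemory:
    PAGE_SIZE = 65536  # Wasm linear memory grows in 64 KiB pages

    def __init__(self, pages: int):
        self.data = bytearray(pages * self.PAGE_SIZE)

    def load(self, addr: int, n: int) -> bytes:
        # The interpreter must validate every access against the
        # memory bounds before touching the backing store.
        if addr < 0 or addr + n > len(self.data):
            raise Trap(f"out-of-bounds read at {addr}")
        return bytes(self.data[addr:addr + n])

    def store(self, addr: int, payload: bytes) -> None:
        if addr < 0 or addr + len(payload) > len(self.data):
            raise Trap(f"out-of-bounds write at {addr}")
        self.data[addr:addr + len(payload)] = payload

mem = LinearMemory(pages=1)
mem.store(0, b"\x2a")
assert mem.load(0, 1) == b"\x2a"
```

A one-page memory accepts accesses inside its 64 KiB but traps on anything past the end, which is what keeps a buggy or malicious module from reading the host's memory.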
It is a standardized API that provides a secure way to access system resources. WASI has some standardized layers for the file system, the network and, for TensorFlow or PyTorch, neural networks; in those cases it is wasi-nn.

There are several Wasm interpreters; the one we chose is WebAssembly Micro Runtime (WAMR), which is part of the Bytecode Alliance. This runtime can work as an interpreter, just-in-time or ahead-of-time. The supported OSes and architectures are quite varied, like x86, ARM and RISC-V, and several operating systems; it is even supported on Zephyr. Ahead-of-time compilation provides fast execution that, for some computationally expensive applications, is within 2x of native. WAMR also supports source-level debugging, which is very beneficial for us, because otherwise testing applications is quite difficult.

Okay, now I will introduce some challenges for a polyglot environment. I will start by presenting a survey from Stack Overflow. As you can see, the languages that compile directly to WebAssembly, which are C, C++ and Rust, are at around 25% popularity, but it is interesting to see that Python and JavaScript are at around 50%. So if we want to target all these people, we need a way to make sure that these languages can compile directly to WebAssembly, because otherwise you are losing a very extensive market. Which are the most popular frameworks?
Here we can see NumPy, pandas, TensorFlow, scikit-learn and PyTorch, among others. It is clear that if we want to target AI for vision-sensing applications, we need to make sure that these frameworks are compatible with WebAssembly. In our case we created a POC making sure that we could compile NumPy for WebAssembly, and later on I will show an example.

Okay, we want to empower flexibility and dynamic deployment. That is, we want to make sure that a complex solution can be broken down into smaller, manageable modules, so that you can reuse them and provide them in a marketplace, and we want a multi-language platform that leverages the strengths and features of the different languages. As I said before, if you are using Rust you will have the language-level protection, but if you want to do some AI, run a matrix multiplication or a neural network, use Python for it; don't write the matrix multiplication in C, which you could do, but maybe it's not the best way to do it.

So here I will introduce the motivating use case. We wanted to build a people-counting application, and what we previously had were people detection and license plate recognition. Some components were shared, for instance accessing the camera sensor, sending data to the cloud, or running the neural network. Here we have an example of a license plate that was detected: we were streaming the annotated image with the bounding box, and in the other image we have a ThingsBoard screenshot where we can see that the license plate was correctly recognized. This license plate is Japanese, so it is normal if you don't recognize the characters.

So what we wanted to do was people counting. We had already worked with FairMOT, which is a framework, a neural network, that does the tracking, and we had several questions.
Do we write all the logic that is applied after the inference in C, C++ or Rust, and then how do we ensure that the code is equivalent? And the other option we had: could we execute the Python directly from the repository?

So what we started to do was try to run Python in WebAssembly. We took CPython, which is the C interpreter of Python, and we saw that it already contains some configuration and helpers to cross-compile Python into WebAssembly. As you can see here, the resulting python binary is of type WebAssembly, and by using iwasm, the interpreter that WAMR provides, and passing some flags, basically to specify where the Python path is and to give permission to access the directories, we could run Python in WebAssembly directly.

But what we wanted was to run the post-processing of the neural network. So what we did was transpile the Python package of FairMOT into C using Cython, then call the Python module through the CPython C API, and then we made sure that the modules were frozen, because otherwise you have to set the Python path, since non-frozen modules are not embedded into the Python executable but instead are accessed at runtime.

Here we have a silly example with Fibonacci. What we did is basically apply the translation, and we called this function from the CPython glue code with an input of 10. As you can see here with strace, the only thing that it needs to access is the frozen main.wasm itself; the rest is embedded into the WebAssembly module.
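To make the Fibonacci example concrete, here is a sketch of the Python side (the module name and code are illustrative, not the demo's actual source). An iterative fib called with an input of 10 returns 55, and freezing relies on CPython's frozen-module importer, which serves bytecode embedded in the executable instead of reading it from the filesystem; that is why strace shows nothing being opened beyond the .wasm file itself.

```python
# fib.py -- the kind of module that would be transpiled to C with
# Cython and frozen into the WebAssembly binary (illustrative).
def fib(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# The C glue code would invoke this through the CPython C API:
assert fib(10) == 55

# Frozen modules are resolved by CPython's FrozenImporter from
# bytecode embedded in the executable, not from the filesystem:
from importlib.machinery import FrozenImporter
assert FrozenImporter.find_spec("_frozen_importlib") is not None
```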
So there is no need for the WebAssembly module to access anything outside of itself, which is what we wanted.

Now I will introduce a little bit how WebAssembly debugging works. As I said, we are using WebAssembly Micro Runtime, and it has a VS Code extension available that compiles, runs and debugs your code quite easily. So for instance, you can install this VS Code extension, and voilà: here we have a sample application that basically shows what it can support, or at least the most basic features, which are the variable view, the call stack and setting breakpoints. So it is as easy as running this WebAssembly extension.

Okay, now I will introduce Wedge, the product that tries to integrate and facilitate all the topics that I explained at the beginning. Wedge basically covers the entire IoT device lifecycle management, where we focus on application creation, but also device monitoring and the end-to-end management of them, providing the modules in the marketplace and making sure that the SDKs that you need are available and can be deployed. So here, on the left, we have the solution developer, who uses the SDK and then, through the Wedge cloud, which uses the IoT platform to talk with the devices, is able to deploy the WebAssembly modules onto the Wedge agent.
That is the piece of code that integrates with WAMR to execute the WebAssembly modules.

In this part, what the solution developer is doing is using what we call a vision sensing application (VSA), which is the combination of simple steps to create a more complex and meaningful task. As I said, we aim for composable and reusable Wasm modules. The developer is agnostic of the device it will run on; the only thing that is implicit is the interfaces that you need, which are imported from WASI. Then we aim for low-code/no-code: we want to promote using a UI where you basically place the nodes, the modules, of your application, instead of doing the coding.

In the second part we have the Wedge cloud, which provides on-the-fly optimization for a variety of targets. That means we compile the modules ahead of time, knowing the target architecture, and then we do the deployment and lifecycle management. From the cloud you can see whether the modules are running fine and, if not, the reason; maybe the WebAssembly module is trying to import a WASI layer that is not implemented, for instance. You will be able to check that, the device will still run fine, and you could deploy a new application fixing this error.

Wedge also provides SDKs: for instance the sensor ones (how to read an image and how to configure the sensor), and some AI/machine-learning ones, which are the basic ones (how to load the model, how to run inference); this part is not a Wedge SDK per se, but uses wasi-nn.
That is something that is a standard. Then there are some communication APIs, such as sending telemetry to the cloud or doing HTTP requests, and finally some node-to-node message passing, device to device. And there is also some data storage, for a local database or block storage. What we aim for is that, for each of the APIs that we see are required to have a meaningful task, we try to contribute to the community, so that in the end everything uses something that is a standard, as we are doing for the AI/ML with wasi-nn and, more concretely, in WAMR.

And what is the Wedge agent? Well, the Wedge agent is like a Kubernetes for IoT devices. It basically automates the lifecycle management of workloads: it loads the WebAssembly modules, makes sure that they are working as expected, and reports their status to the cloud. It leverages WebAssembly Micro Runtime, as I said before. The Wedge agent device stack is basically this: you have the hardware, the OS, and the native libraries and device drivers, where these native libraries are the WebAssembly System Interface implementation for each of the devices. Then we have the Wedge services APIs, which are not a standard yet, and the WASI layers, which are the ones that are a standard. Finally, on top of that, we have WebAssembly Micro Runtime, which manages the modules that you have deployed onto your device.

So, conclusions. The state of IoT devices feels like what is happening in the Android ecosystem. There is a huge variety of different devices, but in the end, what we aim for is that the application that you want to run is the same on all of them. You don't have to download a different WhatsApp: you go to the app store and you install it directly.
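The agent's job as described (instantiate deployed modules, reject one whose imports cannot be resolved while the device keeps running, and report status to the cloud) can be modeled as a toy sketch. This is illustrative Python only; the real Wedge agent is native code embedding WAMR, and all names here are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    name: str
    # Imports the runtime could not resolve, e.g. an unimplemented WASI layer.
    missing_imports: list = field(default_factory=list)

class Agent:
    """Toy lifecycle manager: deploy, supervise, report (illustrative)."""

    def __init__(self):
        self.running = {}
        self.reports = []  # (module name, status) pairs sent to the cloud

    def deploy(self, module: Module) -> None:
        if module.missing_imports:
            # Instantiation fails, but only for this module; the device
            # and the other modules keep running.
            self.reports.append(
                (module.name,
                 "error: unresolved imports: " + ", ".join(module.missing_imports)))
        else:
            self.running[module.name] = module
            self.reports.append((module.name, "running"))

agent = Agent()
agent.deploy(Module("people-counting"))
agent.deploy(Module("broken", missing_imports=["wasi_nn"]))
assert ("people-counting", "running") in agent.reports
assert "broken" not in agent.running
```

The point of the sketch is the failure report: a bad module shows up in the cloud with the reason (an unresolvable import), and a fixed application can then be redeployed.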
You don't have to specifically say, hey, I want WhatsApp for my Xiaomi, or I want WhatsApp for my Google Pixel. So we want to have that, and we are open-sourcing the Wedge agent. In case you have any question, you can contact us at info@midokura.com and we will be very happy to share information with you.

And then here we have a demo of the people-counting VSA. Okay. Basically, this people-counting VSA is running on a Raspberry Pi with the Coral accelerator for doing the inference, and the camera is a Raspberry Pi Camera v2. As you can see here, we are using ThingsBoard for doing the device registration, where you basically have to put the name of the device and the certificates, so that the device can connect to the IoT platform. As you can see, right now the device is still not reporting any attributes, and after connecting it we can see that there are no modules running there.

This is what we aim for: a UI that provides low-code/no-code, which in this case is using Node-RED, though we are improving this part because we see that Node-RED is limiting us. Basically, what you need to do is know the device ID, so that you can deploy to a particular device. And now what we are going to do is create the application. Here on the left you have some modules that can be reused. In this case, the VSA is the people-counting VSA, and it has one input and one output. The first node is the "extract" node, which is basically the node that gets the frame from the sensor; as I have said, it is common to all the VSAs that we have been developing. Then we have running the inference of the model.
We apply the FairMOT post-processing to add the tracks, then the tracking logic and the counting logic, and finally we draw the tracks on the image. The last node is "send image", so that we can see it in the video stream.

Here is the application running. We are adding some other connections, because if you want to synchronize the nodes you have to send this kind of information. Finally, when you have your application created, you can go and take the VSA from the marketplace, connect it with the deployment, connect it with the device, and finally do the deployment onto the device you have selected.

As you can see here on the left, it is the Coral running the inference, because otherwise the performance is quite bad. This is running on the Raspberry Pi. This line is basically for counting how many people are crossing from one area to the other, and the information that you see on the right is basically how many counts there are. As I have said, the only new part compared with the other applications was running the code of FairMOT, which was written in Python; all the rest was shared with the other applications that we had, like getting the frame, running inference, and sending the image to the cloud.

That's it for the video, and that's it for the presentation. Thank you very much. Any questions?

Q: What's the performance impact of the example you just showed, compared to just using the...?

A: Okay, there are some options to try to avoid doing some memory copies from native to
WebAssembly and then sharing them with the native side again. For instance, with reference types, what you could do is store the image always on the native side and only pass a reference, not a pointer (because you would be breaking the WebAssembly model by exposing native information), just an index, so that it can be reused. In this case we were doing the memory copy, because reference types are still not as solid as we would want. But yeah, ideally we shouldn't do memory copies, as you are suggesting. The problem here is that wasi-nn, which is the API for running the inference, right now expects you to pass the data, so we should work with the community and see how we could fix that.

Q: Yes, so I wanted to ask you about software updates. Those models, as you know, some of them are fairly large in size. Can you do updates separately, if you want to update the model or part of the application? How does it work today?
A: It is not shown here, but right now, in our case, the WebAssembly module contains the URL so that the model can be downloaded. What we are aiming for in the next iteration is that you can configure the module: you keep the code as it is but set the URL there dynamically. And of course, this is a problem not just for WebAssembly but for embedded systems in general: usually what you do is quantize the model and try to reduce the size as much as possible. But it's a general problem, not just in this case. For instance, to run it on the TPU you have to quantize it and compile it for that TPU, in this case the Coral. So yeah, we have restrictions all over the place. Thank you so much. In fact, sorry: the biggest bottleneck here, in this case, to get three frames per second, was running the inference; we are exploring how we can improve that, because we are tight on it.

Any other question? Is there any question from the live presentation?

Q: I have a question regarding debugging. Can you, for example, debug Wasm?

A: Yeah, well, basically as soon as you have the DWARF information inside the WebAssembly module, you can do the debugging. We have tested it for C/C++, and I think that in Rust it is also possible. But, for instance, in Python, what I think you could debug is the glue code, not the Python code. I mean, the glue code is the one that calls, through the CPython API, the Python package that is compiled. The Python package won't be easy to debug, but the glue code could be.

Q: I had one question. You mentioned the project being open source. So is the project open source right now?
A: Wedge is still not open source, and I think the final announcement will be at KubeCon North America, but we are still deciding how we will do that. For sure, though, our aim is to open-source it.

Q: And could you briefly describe the reason why you chose this specific WebAssembly runtime to be the one that you showcased, instead of Wasmtime or some other WebAssembly runtime? What's the reason for choosing WAMR?

A: Yeah, well, because the Wedge agent is written in C, so for us WAMR was easy to integrate with, and also because it supported NuttX, which was a platform we really wanted to target at that point in time. As I said, we are part of the Sony group, and it was important for some cameras of SSS, of Sony.

Q: And have you explored WasmEdge as well, as a potential runtime?

A: Yeah, our idea is that the Wedge agent is not coupled to only one runtime, so that we could switch from one to the other. We haven't tested it, but hopefully we could integrate with multiple ones.

Q: Okay. I don't need a long explanation, but how were the dependencies handled? For example, I don't know if some of the modules depended on OpenCV or some other libraries. Normally, Python just links to a C version of the library. How is it handled in Wasm?
A: If you're writing code in C that uses OpenCV, you have two options: either you embed OpenCV in the Wasm module, where you are basically losing some performance, or you provide OpenCV as a native library and expose it with a WASI layer. In this case, for OpenCV, there is no WASI layer for image processing right now, so what you could do is design a WASI layer and try to contribute it to the community so that it gets standardized, and finally, from your code, you import the headers of this WASI layer. In the case of Python it's a little bit different: we don't have a WASI layer for NumPy, for instance, so what we did is compile and freeze the NumPy module into the WebAssembly directly. As I said, we aim to target multiple languages, and each of them has its particularities. Is that clear now? Okay.

Q: So I have one more question. You mentioned that this is also applied to cameras, so I assume you may have use cases where, let's say, the AI model needs the full-frame image from the camera, but since you're running a camera, you may also need to do a crop or additional adjustments, scaling of the image, for the camera's purposes. How do you deal with that problem, where you actually need possibly a separate type of frame for ML versus an application using the camera as a camera on that device?

A: Yeah, there is a module that is called OpenCV here, and it's related to the previous question. What we did is implement a layer that is not standardized: we replaced native OpenCV, and we basically created the signatures and the functions for calling, for instance, the resize, because it's not the same from one network to another; it depends on the input tensor shape. But yeah, right now
it's not a standard, so the modules that we were creating for that cannot run on other interpreters right now. If in the future this is standardized, it could run on all the interpreters that implement the WASI layers. That's why we are trying to promote that all the steps we see we need from native are standardized: not just so that we benefit from it, but also the community in general. And, as I said in the previous question about why we are using WAMR, it is somehow related: if we want to switch from one interpreter to another, we have to really push hard to make all these layers standardized; otherwise, we will be quite coupled to one.

Thank you. Any other question? No more questions? Any question from the online viewers? No? Then thank you all. If any question we