Hey, how are you doing? I'm Xinhuan. I work for Intel. Today, I will talk about the cool features of WebAssembly Micro Runtime (WAMR) for IoT and embedded devices. In this talk, we will cover why WebAssembly is great for IoT, the WAMR project history and status, and the features of the project, and then we will go into more detail on the interpreters, the ahead-of-time compiler and loader, execution in place, the source debugger, and the application framework, and finally give a quick start.

WebAssembly was created to improve web application performance, and nowadays it is viewed as an important technology for a much larger scope. Wasm has several great features which are extremely useful in the IoT space. First, Wasm is a compilation target for multiple front-end languages, which brings more developers into IoT programming. The isolation of the Wasm sandbox and the small footprint are especially useful for embedded devices. High performance, portability, and a standardized embedding API make it possible to create a unified application framework over heterogeneous devices. Finally, the SIMD and multi-threading features work well for heavy computing scenarios such as AI and image processing. With these features, Wasm can help IoT systems offload computing among cloud, edge, and nodes in a flexible way. Application development for embedded devices becomes independent of the underlying firmware, so innovation can be greatly accelerated. The sandbox can even enable third-party applications to run on your devices. Wasm is also good for automation applications because of its good timing determinism.

Let me give a quick introduction to the project background. WAMR was open-sourced by Intel in May 2019. It was originally created for some internal usage, including cloud and embedded, so it was designed for small footprint, high performance, and great adaptability. WAMR was transferred to the Bytecode Alliance in November 2019 as one of its initiating projects.
And in 2021, the open governance model and the Technical Steering Committee were established. WAMR has an active community and broad adoption by commercial products and open-source projects already. The usage covers smart contracts, IoT, service mesh, trusted computing, mini apps, etc. To have all the content displayed, I will turn off the camera from this page on.

Now, the features of WAMR. WAMR has full support for the WebAssembly MVP spec and also provides aggressive support for post-MVP features. WAMR supports interpretation, including fast and classic modes, ahead-of-time (AOT) compilation, and just-in-time (JIT) compilation. WAMR is a C-based implementation, and the compilation is based on LLVM. WAMR has a very small binary size and memory consumption: the VM core is only about 60 KB for AOT mode and 90 KB for interpreter mode, and to run a simple hello world it only needs about 3 KB of memory. It has near-native performance with AOT and JIT, and the fast interpreter provides very good performance as well. The ahead-of-time module loader enables AOT in multiple environments, and execution in place allows the compiled module to be loaded directly from flash. It has an IoT-oriented Wasm application framework, and the API also supports remote application management. It supports libc-WASI, libc-builtin, multi-thread, multi-module, SIMD, and APIs for embedding, source debugging, and application management. WAMR already provides good support for different CPU architectures such as x86, ARM, MIPS, ARC, XTENSA, and RISC-V. Regarding operating systems, it has been ported to Linux, Windows, macOS, Android, Zephyr, and several RTOSes such as AliOS-Things, VxWorks, and RT-Thread.

WAMR has three building blocks: the VM core, the application manager, and the application framework. These three blocks are separated and self-contained, so if you want, you can build your own application framework and management on top of the VM core.
Inside the VM core, we have several components. At the bottom is a shared library used by the VM core, including the memory allocator, some utility library APIs, and the platform porting layer. Then there is the native library layer, including the native-lib invoker, native-lib manager, and thread manager, the libc support with builtin mode and WASI mode, and the library support for pthread. On top of that we have two engines: the interpreter engine and the JIT/AOT engine. On the right side is the application framework. We already support a few APIs, such as the timer API, inter-application communication in pub-sub mode and request-response mode, the sensor API, and an API based on LittlevGL, which provides a 2D graphics interface. The application manager provides a communication service to remote peers, so you can install, uninstall, and query the existing apps through the application manager.

Now, the interpreters. WAMR supports a fast interpreter and a classic interpreter. The classic interpreter is stack-based: it has a smaller average bytecode size, but since it is based on stack operations, it carries some execution overhead in exchange for smaller memory usage. The fast interpreter is register-based. It extends the bytecode in memory: while loading the Wasm module, it pre-compiles the Wasm opcodes into an extended bytecode. The extended bytecode is larger and consumes more memory, but it provides much faster execution. Here is a measurement on the CoreMark workload: the score reported by CoreMark for the fast interpreter is almost three times that of the classic interpreter, but memory-wise, the fast interpreter consumes more than the classic one.

The fast interpreter does a pre-compilation from Wasm bytecode to the extended bytecode. Here is the procedure. On the left side, we can see a few WebAssembly opcodes.
Basically, we just load the local variables onto the stack, and the add opcode pops the operands from the stack and pushes the result back; then we add another constant one, and set the result into local variable zero. We compile this into two extended bytecodes. We use a different way to store the operands, called slots, and there are three slot categories: one for constants, one for local variables, and one for dynamic operands. After calculating the slot for each operand, in the new extended bytecode the slot IDs directly follow the opcode: the first two are the source operands, and the third is the slot that stores the result. After the conversion, six original Wasm opcodes become two extended bytecodes.

Next, the ahead-of-time compilation and the loader. WAMR provides the AOT compiler wamrc, so you can use wamrc to compile a Wasm module into an AOT module, which already contains native instructions. The WAMR module loader enables the AOT module to run in various target environments such as Linux, Windows, trusted execution environments such as Intel SGX, and even on MCUs (microcontroller units). With ahead-of-time compilation, we actually have multiple paths for deploying the AOT module. First, you can just distribute the Wasm bytecode and do the compilation on the target; if you are running on Linux or Windows, you can do that. Or you can use cloud-based distribution: you run wamrc in the cloud and compile the module there before distributing it to any specific target. Or you can do the compilation directly in your development environment and distribute the AOT module together with the software package.

Now, execution in place for the AOT module, or XIP for short. XIP supports executing an AOT-compiled module directly from flash, which reduces memory usage by avoiding loading the module into RAM.
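The slot-based pre-compilation walked through a moment ago can be illustrated with a small C sketch. This is a toy model with made-up slot IDs and instruction layout, not WAMR's actual data structures: the six stack opcodes collapse into two register-style instructions whose operand and result slots directly follow the opcode.

```c
/* Toy illustration of the fast interpreter's pre-compilation: stack-based
 * "local.get 0; local.get 1; i32.add; i32.const 1; i32.add; local.set 0"
 * becomes two register-style ADDs over a flat slot file. */
#include <assert.h>

enum { SLOT_LOCAL_0, SLOT_LOCAL_1, SLOT_CONST_1, SLOT_TMP_0 };

typedef struct {
    int dst, src1, src2;   /* slot ids that follow the opcode */
} ext_add;

/* Fold the six stack opcodes into two extended ADD instructions. */
static int precompile(ext_add out[2]) {
    /* first add: tmp0 = local0 + local1 */
    out[0] = (ext_add){ .dst = SLOT_TMP_0, .src1 = SLOT_LOCAL_0, .src2 = SLOT_LOCAL_1 };
    /* second add: local0 = tmp0 + const1 (the local.set is folded away) */
    out[1] = (ext_add){ .dst = SLOT_LOCAL_0, .src1 = SLOT_TMP_0, .src2 = SLOT_CONST_1 };
    return 2;   /* six stack opcodes became two register-style ones */
}

/* Execute the extended bytecode directly over the slot file:
 * no push/pop, each instruction names its operands. */
static int run(const ext_add *code, int n, int slots[4]) {
    for (int i = 0; i < n; i++)
        slots[code[i].dst] = slots[code[i].src1] + slots[code[i].src2];
    return slots[SLOT_LOCAL_0];
}
```

With local0 = 2 and local1 = 3, running the two extended instructions yields 2 + 3 + 1 = 6 in local0, matching what the original stack sequence would compute, but without any stack traffic.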
To use the XIP feature, you use the parameters provided by wamrc when compiling the AOT module: you enable indirect mode and disable LLVM intrinsics. To support XIP, there is some special design in WAMR. The major goal is to avoid patching the module for calls into host functions, because normally, when you load an AOT module, you need to patch the call sites. So in the AOT module, we designed a map from the function index to the symbol name, and calls from the module to functions hosted in the runtime environment go through the index. In the runtime, we have a map from the symbol name to the function address. When we load a module, we build a function table in memory that maps each function index to the function pointer.

We also provide recommendations for the development workflow. Basically, if you want to build a product with a good IDE, you can base your own IDE on either Eclipse or VS Code with your own plug-in, plus the toolchains, including the SDK, header files, libraries, the wasm toolchain, and the WAMR AOT compiler; sometimes you may want to pack all of this into a Docker image. In the development environment, you can have a simulator, so when you finish coding, you can directly load the built wasm into the simulator. The simulator has all the APIs supported, so even though those APIs are implemented totally differently from the target environment, the simulation works without problems. When you finish development, building, and simulating, you can load the wasm binary into the target environment. In the target environment, the API library is supported, and it has a runtime and a loader.

Here is a demo showing source debugging of WebAssembly applications. We already have VS Code opened here, and the source code is already written in VS Code. Let's start with the source code.
Go to the extension view and the configuration dialog to set the parameters. Here we can see we have a Docker image installed locally; all the build toolchains are installed in the Docker image. Click build to start the building process. When it is finished, we can see the generated wasm file and AOT file here. Then go to the extension view and start debugging. We can see the breakpoint is hit. Continue to another breakpoint. We can also see the variables and their values in the debug view. That's it.

The WAMR VM core provides APIs for building a customized application framework. These APIs mainly help the native world call the wasm functions, and help the wasm world call the native APIs. In the diagram on the right side, these two interfaces provide the intercommunication between the two worlds. WAMR already provides an asynchronous application programming model, which is very useful for IoT devices. In this model, it can support multiple applications intercommunicating through queues and messaging. Every wasm application has its own sandbox and its own thread; each has a run loop and posts messages to the others. Every application has the system-defined callbacks onInit and onDestroy, which are executed by the application framework. It also supports remote application management: you can install, uninstall, and query the applications from remote.

Here are the sensor API and a sample. Let's look at the sample first. In the function onInit, we define a sensor object. The sensor object is initiated with a call to sensor open, passing the name of the sensor, a parameter, and the sensor event handler. We then configure the sampling frequency of the sensor and other settings. When a sensor event arrives, the sensor event handler callback is triggered, and you can get all the sensor data from the parameters of this function. When this application is uninstalled, onDestroy is executed by the framework. On the right side is the internal architecture.
Basically, we have two worlds working together. A number of functions are called when the board is initialized: the sensor framework initializes the active physical sensors one by one, which creates some sensor nodes in memory. When an application in wasm tries to open a sensor, it eventually calls the native sensor-open API, which creates a client node linked to that sensor. Then a data-reading thread repeatedly samples the sensor and posts the event to each client attached to the sensor. The event is turned into a call to the on-sensor-event wasm function inside the library, and the library finds the caller and invokes the event callback.

Let's look at another sample here, which demonstrates inter-app communication in the pub/sub model. It can even support wasm applications communicating with remote devices. On one side is the source code for the subscriber: it simply calls the subscribe-event API in onInit. The first parameter is the topic of the event, alert overheat; the second parameter is the callback for handling the event. On the right side is the source code for the publisher application: it just creates a repeating timer in onInit. Every second, the timer update function is triggered, and in that function, it posts the event on alert overheat by calling the publish-event API.

Some IoT devices may have a touchscreen. You can program wasm apps to provide a graphical user interface. WAMR provides two samples with LittlevGL, which is a 2D graphical user interface library. On the right side, the upper one is the wasm app running on Linux with SDL (Simple DirectMedia Layer); the one below is the same wasm binary running on an STM32 board with a physical touch LCD. The two samples demonstrate that you can either build the LittlevGL 2D library into the host software or build it into the wasm bytecode as part of the application.

Now, let's have a quick start with WAMR. There are a few steps.
The first one is to set up the build environment; you can follow the commands below to get all the build tools available. Then download the WAMR repository from GitHub. Then build the mini product of the runtime and run a wasm module on it. To build the mini product iwasm, you can follow the commands here; it will generate the binary file iwasm. Then follow the commands here to build a wasm module from source code, and use iwasm to load the generated wasm module and run it. In the picture on the right side, we can see the output from the wasm module: the hello world is printed, along with output from the WebAssembly Micro Runtime.

To use the AOT compiler, you first need to download and build LLVM, then build the WAMR AOT compiler, wamrc. When wamrc is available, you can use it to compile the wasm module into an AOT module. Then you can just use the iwasm generated on the previous page to run the AOT module.

If you are interested in trying WAMR, here we show some basic usage: how to load a wasm binary and call into a wasm function from your host software. On the right side, basically you first initialize the wasm runtime by calling wasm_runtime_init. Then you load the wasm binary into memory. Then you probably register some native APIs, so those APIs can be called by the wasm applications. Then you load the buffer into a module, and you call wasm_runtime_instantiate to create an instance of this module. Now you are able to call a function inside the wasm application. First you need to find the handle to this function by its symbol name, which is fib here. Then you create an execution environment, defining some stack size. Then you call wasm_runtime_call_wasm to invoke the target function. The parameters are the execution environment and the function handle.
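The sequence just described can be sketched in C-style pseudocode. It is based on the WAMR embedding API declared in wasm_export.h, but the exact signatures vary between WAMR versions, and buf, buf_size, stack_size, and heap_size are assumed to be set up by the host already; treat it as an outline, not a compilable program.

```c
/* Pseudocode sketch; check wasm_export.h in your WAMR version for
 * the exact signatures and full error handling. */
char error_buf[128];
uint32_t argv[1] = { 10 };                 /* argument for fib(10) */

wasm_runtime_init();                       /* 1. initialize the runtime  */
/* 2. optionally register native APIs callable from wasm here            */
wasm_module_t module =                     /* 3. load the wasm binary    */
    wasm_runtime_load(buf, buf_size, error_buf, sizeof(error_buf));
wasm_module_inst_t inst =                  /* 4. instantiate the module  */
    wasm_runtime_instantiate(module, stack_size, heap_size,
                             error_buf, sizeof(error_buf));
wasm_function_inst_t func =                /* 5. look up by symbol name  */
    wasm_runtime_lookup_function(inst, "fib");
wasm_exec_env_t exec_env =                 /* 6. create an exec env      */
    wasm_runtime_create_exec_env(inst, stack_size);
if (!wasm_runtime_call_wasm(exec_env, func, 1, argv))   /* 7. call it    */
    printf("call failed: %s\n", wasm_runtime_get_exception(inst));
/* on success, the i32 result is written back into argv[0] */
```

Note that wasm_runtime_call_wasm reuses the argv array for the return value, which is why a successful call leaves the result in argv[0].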
You also pass the arguments for the wasm function, which are given in an array. If the function fails, you can get the failure information by calling wasm_runtime_get_exception.

In the other direction, we may need to call native functions from the wasm module. That is very easy in WAMR: in the runtime, you just register a number of native functions by calling wasm_runtime_register_natives. Each element in the registration array contains the function symbol name, the native function address, and the function signature. WAMR has extended the signature notation a bit to make the process much simpler. On the right side: the letter star ('*') in a signature means the parameter is a pointer in the wasm module, so the runtime will do the address-space conversion. The letter tilde ('~'), which must follow a star, means the length of the buffer pointed to, so the runtime can do the boundary check. The letter dollar ('$') means the parameter is a string in the wasm world.

There are more useful reference links listed here, such as building source code into a wasm binary, embedding WAMR, the WAMR header files, building WAMR, exporting native APIs to wasm, and the basic samples.

That is all for the topic of WebAssembly Micro Runtime. I would recommend you to download WAMR and try it out. If you have any problems, you can post an issue on GitHub; I'm sure your issue will be answered very quickly. Thank you. Bye.