Hello everybody, my name is Jim Huang. I work on performance optimization of industrial automation systems on deeply embedded devices. And here is Louis, my collaborator. In this session, we are going to discuss the construction of a secure microkernel for deeply embedded devices, also known as IoT. You know, IoT is nothing but a buzzword. We introduce a new open source implementation named F9 microkernel. It is envisioned for low-power devices and designed with security in mind.
An American expert says security is not a product, but a process. Do you remember OpenSSL? It happened three years ago. It was a nightmare. So let's check out various known firmware attacks related to IoT security. The first one is from 2013, when researchers at Georgia Tech presented Mactans, which injects malware into iOS devices through a modified USB charger. In another case, in the following year, researchers at SR Labs modified the firmware of USB devices to do something evil. First, it can emulate a keyboard and issue commands on behalf of the logged-in user. Second, it can spoof a network card and change the computer's DNS configuration to redirect the traffic.
You might say, come on, I'm not that stupid. But according to another piece of research, users usually pick up and plug in a USB flash drive after they find one. After the USB drive is connected, the evil firmware is activated. This is the reason why you should always keep security considerations in mind.
Last year, ARM announced the availability of ARMv8-M, which provides the TrustZone extension. It enables bus-level protection in hardware. ARMv8-M cores can filter DMA accesses initiated by unprivileged code at the bus level. ARMv7-M requires a software API for that. The MPU stands for memory protection unit, and MPU banking in ARMv8-M reduces the design complexity of a secure operating system. The secure OS partition owns a private MPU with full control, and the OS keeps privileged mode for fast IRQs.
So why do we need another kernel? The reason is the consideration of the trusted computing base. If the TCB gets smaller and smaller, we can focus on detailed verification of security and on resisting further attacks. As you see from the slide, bigger and bigger kernels always suffer from weakness and uncertainty. Linux is the best example to illustrate the critical risk that comes from software quality. The microkernel approach, as you already know, tries to eliminate the problems arising from big kernels by strictly following the principle of least privilege, which implies that nothing is allowed by default.
A capability is a communicable, unforgeable token of authority. It refers to a value that references an object along with an associated set of access rights. A user program on a capability-based operating system must use a capability to access an object. In other words, a microkernel should be minimal, with hardware-enforced separation.
It is interesting that the design competition between interface-first and implementation-first keeps alternating. So you can see, first is Multics. Multics itself is interface-first. Then come Unix, the microkernels, and Linux, and the last one might be seL4, which is interface-first. But the competition is still going.
The mechanisms inside the microkernel are the only trusted components, including address spaces, IPC, and thread management. All other services are done in user space. There are three generations of microkernel development. The second generation is L4 and the third one is seL4. You can forget the first one, which had poor performance. So let's start from the second generation. The second generation of microkernels is known for L4. L4 is not exactly one implementation, but a family of implementations. It is known to be fast because of its smaller cache footprint, which implies much faster IPC. seL4 is the most advanced one, licensed under the GNU GPL.
L4 is widely commercialized by Qualcomm and Open Kernel Labs. The typical use case of L4 is to offer isolation from software in other cells, so that existing software components can be reused in a new design. Furthermore, the complexity of dispatching multiple operating system workloads across multiple physical CPUs can be reduced by means of a microkernel-based hypervisor. The migration from the second generation to the third generation of microkernels is meant to resolve improper memory abstraction, which results in the possibility of denial-of-service attacks. seL4 is a notable implementation of the third-generation microkernel.
F9 is the new microkernel designed for IoT, or deeply embedded devices. The design considerations for IoT are, first, to support the dynamic nature of IoT devices. The second is to prevent the alteration of data while it traverses the network. The third is advanced power management. Another requirement is an over-the-air firmware update mechanism. The switch to a new version is only performed when the newly downloaded content is fully ready.
So what does F9 provide? F9 itself is freely distributed under the 2-clause BSD license, which allows commercial use. It is optimized for ARM Cortex-M cores by means of fast IPC and a well-structured design. Security is an important consideration as well. In addition, F9 provides a flexible instrumentation mechanism inspired by Linux Kprobes in order to perform profile-directed optimization, PDO or PGO. F9 microkernel attempts to eliminate the risk of existing vulnerable systems through a small TCB, on-the-fly patches, and solid system isolation.
The next generation of F9 microkernel is named BSEC and is implemented as a third-generation microkernel. The architecture looks like the diagram, where F9 microkernel acts as a hypervisor to isolate the two domains. One is the untrusted domain, where all normal applications run.
And the other is the trusted domain, where everything is fully validated. F9 follows the fundamental principle of L4 microkernels, that is, to implement only address spaces, thread management, and IPC in the privileged kernel. And it is built almost from scratch to make use of the features of the Cortex-M series, including the NVIC, the Nested Vectored Interrupt Controller with nested interrupt support, bit-banding, and the MPU, the memory protection unit.
Memory management is split into three concepts. First, a memory pool represents and manages an area of the physical address space with specific attributes. Second, a flexpage describes an always size-aligned region of an address space. Unlike other L4 implementations, a flexpage in F9 represents an MPU region instead. And the third is the address space, which is made out of these flexpages. Interrupts in F9 can be handled in either a kernel thread or a user-level thread.
To improve power efficiency, F9 introduces a tickless kernel, which results in better power consumption than the common approach driven by a system ticker. The idea of the tickless kernel is that the CPU is only woken up by events instead of by a periodic timer. So let's compare. Think of the context switch overhead. This diagram illustrates the overhead of a context switch, and tickless scheduling can avoid the timekeeping overhead. How does it do that? In the F9 implementation, the kernel enters tickless mode right before going into the CPU idle state and sets the timer interval to the next event. So you can see, if your timer is periodic, your interrupt handler wakes up periodically. It consumes more energy. But if your kernel is only driven by external events, such as network or touch screen events, it can gain a larger reduction in energy.
Also, I would like to introduce the KProbes implementation in F9. KProbes is a dynamic instrumentation mechanism inspired by the Linux kernel. It allows developers to gather additional information about kernel operation without recompiling or rebooting the kernel. KProbes in F9 can be used for OTA updates or for remote debugging purposes.
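To put rough numbers on the tickless idea described earlier in this part, here is a minimal Python sketch. It is not F9's timer code: the function names, the 1 ms tick, and the event times are all assumptions made for illustration. It only counts how often the CPU would leave the idle state under a periodic tick versus an event-driven timer, and shows how the next timer interval could be chosen right before idling.

```python
def periodic_wakeups(duration_ms, tick_ms):
    """Wake-ups caused by a periodic tick: one per tick, even with no work pending."""
    return duration_ms // tick_ms


def tickless_wakeups(event_times_ms, duration_ms):
    """Wake-ups in a tickless model: one per distinct event inside the window."""
    return len({t for t in event_times_ms if 0 <= t < duration_ms})


def next_timer_interval(now_ms, pending_deadlines_ms):
    """Interval to program right before idling: time until the earliest
    future deadline, or None when nothing is pending (sleep until an
    external interrupt arrives)."""
    future = [d - now_ms for d in pending_deadlines_ms if d > now_ms]
    return min(future) if future else None


if __name__ == "__main__":
    events = [120, 480, 900]  # hypothetical event times in milliseconds
    print("periodic:", periodic_wakeups(1000, 1))
    print("tickless:", tickless_wakeups(events, 1000))
    print("next interval at t=100:", next_timer_interval(100, events))
```

With these assumed inputs the periodic model wakes the CPU a thousand times in one second while the event-driven model wakes it three times, which is the energy argument made in the talk. Real savings depend on the hardware sleep states and on how long each wake-up lasts.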
At the moment, F9 provides partial POSIX support and some system profiling, so you can monitor the system resources and figure out the locations you want to optimize. F9 microkernel is commercialized by the Genesee company, which is based in the USA. Genesee uses F9 to provide a smart solution for IoT called Redis K1. The communication between device and server is based on the WebSocket protocol, and the data exchange is encrypted. So you can see the device is based on Cortex-M, which is very efficient in both power and security design.
Bistec is the next generation of F9 microkernel. We learned from uVisor, which is a subproject of the ARM mbed operating system. We implement F9 microkernel in Bistec as a lightweight alternative. Bistec makes use of the memory protection of Cortex-M, including the advanced features in ARMv8-M I mentioned before. It is heavily inspired by seL4, moving other mechanisms, along with the security enhancements and facilities, outside the kernel. Its target is low-power IoT devices, and it is very lightweight. You can see it is only 2,000 lines of code. Bistec uses capabilities, or keys, to manage resources, each with an object reference and a set of rights. Bistec provides only three system calls, which are send, receive, and yield. That is almost the same as what seL4 does. Next, we are going to present the memory protection mechanism.
Hi, I'm Louis, and I'm going to give the MPU 101 here. F9 and Bistec are intended to run on ARMv7-M, and on ARMv7-M there isn't any MMU. So you don't have a page table to use. But there is the MPU, the memory protection unit, and we can use this unit to provide basic memory management for a Cortex-M application or kernel. On Cortex-M4, there are 8 regions you can set. And on M7, you have 16 regions you can set. For every MPU region, you can set some attributes, like the TEX and cacheability attributes, or the access permissions for unprivileged (user) or privileged code.
And the most important thing is that you can set the execute never bit. The execute never bit means one thing: if you enable this bit, the region will not be treated as executable memory. So you can prevent some attacks, like a buffer overflow attack.
Okay, so on Cortex-M the memory map is flat. For example, Cortex-M4 has a 4-gigabyte address space, like in the diagram. Now consider that some code is running, and the PC register is at this address. If an attacker puts malicious code at address 0x20004000 and uses any way to make the PC go to this address, then the CPU will try to fetch the next instruction at this address, right? If this is done, the attack succeeds.
But today, if we set an MPU region to not executable, like the yellow part of the memory, it is different. We can set the base of the MPU region to this address. This place could be, for example, a peripheral for serial or other things. We can set the size to 2 to the power of 12. There is a restriction here: we can only use a power-of-2 size for the protected region. So in this case the range is from 0x20004000 to 0x20005000. If the attacker tries to fetch the instruction at 0x20004008 as the next instruction, this time the MPU will catch this invalid memory access and generate a memory management fault exception. So this time the attacker's code is not executed, but is captured by the memory management fault handler. And so we prevent the attack in this situation.
We can use OpenOCD and GDB to simulate this problem. We start OpenOCD, connect GDB to its port with the target command, use the monitor command to enable semihosting, then reset and load the image again. We can predefine some memory addresses for the MPU. Then we set the attributes in the MPU registers by using the mww command that OpenOCD provides, memory write word, on the register addresses.
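The region arithmetic in this example can be checked with a small Python model. This is only a sketch and not code from F9 or from the demo: the `MpuRegion` class and `instruction_fetch` function are hypothetical names, and the model ignores details of the real MPU such as region priority, subregions, and the default memory map. It only encodes the rules mentioned in the talk (power-of-2 size, execute never) plus the ARMv7-M rules that a region is at least 32 bytes and its base is aligned to its size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MpuRegion:
    base: int            # start address of the region
    size_log2: int       # the region covers 2**size_log2 bytes
    execute_never: bool  # models the XN bit

    def __post_init__(self):
        if self.size_log2 < 5:
            raise ValueError("region must be at least 32 bytes")
        if self.base % self.size != 0:
            raise ValueError("base must be aligned to the region size")

    @property
    def size(self):
        return 1 << self.size_log2

    @property
    def limit(self):
        # First address past the end of the region (exclusive bound).
        return self.base + self.size

    def contains(self, addr):
        return self.base <= addr < self.limit


def instruction_fetch(regions, pc):
    """Return what happens when the CPU fetches an instruction at pc."""
    for region in regions:
        if region.contains(pc) and region.execute_never:
            return "MemManage fault"
    return "fetch ok"


if __name__ == "__main__":
    protected = MpuRegion(base=0x20004000, size_log2=12, execute_never=True)
    print(hex(protected.base), hex(protected.limit))
    print(instruction_fetch([protected], 0x20004008))
```

With the values from the talk, the region spans 0x20004000 up to 0x20005000, so a fetch at 0x20004008 lands inside it and is reported as a fault in this model, while a fetch outside the region is allowed.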
And we can redefine the PC by using mww to simulate the problem I mentioned before. Then we can see the result in GDB: it was caught by the memory management fault exception handler.
I'm not familiar with the Mac, so I need some time. Here is the case study for RTOS integration. Here RTOS stands for an implementation like FreeRTOS, or other implementations, for example the ARM mbed operating system. In the case study we will show how the B-Stack hypervisor works. The integration is not an RTOS API emulation layer or a simulator. Instead it runs actual RTOS code derived from the ARM port, including the scheduler. The RTOS system layer implements some mechanisms. For example, the first thing is the allocation and deletion of operating system objects like tasks, queues, and heaps. It also has to provide mutexes with priority inheritance, operations with timeouts, and time slicing with preemption.
As you can see from the diagram, there are two contexts in the RTOS and B-Stack integration. The first one you have to consider is the task context, and the other is the interrupt context. So think about how the interrupt context works. First, B-Stack has to model the thread execution context used to run the RTOS. It also has to be tweaked to run the interrupt handler code, such as ISRs. And finally it provides virtual interrupts. The virtual interrupts themselves are messages, modeled in the form of supervisor calls.
And there are secure gateways, introduced in ARMv8-M. If your hardware is based on ARMv8-M, B-Stack itself can directly use SG as the base of the hypervisor. If your hardware is not ARMv8-M but, for example, ARMv7-M, such as Cortex-M3 or M4, B-Stack provides emulation code for the secure gateways. The RTOS first sends B-Stack an IPC through SG in order to request a context switch or to enable or disable interrupts. So let's check the diagram again. You can think of the existing RTOS, like FreeRTOS, as already implemented.
And B-Stack acts as the hypervisor. It provides a virtual interrupt facility to deliver the virtual interrupts as well. And the message dispatch loop multiplexes the interrupt context.
So let's draw some conclusions. A minimized TCB is feasible for building a secure IoT operating system. An L4-based design brings a software isolation mechanism. It is known to work and is already used in some commercial products. Cortex-M processors provide real-time capability and memory protection facilities for developing hypervisors on low-cost platforms. And F9 microkernel already takes advantage of this to build an efficient hypervisor and RTOS infrastructure. So we would like to ask developers to pay some attention to F9 microkernel development and to share use cases, success stories, and feature requests with us to improve F9 microkernel. So do you have any questions?
What are the benefits of starting a new code base instead of taking seL4 as your starting point and tweaking it?
Okay, that is a good question. The question is why not just tweak seL4 for the IoT OS. seL4 itself consists of more than 9,000 lines of code. But on devices like Cortex-M3 and M4, you have limited RAM and flash storage. So it is almost impossible to use seL4 directly. But we can take advantage of ideas from seL4, such as the capability system. So you can deploy fine-grained protection for given resources with timing considerations, which is quite important for IoT devices.
Also, the seL4 code is generated from the executable specification. So is there a possibility to somehow change what is being generated, what components or maybe what parts of the kernel are being generated, to make that thing more? I'm really just asking.
Yeah, the question is interesting. Because as you say, most of the code is generated by the seL4 preprocessor and some formal models. But that is quite difficult for us to maintain. We tried to port F9 to RISC-V. It took about two weeks. But for seL4, I cannot imagine the feasibility. Sorry? Oh, yeah.
I couldn't say the... Because I have signed an NDA, I cannot say the exact model. But I have already worked on the simulation environment for the Fast Models-based system. So the working prototype already works, and I can show you offline. If you use ARMv8-M, it has a mechanism like secure gateways, which is quite efficient. You can eliminate the extra... I mean, you can eliminate additional exceptions without frequent context... I mean, the exception return overhead. If you lack the secure gateways, you have to implement the switch with just a normal exception handler.
Any more questions? Okay. It was my pleasure to present F9 microkernel and the upcoming BSEC. There are some interesting references. You can check the first one. It is essential material discussing the evolution from L3 to L4 and seL4 over 20 years. And the second one is the lecture. I think the tutorials are quite interesting and informative. And if you are interested in the tickless kernel, you can check the last one. We have another presentation, which you can find on SlideShare, discussing the tickless kernel, ktimer, and how Cortex-M works. Okay, thanks for coming. Thank you.