Hello, everyone, and welcome to my talk. Today, I'd like to talk about a Rust-based, secure, and lightweight container runtime for embedded systems. My name is Manabu Sugimoto. I'm a system software engineer at the Sony R&D Center. My research interests are container virtualization, unikernels, and the Linux kernel.

Here is the outline of my presentation. First, I'd like to start by talking about Linux container virtualization and container runtime software. Second, I want to share how we manage containers on embedded systems and the problems of existing container runtimes. Then, let me introduce our Rust-based, secure, and lightweight container runtime optimized for embedded systems. After that, I will show the evaluation of our runtime. Finally, I will talk about the future work on our runtime and the conclusion of this presentation.

So let's start with the introduction. Linux containers provide isolation and containment for applications, and this mechanism can prevent attacks from untrusted applications. Recently, containers have been used increasingly in embedded systems, because the technique is attractive for resource-constrained systems due to its light weight. These containers are created by software called a container runtime. What is a container runtime? A container runtime is the software that spawns and runs containers and is responsible for the mechanics of running them. Container runtimes such as runc are sometimes referred to as low-level runtimes. These runtimes are compatible with the OCI runtime specification so that they can receive requests from higher layers such as containerd. OCI stands for Open Container Initiative. As you already know, containers are implemented using Linux namespaces, cgroups, capabilities, and so on, and the low-level container runtime is responsible for setting up these Linux features for containers.
This slide shows the software stack around the container runtime. Let's suppose users create containers through Kubernetes. First, Kubernetes communicates with a high-level container runtime such as containerd or CRI-O through the CRI. CRI stands for Container Runtime Interface. Second, the high-level runtime pulls images from registries, manages them, and hands them over to a low-level runtime through the OCI runtime specification. Lastly, a low-level runtime such as runc creates and launches the containers using the Linux features. So today's topic is here: the low-level container runtime.

Now, I'd like to talk about the motivation of our study. First, the requirements of embedded systems. The main difference between general-purpose systems and embedded systems is that embedded systems have more restrictions than server systems. In resource-constrained systems, the memory size is small, the storage capacity is low, and the CPU is not fast. In mission-critical systems, real-time applications run critical functionality, and the systems have longer life cycles than servers. So we need to manage containers while meeting these requirements.

How do we run a container runtime on such embedded systems? It is difficult to use Kubernetes or Docker on embedded systems, because that software introduces performance overhead and high resource usage to manage containers. For example, Kubernetes and Docker include a high-level runtime that manages container images. In mission-critical systems, we cannot ignore this overhead, because response times on these systems must be short for real-time applications. In addition, write operations shorten the lifespan of the eMMC, which stands for embedded MultiMediaCard, a standard specification for embedded memory. Please look at the figure on the left. A daemon service such as Docker constantly writes metadata files and mounts on the eMMC of embedded systems.
We want to avoid write operations as much as possible to extend the lifespan of the eMMC. To solve these problems, we run a low-level container runtime alone on the systems. In the figure on the right, the container runtime runs without any daemon services. In this way, we can manage containers efficiently on the systems. However, existing container runtimes still have some problems in terms of security and lightweightness when we run them on embedded systems, because the runtimes are not optimized for such systems.

First, in terms of security, Linux capabilities are not a fine-grained access control. When a user wants to use the ping command, the user needs to have the CAP_NET_RAW capability. However, CAP_NET_RAW also allows the user to run ARP spoofing attacks. Second, the rootless container based on user namespaces is very restrictive for these systems. The user namespace allows containers that are unprivileged outside the namespace to have root privileges inside it, while at the same time limiting the scope of that privilege to the namespace. However, the rootless container cannot emulate all system calls, because containers in a user namespace cannot manipulate global resources in the system. For embedded systems, this is a problem, because some embedded applications need to access devices via system calls such as mount.

Second, in terms of lightweightness, container setup time is not fast enough for real-time systems. Most container runtimes, such as runc, are written in the Go language. Go is very good, but unfortunately the application binary size is large for embedded systems, and garbage collection by the Go runtime increases CPU utilization. So we have to solve these problems for embedded systems.

Okay, now I'll talk about our proposal to solve these problems. Here, I propose our runtime, a Rust-based, secure, and lightweight container runtime for embedded systems.
Our runtime is implemented fully in Rust with modern crates and is an OCI-compatible, minimal container runtime for embedded systems. It is roughly divided into security mechanisms and lightweight mechanisms. The security features are isolation by containers for high dependability, fine-grained access control, and memory safety by Rust. The lightweight mechanisms are low memory usage and a small binary size, which come from the benefits of Rust, plus fast startup, which is our original feature, and real-time support for embedded systems. I'll explain these mechanisms in detail in later slides.

Here, you can see a comparison table of our runtime and the existing container runtimes. Our runtime is more lightweight and secure than the existing runtimes. The binary size of our runtime is 2.63 megabytes, which is much smaller than runc. The container runtime railcar, developed by Oracle, is also implemented in Rust like our runtime. However, its development stopped in 2018 and the repository has already been archived. The Rust features and the crates used in railcar have become outdated, because Rust is a fast-growing language. Compared to railcar, our runtime is an up-to-date Rust-based container runtime.

Why did we choose Rust over Go or C/C++ to develop the container runtime? The answer is that Rust is a great fit for embedded systems. First, its performance is equivalent to C/C++. Second, Rust guarantees memory safety without garbage collection. Third, the Rust community has awesome crates for developing a container runtime; I'll introduce these crates on the next slide. Lastly, Rust FFI, which stands for foreign function interface, is very helpful for binding Linux APIs. Go, adopted by many existing container runtimes, is also good, but it has some limitations on embedded systems. Go has problems interacting with namespaces because of the Go runtime's threading model.
In addition, the application binary size is large compared to Rust, and Go-based runtimes include performance overhead from garbage collection.

Here, I'd like to share the crates we use for developing the container runtime. Rust already has useful crates for creating containers, such as crates for capabilities, cgroups, seccomp, and so on. The libseccomp crate is used for the fine-grained access control of our runtime, and the core_affinity crate is used for its real-time support. As you already know, we can develop the software easily using crates; serde_json maps the OCI JSON format into Rust data structures.

This slide shows our runtime's architecture overview. In the next few slides, I'll walk through the lightweight mechanisms and the security mechanisms.

First, the lightweight mechanisms. I'll explain the fast startup feature. Fast startup launches a container speedily by leveraging a pre-created container. By using fast startup, our runtime can omit the time for initializing the runtime and creating the container. Fast startup replaces only the execution process inside the container at startup, so the runtime can reuse the same configuration except for the execution process. This feature is useful because some containers use exactly the same configuration, such as the namespaces, capabilities, and so on, differing only in the execution process inside the container.

Please look at the figure on the left side. Some containers have been created, but they are not running yet. When you need to run a real-time application, our runtime replaces the dummy process with the real-time application that you want to run. Please look at the figure on the right. A normal run has phases for initializing the runtime, creating a container, and starting the container. With fast startup, you can eliminate the time for initializing the runtime and creating the container, so it is possible to start the container faster than a normal run.
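For reference, the config.json that serde_json deserializes into Rust data structures looks roughly like this. This is a heavily abbreviated generic OCI example of mine, not our actual configuration:

```json
{
  "ociVersion": "1.0.2",
  "process": {
    "args": ["/bin/sh"],
    "cwd": "/"
  },
  "root": { "path": "rootfs" },
  "linux": {
    "namespaces": [
      { "type": "pid" },
      { "type": "mount" }
    ]
  }
}
```

With fast startup, only the process section differs between the dummy process and the real application; everything else is reused from the pre-created container.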
In addition, our runtime has a real-time support feature that lets it set the CPU affinity at fast startup. By using this feature, users can set the CPU affinity depending on their CPU load at startup.

Here, I'd like to talk about the control flow of fast startup and real-time support. Let's suppose a user creates a container with our runtime. First, the user prepares the container configuration file, config.json, in advance. The configuration describes a dummy process as the execution process inside the container. After that, the user issues the create operation, and our runtime initializes and sets up a container based on the configuration. At the end of this phase, the container creation is complete and all settings for the container initialization are done. At this point, the container goes into the created status, and our runtime monitors a file descriptor. Then, when the user wants to run the container using fast startup, the user creates a fast-startup JSON file that describes the real-time process to be executed inside the container and the CPU core that should execute it. The user then runs the fast startup operation, and our runtime writes the contents of the fast-startup configuration. At this moment, since the waiting container can now read from the file descriptor, our runtime causes the container to resume execution. So our runtime sets up the process, and the container runs the execution process.

Now, let me move on to the security mechanisms of our runtime. I will explain the fine-grained access control. Our runtime's fine-grained access control enables the rootless container to execute system calls safely.
By using this fine-grained access control feature, the rootless container can emulate even system calls that change global resources, because the fine-grained access control can emulate the system call in user space on behalf of the container, based on security policies set in advance by the administrator.

Please look at the figure. We have rootless containers A and B, and a fine-grained access control server with security policies that allow container A to mount tmpfs and prohibit container B from mounting tmpfs. If container A mounts tmpfs, the server catches the mount system call before it executes, checks whether the destination of the mount is tmpfs, and performs the mount on behalf of container A. Thanks to this mechanism, rootless containers can issue the mount system call safely. If container B tries to mount tmpfs, the server denies the mount because of the security policy.

This fine-grained access control mechanism is achieved by using the new seccomp notify feature. Here, I will explain the seccomp notify feature, which was introduced in Linux 5.0. Seccomp notify provides a way to handle a particular system call in user space. In this example, we have a container and a seccomp agent which handles system calls on behalf of the container. First, the container issues a system call. Second, seccomp catches the system call, executes the BPF program, and the BPF program returns the notify action. After that, seccomp notifies the seccomp agent that the container wants to run the system call, and the agent decides whether the container may perform it. To make that decision, the seccomp agent reads the system call arguments and validates the system call. If it is okay, the agent performs the system call on behalf of the process; otherwise, it rejects the system call. When the agent successfully executes the mount system call, the agent sets the return value and returns it to the container.
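For reference, the seccomp section of an OCI config.json that requests this user-space handling looks roughly like this (the socket path is illustrative; the field names follow the OCI runtime specification):

```json
{
  "linux": {
    "seccomp": {
      "defaultAction": "SCMP_ACT_ALLOW",
      "listenerPath": "/run/seccomp-agent.socket",
      "syscalls": [
        { "names": ["mount"], "action": "SCMP_ACT_NOTIFY" }
      ]
    }
  }
}
```

Every system call runs normally except mount, which is forwarded over the notify file descriptor to the agent listening on the Unix domain socket.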
This slide shows how our runtime implements fine-grained access control using seccomp notify. First, a system administrator launches the fine-grained access control server as root before starting a container. Second, the user runs the container using a config.json that describes the seccomp notify settings. A few months ago, the OCI runtime specification added seccomp notify support so that it can be used in containers. In the configuration file, the user specifies the path of the Unix domain socket which is used by the seccomp notify action. The user then creates the rootless container with this configuration. After our runtime receives the create request, the runtime initializes the container and creates the seccomp notify file descriptor. Then the runtime passes the descriptor to the server.

I will show a demo of the fine-grained access control. Our config.json describes that the execution process is a shell, along with the user namespace settings and the seccomp notify configuration. In this demo, I restrict the mount system call. Please look at the top right. First, I run the fine-grained access control server, which allows the container to mount only when the destination is the foo directory. Second, I run the rootless container with our runtime. The user ID is root inside the container, but on the host machine the user is not root. Now, when I try to mount on the bar directory, the mount fails because that destination is not allowed by the server; you can see the error message on the server side. However, when I mount on the foo directory, the mount succeeds, and we can confirm that the foo directory is mounted correctly inside the rootless container.

Now, let me move on to the evaluation of our runtime. The evaluation goals are measuring two types of start time, the normal run and fast startup, as well as the memory consumption of the container runtimes. The existing runtimes that we use in this evaluation are runsc, Singularity, runc, crun, and railcar.
In the experimental setup, all the runtimes use the same config.json. We removed the cgroups configuration because our runtime does not support it yet. Then we ran the container runtimes alone, without any client tools. The containers execute the true command inside.

The results for start time show that our runtime is the fastest among the existing container runtimes, because the runtime is minimal. Please look at the graph on the left side. The normal run of our runtime achieves a 7.4 times speedup compared to runc. railcar is also a Rust-based container runtime, but our runtime is much faster than railcar. Please look at the graph on the right side. The fast startup time is 5.1 milliseconds, and fast startup achieves a 1.5 times speedup compared to the normal run.

This slide shows the memory usage of the container runtimes. The results show that our runtime uses 3.84 megabytes, which is smaller than the Go-based container runtimes. The important point here is that our runtime's memory usage is equivalent to crun, which is written in C. C is the most preferred language for embedded systems, but Rust is also a great fit for such systems.

Let me move on to the last section, the summary. Here, I will talk about future work. First of all, we need to make our runtime fully compliant with the OCI runtime specification. Currently, our runtime does not support some features such as cgroups, OCI hooks, and so on, because it is a research prototype. Second, we need to enable Kubernetes to use our runtime. Lastly, we plan to integrate our runtime into Kata Containers, because the Kata community has already been developing their container runtime in Rust.

Conclusion. First, the Rust language is a great fit for embedded systems due to its small memory footprint and binary size. In addition, Rust guarantees memory safety without garbage collection overhead. So we developed a Rust-based container runtime for embedded systems.
Our runtime has the fast startup mechanism and the fine-grained access control for embedded systems. The evaluation shows that the runtime launches a container 7.4 times faster than runc, and that the runtime's memory usage is equivalent to the C-based runtime. This is all for my presentation. Thank you for listening.