All right, welcome back, everybody. I'd like to introduce our next session. This is Naoto Yamaguchi, but we call him Dr. Y, because he is a doctor of informatics. So we must bow to him. He's always first class. Take it away, Dr. Y.

Thank you. Today I'm talking about a case study of resource control in embedded Linux container integration. This is an AGL Instrument Cluster Expert Group activity.

First, let me introduce who I am. My name is Naoto Yamaguchi, from AISIN Corporation. Some quick information about AISIN Corporation: the head office is in Japan, in Kariya City, Aichi Prefecture, and the business is manufacturing and sales of automotive parts and energy- and lifestyle-related products. As for my career, I received a doctorate in informatics in 2007, I have been an automotive real-time OS platform software engineer since 2007, and a Linux platform software engineer since 2011. One more note: my company was Aisin AW, but Aisin AW was merged into Aisin Seiki, and the new company name is AISIN Corporation. My history in the open-source community: I joined AGL in 2013, and I've been involved with the AGL Instrument Cluster Expert Group since 2019. That's my introduction.

This slide shows today's presentation outline. First, I'll talk about the background and give an overview of our target systems. The second section presents the technical concept — the technical details of our Linux container integration. The third part is the case study of resource control, which is the main topic of this presentation. Finally, I will conclude.

So, the first part: the background. The AGL Instrument Cluster Expert Group launched in 2019. The motivation of this expert group is to create a base platform for the instrument cluster — not a platform based on the conventional IVI. There are different system requirements between IVI and cluster: for example, functional safety is required, there are boot-time requirements, and many other requirements. And this platform is built from open-source technology.
That's our big motivation. This means we need to be independent from existing proprietary software components, and we need to be able to choose either open-source software components or proprietary software components.

This slide shows our generic system design — the generic system design the Instrument Cluster Expert Group works from. The instrument cluster and the IVI are combined into one system, as shown here. And the instrument cluster is required to be functionally safe; that's a big point of difference from the IVI. The characteristics of these systems: they connect to one or more displays; to speakers; to a phone via Bluetooth, USB, and Wi-Fi; to USB devices; to onboard sensor devices over I2C, UART, and SPI connections; and to other in-vehicle ECUs via CAN, Ethernet, and so on. That's our generic system design.

This system has product development issues, and those issues are quality and robustness. The first point is functional safety. The instrument cluster is typically required to meet ASIL-B, because the instrument cluster has the telltale functions — the telltales show critical failure information to the driver. The next point is quality management: there are separate quality requirements for the instrument cluster and the IVI. The next slide shows more detail.

So, first, functional safety. The main function side covers most functions of our system — this part here. It requires advanced quality management, open innovation, cybersecurity, and so on. The other part is the safety function, which relates directly to vehicle safety — the telltale is a safety function. Which operating system should run the safety function side, and which communication method should be used? These are big issues for our system design. The first design point is that we use an isolation method between the main function and the safety function — using hardware separation, or using a hypervisor, software-based or not.
That choice is highly dependent on your system design. The Instrument Cluster Expert Group's approach targets the main function side first.

Now, the split between the IVI and the instrument cluster. There are many parts in an automotive system; here we have the IVI and the instrument cluster. The IVI side requires rapid innovation: new features are added, short-term development is required, and rapid bug fixes are needed. On the other hand, the instrument cluster requires advanced quality management: full branch-coverage testing of the software is required, formal verification is required, and careful bug fixes are needed. So these are two different software-stack characters. Another difference is that various functions are required on the IVI side — users install many applications, and applications are installed from a store; that's a typical requirement of an IVI software stack. On the other hand, the instrument cluster side needs only selected functions. This is the big puzzle between these two software stacks.

So we defined the QM (quality management) isolation method. In detail, our answer to this puzzle is one more isolation method: one more layer that isolates functions by using Linux container technology. Each piece of software is developed at its own quality management level — this part is developed under instrument-cluster-level quality management, and the IVI side under IVI-level quality management. If all the software were integrated and mixed into one big software stack, the requirements would cross-propagate throughout that big stack. Our approach is to isolate each software stack using Linux container technology. We call this concept QM isolation.

Next is the technical concept of our integration technology. First, let me share the basics of Linux containers. A Linux container is an operating-system-level virtualization method: you run multiple isolated Linux systems — containers — on a host using a single Linux kernel.
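Running one of these isolated Linux systems on the shared kernel can be sketched by hand with standard utilities. This is only an illustrative, privileged sketch — the root file system path is a placeholder, and a real deployment would use a container runtime such as LXC instead:

```shell
# Minimal hand-rolled "container": new mount, UTS, IPC, network and PID
# namespaces, then a chroot into a prepared guest root file system.
# /srv/guest-rootfs is a placeholder path; this must run as root.
sudo unshare --mount --uts --ipc --net --pid --fork \
    chroot /srv/guest-rootfs /bin/sh -c 'mount -t proc proc /proc; exec /bin/sh'
```

The shell started this way sees its own process tree, hostname, and (empty) network stack, while still running on the host's single kernel.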
In more detail: the root file system is isolated on the Linux kernel by using chroot, resources are controlled by using cgroups, and resources are hidden from other containers by using namespaces. Linux containers have strong points in an embedded environment. A Linux container requires only a Linux BSP — no extra BSP such as a hypervisor or another virtualization method. There is no need for additional virtualization driver development, because each driver can easily be shared between guests. And you can integrate a container-based system on top of a hypervisor: the container integration sits on the hypervisor, with low-layer isolation provided by the hypervisor and high-level isolation provided by containers. It's easy to integrate.

Now the technical details of QM isolation — more detail of the integration. For software stack isolation, we think the root file system should be separated between the instrument cluster and the IVI; that's the first concept of QM isolation. The instrument cluster side is built from high-quality-assurance software and will not change after SOP (start of production) except for critical fixes. On the other hand, the IVI is built from standard software, and it will change after SOP to pick up upgraded functions. A typical instrument cluster software stack does not include Wi-Fi functionality and has few security issues — it's low risk. The IVI side, on the other hand, typically has Wi-Fi and other outside communication methods, which cause many cybersecurity issues. Linux containers realize this root file system isolation, which means it's easy to upgrade the IVI side's userland binaries.

Next is computing resource isolation: CPU-shielding-based isolation using the cpuset cgroup. It realizes a scheduling domain isolated from the instrument cluster container's neighbors, so we can use real-time scheduling isolated from the other containers. The details are shown in this figure: the cluster's budget is assigned as CPU cores using the cpuset cgroup, and the IVI side is budgeted cores two to five. And then, memory resource isolation.
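The cpuset-based CPU shielding just described can be expressed directly against the cgroup filesystem. A rough sketch, assuming a cgroup-v1 layout mounted at /sys/fs/cgroup/cpuset and the core split from the slide (cluster on one core, IVI on cores 2–5); the group names and PID variables are illustrative, and everything must run as root:

```shell
# Create one cpuset group per guest and pin each to its core budget.
mkdir -p /sys/fs/cgroup/cpuset/cluster /sys/fs/cgroup/cpuset/ivi

echo 0   > /sys/fs/cgroup/cpuset/cluster/cpuset.cpus   # cluster: core 0 only
echo 2-5 > /sys/fs/cgroup/cpuset/ivi/cpuset.cpus       # IVI: cores 2-5

# cpuset also requires a memory node, even when only CPUs are shielded
echo 0 > /sys/fs/cgroup/cpuset/cluster/cpuset.mems
echo 0 > /sys/fs/cgroup/cpuset/ivi/cpuset.mems

# Move each container's init task (and thus its descendants) into its set
echo "$CLUSTER_INIT_PID" > /sys/fs/cgroup/cpuset/cluster/tasks
echo "$IVI_INIT_PID"     > /sys/fs/cgroup/cpuset/ivi/tasks
```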
The memory budget is guaranteed using the memory cgroup; the details are shown in this figure. A strong point of Linux containers is that it's easy to rebalance these budgets.

Next is display isolation and sharing. Cluster and IVI have separate requirements: the cluster UI requires a high frame rate, while on the IVI side it is difficult to control the drawing timing of each of the apps. So the QM isolation technology selected a separate-compositor architecture: the cluster side has one compositor and the IVI side has another. This realizes isolated compositor frame-update timing between cluster and IVI.

The final point is hiding resources from other containers. The IVI container has connectivity devices such as Bluetooth, Wi-Fi, LTE, and more. From a cybersecurity point of view, these devices should be inserted into that container and hidden from the other containers. This is easy to realize using network namespaces, and mount namespaces realize the hiding of character and block devices. That's a strong point of Linux containers.

The third part is the case study of resource control — more detail on the technical concept section. The target use case: the instrument cluster is the most safety-critical part of this system. It shall be possible to use real-time scheduling inside the container, and the internal priorities of a container should not affect other containers. The CPU budget shall be protected from CPU overload in the other containers — that means if the IVI container side overloads, the cluster container's CPU budget shall still be protected. And the system shall keep a stable operating environment, avoiding core migration in a heterogeneous CPU topology.

This figure shows the evaluation environment. It uses the R-Car H3, the AGL reference hardware. In this case it has four high-performance cores and four low-performance cores.
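Hiding the IVI connectivity devices the way the talk describes can be sketched with the iproute2 and iw tools. The interface and namespace names are illustrative, and a container runtime would normally set this up for you:

```shell
# Create a network namespace for the IVI guest.
ip netns add ivi

# Ethernet-style interfaces are moved with ip(8); once moved, the
# device vanishes from the host namespace and every other container.
ip link set eth1 netns ivi

# A mac80211 Wi-Fi device is moved by its phy with iw(8) instead.
iw phy phy0 set netns name ivi

# Only processes inside the "ivi" namespace can see these devices now.
ip netns exec ivi ip link show
```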
In this case, tasks should be pinned to either the high-performance or the low-performance cores to keep a stable operating environment.

Our design goals are as shown. The first requirement is that it shall be possible to use real-time scheduling inside the container, so the development item is real-time capability support. Why is this needed? The instrument cluster requires real-time scheduling to keep its frame update rate, and the IVI side requires real-time scheduling for, for example, audio management and mixing, because that is a typically high-rate, low-latency operation that needs real-time priorities. The existing knowledge for this design: the Linux kernel already supports real-time scheduling, and in a container environment it is possible to use real-time group scheduling, which realizes real-time budget control. This functionality is provided by the cpu subsystem of cgroups.

The next point. The requirement is that the CPU budget shall be protected from CPU overload in other containers, and a stable operating environment shall be kept. The development item here is CPU shielding support. Why is this needed? The instrument cluster functions must be protected from other functions in case another container overloads — due to a DoS attack, bugs, or other issues. Also, when an execution context migrates from a high-performance core to a low-performance core in a heterogeneous CPU topology, its execution time gets worse. The existing knowledge: the easiest way to control computing resources is CPU core binding — binding CPU cores to each guest container. It enables a per-container real-time priority design and does not require a system-wide priority design, because the system's computing resources are divided and isolated per core by the scheduler: a normal-priority thread in the cluster is not blocked by a high-priority thread in another container. This functionality is provided by the cpuset subsystem of cgroups.
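Giving a thread a real-time priority as discussed — say the cluster's frame renderer or the IVI audio mixer — uses the standard SCHED_FIFO/SCHED_RR policies. A sketch using the util-linux chrt tool; the program name is a hypothetical placeholder:

```shell
# Start the (hypothetical) cluster renderer under SCHED_FIFO, priority 50.
# Inside a container this only works if the container is allowed to use
# real-time scheduling at all -- which is exactly what the rest of this
# case study is about.
chrt --fifo 50 ./cluster-renderer &

# Inspect the scheduling policy and priority of the running task.
chrt -p "$!"
```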
This slide shows the details of the key technologies. What is real-time group scheduling? It's part of the cgroups feature set and depends on a kernel configuration option. This feature allocates real-time CPU time to each real-time group. For example, the total CPU time is the parent group — 100% of the system — and this figure defines three subgroups: 50% is allocated to real-time group one and 30% to real-time group two. It's easy to configure. The allocated time can be used exclusively by its own group; the other real-time groups cannot use that budgeted time. But it has a limitation: the total of the real-time budgets must be set lower than 100%. For example, group one at 90% and group two at 10% is okay, but group two at 50% would not be. That's a restriction of this feature.

Next is the CPU shielding technology. What is cpuset? It requires its own kernel configuration option and is part of the cgroups feature set. It provides a mechanism for assigning a set of CPUs and memory nodes to a set of tasks. This case study uses only the CPU side. For example, with a total of four cores, one core is assigned to cpuset group one and two cores are set in cpuset group two. In the case of an LXC-based container, a cpuset group is created per container.

This slide shows the resource assignment design goal. For core assignment, one high-performance core is assigned to the instrument cluster side and two high-performance cores to the IVI side; the other cores we don't care about here. For the real-time budget, the instrument cluster gets a real-time budget of 12.5% of the system, shown here, which aims to fully use one core's resources. The IVI gets a real-time budget of 25% of the system, aiming to fully use two cores. And the core assignment is as shown. This is our initial configuration.
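Per-container cpuset groups like these are normally expressed in the LXC container configuration rather than written by hand. An illustrative excerpt using LXC's generic lxc.cgroup.* keys (cgroup-v1 names; a cgroup-v2 host uses lxc.cgroup2.* instead), with core numbers chosen only for illustration:

```
# cluster container config: pin to one high-performance core
lxc.cgroup.cpuset.cpus = 0
lxc.cgroup.cpuset.mems = 0
```

```
# ivi container config: pin to two high-performance cores
lxc.cgroup.cpuset.cpus = 1-2
lxc.cgroup.cpuset.mems = 0
```

LXC creates the per-container cpuset group at container start and applies these values before the guest's init runs.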
So CPU core zero is assigned to the cluster, and the IVI gets two cores. The kernel's real-time parameters (sched_rt_runtime_us) are set as shown, and the real-time budget is distributed with these parameters.

Then we evaluated the initial configuration. The condition: a real-time infinite-loop thread runs in the cluster just to consume CPU, and we get the CPU usage from the procps top command. The first result, test case one, is this one: the RT infinite-loop thread was limited to about 12.5% CPU usage — it could not use 100%. The second test case is shown here: if the real-time budget is fully assigned to the instrument cluster side — with this configuration — the infinite-loop thread can use 100% of one CPU, but the other containers cannot use any real-time priority budget, because of the limitation of RT group scheduling.

The issue in this evaluation result: the maximum RT budget across the cgroup hierarchy is 100%, and the total of all guests' budgets must be set to less than 100%. So when the cluster guest requires a 100% budget of one CPU, the other guests can't use real-time scheduling at all.

Consideration of the evaluation result: we expected the cpu cgroup's RT runtime setting to mean the total runtime across all cores, and we expected the kernel's sched_rt_runtime_us to indicate the total runtime across all cores as well. But the actual behavior is different: the RT runtime indicates the runtime per core, and the limit is likewise applied per core. That means RT group scheduling conflicts with the CPU shielding design: with the CPU shielding design, an RT budget can only ever be consumed on the cores belonging to one's own guest container.

So how do we fix this issue? Consideration of RT group scheduling: when it is not used in combination with cpuset CPU binding, it is a good solution for RT scheduling configuration — the per-core runtime restriction effectively equals a system-wide runtime restriction, achieved in combination with core migration. The details are shown in this figure.
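The budget constraint behind this issue — the per-period RT runtimes of all groups must fit inside one period — can be illustrated with plain arithmetic. The values mirror the examples from the slides (90% + 10% fits, 90% + 50% does not); nothing here touches a real cgroup:

```shell
period_us=1000000                       # one RT scheduling period (1 s)

# Example A from the talk: 90% + 10% still fits in the period.
total_a=$((900000 + 100000))
[ "$total_a" -le "$period_us" ] && verdict_a=ok || verdict_a=over

# Example B: 90% + 50% exceeds the period, so the kernel rejects it.
total_b=$((900000 + 500000))
[ "$total_b" -le "$period_us" ] && verdict_b=ok || verdict_b=over

echo "$verdict_a $verdict_b"            # prints: ok over
```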
When it is used in combination with cpuset CPU binding, it is not a good solution for the RT scheduling configuration, because with the CPU shielding technology, RT threads cannot migrate to other cores. The details are shown in this figure. As a result, our new strategy is to not use the RT group scheduling feature at all.

This is the second configuration of our design. The core assignment is similar to the previous design. For the real-time budget, we only set the system-wide limitation and disable the RT group scheduling feature — so there is no per-container real-time scheduling restriction, and we are free from the RT group scheduling restriction that all subgroup budgets must total less than 100%.

Then we evaluated the second configuration; the conditions are similar to the previous evaluation. The result of test case three: the RT infinite-loop thread could use 100% of its CPU, and the IVI side was similar. In test case four I used SCHED_RR-based infinite loops, because I wanted to run two real-time-scheduled threads: each RT infinite-loop thread could use about half of one CPU — roughly 50% each. And these results show that the RT infinite-loop threads did not affect the other guests. The evaluation result: this configuration realizes our design goal.

Consideration of this evaluation result — the expected behavior of the new design, as shown: all CPUs are restricted only by the kernel's sched_rt_period_us and sched_rt_runtime_us, and each guest container's configuration is free from the RT group scheduling restriction, which means it is possible to set a 100% real-time budget for each guest container. The actual behavior of the new design matched: the cluster guest and the other guests can each use 100% of their own CPU cores.

A final consideration: after disabling CONFIG_RT_GROUP_SCHED, the guest container configuration is free from the RT group scheduling restriction.
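The second configuration's system-wide-only limit corresponds to the two global RT throttling sysctls; the per-cgroup RT knobs disappear once CONFIG_RT_GROUP_SCHED is off. A sketch using the kernel's default 95% cap — the exact rate is system-specific:

```shell
# Kernel built WITHOUT CONFIG_RT_GROUP_SCHED: only the global RT
# throttling knobs remain.  950000/1000000 means up to 95% of every
# core may run real-time tasks; the rest is reserved for non-RT work.
sysctl -w kernel.sched_rt_period_us=1000000
sysctl -w kernel.sched_rt_runtime_us=950000
```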
And one guest does not affect the other guests, thanks to the cpuset CPU binding.

This slide summarizes the case study of this development. The target use case is to protect CPU resources for the instrument cluster from other functions, and to provide a stable runtime environment to the instrument cluster. Our design goal was CPU core isolation with real-time capability for each guest container. The issue was that we could not realize the design goal with a design based on existing knowledge: RT group scheduling conflicts with the CPU shielding design. The new knowledge from this development: when you want to use real-time scheduling in two or more guest containers together with a CPU shielding design, the better solution is to disable CONFIG_RT_GROUP_SCHED. And when you use this design, we recommend setting the kernel's real-time restriction rate to 99.5% or lower to avoid some kernel errors: during this evaluation, some kernel errors came out of the slab allocator, so we needed to decrease the real-time CPU usage limit by 0.5%. That rate depends on your final system design.

The final slide shows the conclusion. In this presentation I talked about the background and target systems of the AGL instrument cluster, and shared our QM isolation concept. I shared the case study of CPU resource control based on the QM isolation concept: we defined the design goal, designed and evaluated, considered the results, improved the design, and evaluated again — that was the content of this part. Our future work is to continue working on how to realize use-case support and improvements for embedded Linux container usage. One more note: there is other development work, such as USB device management in a Linux container environment, character device support, and more device support work — please check my AGL All Member Meeting presentation from this spring; this slide shows the link. That's all. Thank you for joining my presentation. Any questions or comments?

Moderator: Yeah, please wait for the microphone.

Audience: Hi. Thanks for the presentation.
Have you compared using the GPU instead of, you know, separating one core for the display — for the critical display system? I was thinking, if you use Vulkan or something like that, the GPU could update the display all the time and you could just feed new parameters to the GPU as needed, so you wouldn't have to wiggle with the CPU resources.

Dr. Y: Yeah, the GPU — that's a good question. Currently we work on the CPU and — please wait — we are developing the DRM lease technology; it's already upstream in AGL. So display device separation is a reality, but GPU separation is highly dependent on the GPU architecture. For example, if your SoC has two GPUs, it's easy to separate the drawing onto one GPU or the other GPU's resources and protect against overload — but a typical SoC has only one GPU. In that case we need to use virtualization technology, and that point needs extra development. In the Renesas case — in our trial development case — the GPU has a GPU command priority mechanism. I tried to use that and evaluated it, but currently I can't provide more detail, because NDA-based information is included. So you need to check the documentation of your GPU.

Audience: Yes, hello. Thank you for the presentation. I wanted to ask about — you mentioned that you use namespaces for resource isolation between containers, but if a container has access to a kernel module, doesn't that also affect — couldn't it also affect — other containers?

Dr. Y: Yeah, so namespaces make it easy to divide resources. Network namespaces completely hide things from the other containers, because a typical container design does not share a network namespace. If you use a network namespace, you can separate the Wi-Fi device — it's a network device — and other network-based devices.
But not all network devices support network namespaces. One of the issues is BlueZ: it does not support network namespaces.

Audience: Okay, so it only divides the resources by namespace — the other containers will not be able to access the device — but the kernel module itself could actually access memory from other containers, right?

Dr. Y: Your question is about device separation — whether my design does device separation only with namespaces or not?

Audience: Is the character device only visible to one container?

Dr. Y: Yeah, yes, that's the typical case. That's right. For example, in the GPU case, the device is used by the two containers.

Audience: Yeah, okay, thank you.

Moderator: All right, thank you. Thank you.