Thank you, everyone, for coming. We're going to talk a little bit about a project we've been working on at Red Hat as part of the automotive program, which has to do with managing services across different nodes. A few things to know first: I'm Pierre-Yves Chibon, aka pingou, and this is Michael; I'll let him introduce himself later. We have a logo: the COW team is the Containers on Wheels team, and you can see the logo at the top right. And because we have a logo, we have stickers. If you have questions at the end we'll hand out stickers, and if you just stick around we may still give you stickers if you want them.

So, without further ado, what are we going to talk about? I'm going to give you a bit of an introduction, setting up the context of the challenges facing the automotive industry and what led us to this multi-node service controller.

The challenges of the automotive industry are multifaceted. One facet is that it's a very, very competitive landscape. There are many car manufacturers around the world, and some of them are very old. Take Peugeot: Peugeot was founded in 1810, before the first car was even created. Peugeot was originally a kitchen appliance company; they built kitchen tools, which is also why you can still find Peugeot-branded salt and pepper grinders in some places. Then, toward the end of that century, they started working on their first car, in 1889, and Peugeot is worth about 50 billion dollars today. If you look at General Motors, which is the second car manufacturer worldwide, it was founded in 1908 and it's worth 47 billion.
So if we look at numbers two, three and four from the bottom here, we also have Ford, a very famous and pretty old automotive company that we all know from our history books and economics classes, because of the way Ford industrialized automobile production. These three all weigh about the same in market capitalization.

Then we have Toyota. Toyota is today the first car manufacturer in the world by number of cars produced per year, and it was created in 1937. Toyota is worth about four times as much as the previous three.

But then we have the young kid on the block, and that's Tesla. Tesla is barely 20 years old; it's not even old enough to drink legally in the US. And yet it weighs nearly 12 times as much as Ford, which was created 100 years earlier. So Tesla is doing something in the automotive industry that is changing the landscape; it's building something the market believes in. That is a challenge for the old automotive companies, because they suddenly realize: there's a kid on the block who just appeared and already weighs 10 times as much as I do, and I've been here for a long time, I'm an old-timer. That's a challenge they need to address: what does Tesla do that we don't, and that puts them in this place?
So that's something they need to look into.

Something else that has changed over the last few years is COVID-19. As much as we'd like this to be over, the ripples of that pandemic are still around, and one of the side effects was the chip shortage, which we're still recovering from. I don't have a source for this, but I heard at some point that during the pandemic car manufacturers could have sold twice as many cars as they did, except they couldn't produce them. They had the salespeople, they had the customers; they just couldn't produce the product because of the chip shortage. So that leads to some decisions that need to be made.

Something else that has changed for the industry is user expectations. We no longer have the same relationship with our IT systems as we used to, and one of the reasons is simply our smartphones. You update your smartphone. Apple is well known for updating the operating system over the lifetime of the hardware. Samsung announced a few years ago that they now support their hardware for five years. The expectation about support, the expectation about the lifecycle of our hardware, has changed. We want updates, we want features; we expect them. And when you get into a brand new car and realize that it has information in it that is older than what's in a car that was bought new before it, simply because the program for car number two was started before the program for car number one, something's wrong. I actually witnessed this myself.
A friend of mine bought a fairly recent, brand new car, and its GPS was older than the one in my car, which I had bought new a few years before. It's just that the model year of the car he bought was older than the model year of the car I bought, even though his was newer out of the factory. So user expectations have changed.

Something else the automotive industry is looking at is diversification of revenue. When you sell a car you have no guarantee: you sell it, and you have income one time. But how can we make that income persist over time? How can I make more money from a single car?

There are a few ways the industry is looking into all of these problems. One of them is: okay, we have a chip shortage, so we need to revise how we build our cars, which means revising what the onboard computing systems look like. This is a slide from NXP, which I found in one of their publications, and it shows what we have today. It's called a domain architecture, where you have a lot of small, distinct compute units across the car; a modern car can have about a hundred different computing units. When you have a shortage of chips, you can understand that building a hundred distinct computing units into a car is a problem.
So what they are looking into is what's called a zonal architecture. The idea is that you have fewer distinct ECUs that are dedicated to one specific thing, and more big ECUs that are capable of handling the tasks of several of these discrete ECUs. So you have less hardware, but bigger, more powerful hardware, and also hardware that can potentially evolve over time: hardware that is no longer designed to do exactly this one task, but is produced with the idea that it may be doing something else in the future. So the architecture of the vehicle is being worked on.

But the hardware is only one part of the story; the software is the other part. There is this concept called the software-defined vehicle, and the idea is that by changing the software in the car we are able to change the experience of the driver. We are able to make the car evolve during its lifetime, but we're also able to customize it to the user's desires, wishes and needs. So for the software-defined vehicle, revising the architecture and the hardware is part one; revising how we approach software in the car is part two.

So what's the vision? The vision ends up being something very similar to what we have on our smartphones. We want to be able to do software updates, and we want to do them over the air, just like you update your Android phone or your iPhone simply by connecting it to Wi-Fi. You want to be able to update the hardware; well, in a car that is still going to mean taking the car to the dealer and potentially getting a new, more powerful computer put in it, just like you go to your phone store and change your phone. You want to be able to have
applications, and to install these applications dynamically. These applications may give you new capabilities and new features. If you look at Tesla, a few years ago there was an update that increased the horsepower of the engine. Basically, you bought the car at 150 horsepower, you updated the software, and your car was at 155 horsepower, simply because they were able to optimize the way the software works with the engine. Just by changing the software they actually changed the physical capabilities of the car: suddenly the car goes faster, the car is more powerful.

New features and new capabilities can also be something fun. Say you've missed the payment on your car this month, so the car drives itself back to the garage. Who would not want that?

Customization and building an experience is a lot more interesting. You're able to customize the experience based on the driver's preferences, which means you can build habits. When the time comes to change your car, you're going to look for the habits you've built in your car, and that is how you build loyalty to a certain brand.

So how do we get there?
As I've already mentioned, we simplify the hardware. We also want to standardize, and that's where the Red Hat In-Vehicle OS becomes interesting. Red Hat has always been very strong on standards, and open standards in particular, so having an operating system that relies on these standards actually helps when developing on top of it.

Then we spoke about customization and applications, so basically we spoke about containers. We want containers for process isolation. That's been covered a little this morning: being able to ensure that processes do not interact with each other when they should not, that they don't impact one another. Containers also come with a specific lifecycle management that we're used to: we can install, update and remove containers. And let's be honest, containers are the de facto standard in our industry now, which means talent acquisition becomes easier for car manufacturers: new hires have to learn the specifics of the automotive product, but they don't have to relearn the entire ecosystem around it.

When we speak about containers, with an s, we want to speak about orchestration as well, and today when you think about container orchestration you're practically speaking about Kubernetes. So I'm going to address the elephant in the room: do we want Kubernetes in a car?
And I already see someone saying: don't spoil it, leave the answer to other people. Okay, I'm going to spoil it: we don't want Kubernetes in the car, and there are a few reasons for that.

One of them is that Kubernetes is built around the concept of eventual consistency, which means there is no guarantee of when a change will be applied, and also no guarantee of the order in which changes will be applied. The extreme example we always take is technically not a good one, but it helps to understand: you don't want to drive a car where you press the brake pedal and, eventually, your car will brake. That is not an experience you want. It's a bad example because we're not actually going to be involved in the brake systems, but it gives the idea: you don't want a car that is working towards a state. You want to be either in state A or in state B; you don't want to be somewhere along the journey without knowing exactly where.

Something else is that Kubernetes is fairly heavyweight. Kubernetes and its derivatives were built around a container runtime that, at the time, was not able to signal when something had changed. So Kubernetes was built around the idea of: you can't give me the information, so I'm going to go and get it. So it's always asking, and at some point that takes resources. A consortium did some investigation on that, on a Raspberry Pi; granted, it's a fairly low-end device, but 15 to 20 percent of the system resources were used by Kubernetes doing nothing, just being there and poking the container runtime: are you done yet? Are you done yet? Are you done yet?

Kubernetes is meant for distributed systems. It's great for cloud environments.
It's great for worldwide distributed systems, but that also makes it a very complex system, and a lot of that complexity is not needed in a car. Take on-demand scaling, scaling out: it's Black Friday in the US, my store suddenly has a lot of incoming sales, and being able to scale out to a public cloud to get more resources and accommodate this sudden influx of data makes sense. But in a car you're not going to have that sudden influx of data. The amount of data you get from the rear-view camera is always going to be the same. So scaling out is simply not needed in a car. A lot of the complexity that makes Kubernetes great in a number of environments, especially the cloud, is just not applicable in a car.

Another example is failures. Kubernetes is designed around the idea that if a pod fails, it's going to do its best to keep things working. When something fails in a car, you don't want everything else in the car to just keep working. You want to know about it, and you want to tell the driver to take over the driving and put the car on the side of the road. You don't want "I'll keep working as best I can; eventually, well, I can't detect pedestrians anymore, but that'll be all right." That's not acceptable in a car.

So what do we need? We need something that's deterministic: we need to know what runs, where it runs and when it runs. We need something that is lightweight: we have a resource-constrained environment.
We need something that is fast. There's a user expectation: when you're driving your car you want things to work, you want things to start, you don't want to wait for things to become available. And it's a small bullet point here, but it's actually one of the core elements of what we're looking into: functional safety, or FuSa. Functional safety certification is basically certifying that your code does what you claim it does: that if I ask it to write certain content to a file, it's going to write that content to that file. And the more complex the system is, the harder the functional safety certification process is going to be, because you have to go through every function used in your code base and ensure that it does exactly what you say it does.

So to answer all of these problems we've worked on something called hirte, and I'll let Michael introduce it to you.

Thanks, Pierre. So, as already mentioned, hirte is our answer to this. The basic idea behind it is to use systemd to control local services on one machine, and by adding a thin layer on top of it we are able to manage those units remotely. It's important to note here that we don't want to manage any state.
We are just the facilitator of this management. Our approach is to build a system that consists of basically two components. The controlling component, which we call hirte, runs on the main machine and controls all the connected agents. The hirte agent runs on each managed node; it gets the commands from hirte and forwards them to systemd. So we are able to remotely start, stop and generally control units on remote machines.

We decided to implement this in C, considering those constraints: to be as fast and lightweight as possible and, hopefully in the future, to be FuSa certified. As the IPC mechanism we chose D-Bus, since it is already used in systemd.

If you're wondering how exactly this is set up in hirte, I'm going to show you. hirte runs on the main node, which you see here on the left. On startup it reads a configuration file where we can specify all kinds of settings, for example the port on which it listens for new connections. It then connects itself to the local system bus and provides a public API on it.
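As a rough illustration of the startup step just described, and not the project's actual file format, here is a minimal Python sketch of a controller reading its listening port from a configuration file. The section name `controller`, the keys `port` and `expected_nodes`, the port number and the node names are all made-up placeholders; the list of expected nodes is an assumption added for illustration.

```python
import configparser

# Hypothetical configuration text. Section name, key names and values are
# illustrative assumptions, not the real format used by the project.
EXAMPLE_CONFIG = """
[controller]
port = 2020
expected_nodes = laptop, pi
"""


def load_controller_config(text):
    """Return (port, node_names) parsed from INI-style configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["controller"]

    # getint() returns None here when the key is absent.
    port = section.getint("port")
    if port is None or not 0 < port < 65536:
        raise ValueError(f"missing or invalid port: {port!r}")

    # Split the comma-separated node list and drop empty entries.
    raw_nodes = section.get("expected_nodes", "")
    nodes = [name.strip() for name in raw_nodes.split(",") if name.strip()]
    return port, nodes


if __name__ == "__main__":
    port, nodes = load_controller_config(EXAMPLE_CONFIG)
    print(f"listening on port {port}, expecting nodes: {nodes}")
```

Validating the port at load time means a bad configuration fails at startup rather than later, when an agent tries to connect.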
Other external applications, for example a state manager, can use this API to control the whole system. We've already implemented something like that with hirtectl, which is similar to systemctl from systemd, but for a multi-node use case.

On the other side we have the managed node, where a hirte agent is running. It again reads some configuration, with settings like the IP address of the main node, and the agent connects itself to the local systemd D-Bus via a Unix domain socket. With this the agent is already able to control services on the managed node. The agent then connects itself to hirte based on the settings we specified; namely, it issues a connection request over TCP/IP, and hirte responds by creating a peer-to-peer D-Bus connection. This peer-to-peer D-Bus connection is used exclusively between hirte and the respective agent; more on that later. In addition, when the agent registers, hirte does a lookup and checks whether it can find the node name. If it can't find the node name, it rejects the whole connection request; if it can, it accepts the connection.

With that we are already able to control units on remote nodes, and of course we can scale up from one to n nodes, all of which we specify upfront in this configuration file. And as you can see here on the left side, on the main node we can of course run the hirte agent alongside hirte, with basically the same mechanism.

One question that might already arise is how we deal with cross-node dependencies, like in this example. Consider this setup: on the left side we have a node foo, on the right side a node bar, and both are connected to hirte. Now we want to start the cow service on node foo. What we don't know yet is that the cow service requires the sheep service to run on node bar. So how could we resolve
that kind of dependency?

For one, as I said, we could use an external state manager that uses the hirte API but already knows that the cow service requires the sheep service. It would first start the sheep service, wait for it to run, and then start the cow service. What we added, however, is a feature, the so-called proxy feature, to push this dependency resolution to systemd, so that a developer can define at development time: okay, I need the sheep service to run on node bar for the cow service.

It works roughly like this. You see here on the lower left side that the cow service requires a so-called template unit. This is a unit file that hirte provides to the developer, where you can pass in a tuple after the @ symbol: node, underscore, unit. So you specify the name of the unit that you require and the node that you expect this unit to run on. In our case, and that's actually a mistake on the slide, it should be the sheep service here, so it should read hirte-proxy@bar_sheep.service. Please substitute that.

This template unit then takes that and passes it to a small binary, hirte-proxy, which in turn separates those two input parameters and does an API call to the agent. It's important to note that this API call is blocking: the proxy waits for the whole flow, so it can distinguish between a successful and a failed setup, and this can then be reflected in the cow service.

The agent forwards that request to hirte. hirte now knows on which node to run which unit, so it creates a start request on node bar in our example, and it starts another template unit. We don't want to start the unit directly, because then we would limit the ability of the developer. By putting a template unit in between, the developer has all the freedom to specify and define the sheep service.
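Going back to the step where the proxy separates its two input parameters, the idea can be pictured with a short Python sketch. This is only an illustration, not the project's C implementation: the function name `parse_proxy_unit` and the `ProxyTarget` type are hypothetical, and splitting on the first underscore assumes that node names never contain an underscore.

```python
from typing import NamedTuple

# Prefix of the template unit as spelled out in the talk:
# hirte-proxy@<node>_<unit>
PROXY_PREFIX = "hirte-proxy@"


class ProxyTarget(NamedTuple):
    node: str  # node the required unit is expected to run on
    unit: str  # name of the required unit


def parse_proxy_unit(name: str) -> ProxyTarget:
    """Split a proxy unit name into its (node, unit) parts.

    Assumption: the node name contains no underscore, so the first
    underscore after the '@' separates the node from the unit.
    """
    if not name.startswith(PROXY_PREFIX):
        raise ValueError(f"not a proxy unit: {name!r}")
    instance = name[len(PROXY_PREFIX):]
    node, separator, unit = instance.partition("_")
    if not separator or not node or not unit:
        raise ValueError(f"expected <node>_<unit> after '@': {name!r}")
    return ProxyTarget(node=node, unit=unit)


if __name__ == "__main__":
    print(parse_proxy_unit("hirte-proxy@bar_sheep.service"))
```

Splitting on the first underscore rather than the last keeps unit names that themselves contain underscores intact, at the cost of forbidding underscores in node names.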
However they want it to run, this template unit in turn has only a very weak dependency on the requested unit, meaning that if it's not already running we will start it, but if it's already started then we don't care; basically nothing happens. It's important to note, however, that hirte keeps track of all the references on the node/unit tuple, so we know how many references there are. With this setup we can already resolve this dependency just by using existing systemd features, and we can do so at development time, so to say.

Now I have a few examples to show you, which I pre-recorded. For these examples I used this setup: I started a Raspberry Pi with a hirte agent running on it, and I also ran a hirte agent on my laptop, both connecting to hirte, which was also running on my laptop. I interacted with the system using hirtectl.

The first thing that comes to mind is listing all the units on those nodes. In this example I queried all nodes for their units, running or not, and filtered them by name. In this specific case I wanted all units whose name contains dbus. So we see, for example, the dbus service and dbus socket on the laptop, which are active and running, but we also see some devices that are plugged.

We can also start and stop systemd units. In this case I first filtered for the specific cow service that I wanted to start, and we see that it is currently inactive and dead. Then I started the cow service on the pi with hirtectl start pi cow.service and queried it again.
We see it's active and running. Stopping follows the same procedure: hirtectl stop pi cow.service, list again, and it's inactive and dead again.

Before starting the cow service I had set up a monitor. You can imagine that those start and stop operations always involve some state changes internally on the systemd side. So what if we want to get notified of certain changes? We can set up a monitor with, for example, hirtectl monitor, and here I wanted to get all units on the pi. This example is cut, because otherwise it got too large, and just shows the state changes of the cow service. In the second one you see that it was in an inactive state, then it changed to activating, and finally it reached an active state. And of course I could now run different operations based on that if I need to.

The last example is similar: hirtectl monitor node-connection. As the name suggests, we can also monitor the state of the nodes. In this example you see that the laptop and the pi were online; then I stopped the agent on the Raspberry Pi, and this was immediately reflected in the monitor: the pi was reported as offline. This is especially useful if I want to monitor the health of my system and run some operations or fallbacks based on it.

And that's already it. Any questions?

So, systemd already has a built-in mechanism for remote connections, but that remote connection relies on SSH, and SSH is not something we want running in the car, because SSH can hardly be limited in scope. It gives you full shell access, which means that if someone plays around in the car they get shell access to one of the computing units, and that's not something we want. By using the agent and the controller we are able to expose, on one port only,
the management and control of services. You won't be able to install a rootkit through this, because you can only control systemd services. So that's part one.

The other question is about doing this in systemd itself. That's something we've been thinking about; we would love to have it in systemd proper. But there was a timing aspect: working with the systemd community to polish it and bring it up to systemd's standards would probably have taken more time than we had available to get this project into a usable state. I would still love to see that happen. But we also needed to figure out: is this what we wanted? We started with a concept, and we talked with the car manufacturers we're involved with: does this satisfy your needs? Do you see things that are missing? This presentation is also about us trying to figure out whether we are missing something in the picture. Did we forget something? There are always more brains in two heads than in one, so there's always a possibility that we've overlooked something, and if we had implemented this in systemd we would not have had the flexibility to change our approach if we needed to. So, to sum up my answer: we needed to validate our approach, we cannot use SSH, and doing it in systemd proper would be ideal.

The question is about leveraging systemd for dependency resolution, and that's exactly what we want to do. That's why we've implemented the proxy service, because we don't want to have to deal with figuring out the order: what can I start in parallel?
What needs to be sequential? systemd already has all of that logic and is built for it. So we want to leverage as much of systemd as we can, and we try to complement it rather than reimplement it.

Let me check that I understood the whole question. Basically, your question is whether we already have a project where we applied hirte, and whether we started by considering Kubernetes before we implemented it? Yes, we did, and we did the analysis I presented earlier. Despite a lot of people telling us "we want to run Kubernetes in cars", we concluded that Kubernetes is a great tool, but not for the specific environment that a car is: because it's complex, because it's heavyweight, because the complexity would make the functional safety analysis practically impossible, and because it's built around eventual consistency. All of these make it unsuitable for in-car use.

One more thing, and I think this will have to be the last question because we're out of time. With Kubernetes it's very hard to impose that certain payloads run on certain systems, and in the automotive world you need to be able to test the entirety of your system on the bench, in silico, before you actually put it in the car, so that you've covered all your bases. When I have a critical system, I need to ensure that this critical system always has enough resources. And the question of dynamism, "I want this container to run wherever there are resources", is something that is being considered, but we don't have a proper answer for it yet, because we still need to ensure that adding an extra container on a specific system is not going to interfere with critical
systems that are already running there. So the in-vehicle environment can be a lot more static than what Kubernetes is used to dealing with. They are really built for different worlds, and it's going to be hard to make Kubernetes work for [inaudible].

I think we're going to have to stop here, but we can finish outside. Thanks, guys.