And this session coming up is Antonio, who's going to be talking about building an operating system for the edge. If you just want to come up here, Antonio. Can you hear me? Yep, coming in loud and clear. No video though. Let me share my screen first. Can you see that? Cool, yep, it's working. Okay, so the camera is not working, I'm sorry for that. Well, I guess we can go without it. So assuming you can see my screen, the slides should be up.

First of all, welcome everybody, and thanks for joining this presentation. My name is Antonio, usually known as Branko online. I work at Red Hat, where I started out working on containers with Docker, and for about a year now I've been on the edge team, where, as this slide says, we're building an operating system for the edge.

The agenda for this talk is as follows. First we'll briefly introduce what edge and edge computing are. Then we'll cover the requirements we have to meet in order to operate at the edge, because they're different from the usual data center model. Then I'll dive into the most important technologies we're using today to enable this use case at the edge, and finally we'll put all of this together into what we call RHEL for Edge, and I'll give a brief demo of one of the key components of the whole stack.

So, edge. At a live conference I would have done a joke here. You can't click that link, but if you want, you can go on Google and type "edge", and that may give you a hint of what edge is or isn't. It would have been funnier live. Jokes aside, what is edge computing? Edge computing is computing that takes place at or near the physical location of either the user or the source of data.
That definition is spread across the internet, but it's intuitive to think about edge computing through use cases. Automotive, for instance: your car is a small device, well, a smallish device, that lets you be connected while you drive, so that's definitely an edge use case. You can think of street lights with a sensor to turn them off when the sun comes up and on when the sun goes down, or the weather sensors at the top of a mountain. All of these are edge use cases. Other great examples span from the 5G network, which I'm sure everybody has heard about, to, you name it, submarines, tanks, whatever. All of this is different from the data center: in edge computing, as the definition says, the computation takes place near the physical location of either the user or the source of data.

So there is this difference between the data center and the edge. In a data center, the first difference we can see is of course the hardware itself. In the data center we have big, really heavy servers, whereas at the edge we usually have single-board computers, tiny devices like a Raspberry Pi or a Fitlet2, and those usually don't have many resources. Data center servers have plenty of CPUs, more than one sometimes; plenty of network connectivity, like gigabit Ethernet directly connected to the rack; plenty of memory, not limited to one or two gigs. At the edge, again, we usually have devices with no more than two gigs, tops, I'd say, and often you work with much less. Storage is also a key factor for edge versus data center. In the data center, and in the cloud for that matter, you have effectively unlimited storage if you want to use it, whereas at the edge we don't have that much space to work with.
So, the requirements for the edge and for building an operating system for it. The first one is of course not directly related to the operating system itself: the hardware, as I said earlier, is going to be small and resources are going to be limited. The operating system cannot waste memory, ask the device to do CPU-intensive computation, or assume it has unlimited network connectivity that is always stable. None of that is available at the edge, or at least in the normal case we assume it's not available.

Security is also a key difference. In the data center there is usually an actual door to the room where the servers are racked: you need a badge, you swipe it somewhere, and then you're allowed in. There is access control in place, so security from that perspective is, let's say, high. At the edge, on the other hand, somebody can go to the top of the mountain, take that tiny device, and bring it home; if it's unencrypted, or if it's encrypted and they can break the encryption, they can read everything. So there's a huge difference here. At the edge you want to make sure your device is bulletproof, that it's secure and nobody can get into it. They can steal it physically, of course, but they won't be able to do anything with it.

Updates are also a key difference. In the data center, if something goes wrong, you send an IT person to go there, debug, perhaps retry the update, and at some point it will go just fine. At the edge, if an update has caused an unbootable system, you're pretty much done with it, because it's going to be super expensive to send somebody to the top of the mountain or some other remote location.
For this reason, you want updates and management in general to be as smooth and automated as possible, so that you just deploy the device and then manage it from a central location, and if something goes wrong there is something you can do that avoids sending somebody there.

Provisioning is more or less what I just said: it's zero-touch. What we mean by that is you have an edge device, a tiny device, and in order to provision it you just send it wherever it's needed, plug it in somewhere, and power it on. Everything is taken care of automatically, without any intervention, so you don't need to send an engineer to the remote site to do all the configuration. In this scenario you probably have plenty of tiny devices that you want to ship; it's not one huge server that you configure once, send to the data center, power on, configure, and then you're done. So this is also different.

Management is the last one I wanted to talk about. In the data center, or in the cloud for what it's worth, Kubernetes is probably the default way to run your workloads, where you have plenty of RAM, plenty of CPU, and the network is always up and fast. At the edge, this is not the case. Most of those devices are going to be connected via Wi-Fi, and there can be interruptions at any time; it's not as tight as a data center. So at the edge we're not at that point. I'm sure you've seen the MicroShift presentation; that's something that team is tackling for sure, and MicroShift itself is definitely tinier than a kubelet and can run in less than two gigs just fine.

So that's about the requirements that we have. Let's explore some of the technologies we're using. This is a small list.
Well, not that small, but it's a list of the key technologies we've decided to use in order to build this operating system for the edge. I'm going to explain them one by one, and then we'll bring all of this together in a demo.

The first technology, which I'm sure everybody's familiar with, is UEFI. For the past 10 years, UEFI has been the default on every Windows laptop, I think, and it definitely has advantages over the legacy BIOS: it's secure, and you can use the TPM. For us, there's one requirement I haven't mentioned yet. A normal scenario is that you have a hundred devices you want to install this operating system onto. If you're familiar with PXE, that has been the way to install onto more than one device in the past, but PXE is insecure, because it's TFTP, and it's slow. One advantage of UEFI over BIOS and PXE is that UEFI provides us with HTTP and HTTPS boot. That means that at a manufacturing plant, if you need to provision a thousand devices, you just bring up an HTTP server, deploy the kernel and the initrd there, and power on those devices; they automatically connect to this HTTP server and install. This was the reason we went all-in on UEFI: this new operating system is going to support UEFI only.

We have, of course, taken RHEL as the base for our operating system for the edge. There's very little to say here: RHEL is trusted by many in the data center, it's battle-tested, and it runs on almost any supported platform out there. So our task was to make it ready for the edge, to make it leaner and smaller so that it can fit on a tiny device that we can ship to the edge.
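To give an idea of what that UEFI HTTP boot setup amounts to on the server side, here is a minimal sketch. The directory, filenames, and port are illustrative, not the ones we actually ship; empty files stand in for the real kernel and initrd:

```shell
# Illustrative layout for a UEFI HTTP boot server: firmware configured for
# HTTP boot is pointed at the server and fetches the kernel and initrd
# over plain HTTP, so provisioning N devices needs just one web server.
mkdir -p /tmp/httpboot
: > /tmp/httpboot/vmlinuz        # stand-in for the real kernel
: > /tmp/httpboot/initrd.img     # stand-in for the real initrd
ls /tmp/httpboot
# A throwaway server for testing could then be, e.g.:
#   python3 -m http.server 8080 --directory /tmp/httpboot
```

The point is only that HTTP boot needs nothing more exotic than a web server, unlike the TFTP infrastructure PXE requires.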
The next technology we're using in this new operating system is OSTree and rpm-ostree. This is what OpenShift is also using via RHCOS, and what Fedora CoreOS uses, and it has the huge advantage of transactional, image-based upgrades. For us, the most important reason for choosing OSTree on RHEL was the ability to roll back your operating system. Updates are transactional: you stage them first and then deploy them at the next reboot. So if an update is available, the edge device fetches it and stages it, and at some point, when the management platform or the operators think it's time, they can just reboot the machine and it will boot into the new update. And with OSTree and rpm-ostree we can make our own derivatives, which is something we're also exploring for the edge.

Paired with that, there is a tool we're going to use called greenboot, which is a health check framework for systemd on rpm-ostree systems. What does this mean? It's tightly tied to OSTree and rpm-ostree from the previous slides, because if an update goes wrong, at the next boot greenboot lets us detect that and also take action on it: if the update went wrong, what we do is just boot into the old deployment, which was working. Along with that, we can do all sorts of things, like notifying a central management platform or sending an email to the IT department saying this went wrong, this device is not updated, you should take action on it.

The next piece I'm going to talk about is coreos-installer. coreos-installer is the way we're installing the operating system onto a device. On edge devices, our flow is to build a raw image and then flash it to the device's storage, which can be an SSD, a memory card, or whatever. coreos-installer fits these requirements perfectly.
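To make the greenboot idea concrete, here is a minimal sketch of what such a health check looks like. greenboot runs executable scripts from directories like `/etc/greenboot/check/required.d/`; in this sketch I write under `/tmp` instead, and the service name in the check is made up:

```shell
# Sketch of a greenboot-style required health check: any script in
# required.d that exits non-zero marks the boot unhealthy, and on an
# rpm-ostree system that can trigger a rollback to the previous deployment.
mkdir -p /tmp/greenboot/check/required.d
cat > /tmp/greenboot/check/required.d/01-service-up.sh <<'EOF'
#!/bin/sh
# Fail the boot if a required service is not running (name is an example).
systemctl is-active --quiet myapp.service || exit 1
EOF
chmod +x /tmp/greenboot/check/required.d/01-service-up.sh
ls /tmp/greenboot/check/required.d
```

The design point is that health is defined by whatever checks you drop into the directory, so "the update worked" can mean anything from "the kernel booted" to "my application answered a request".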
What we need is just a flashing mechanism, with optional compression support: you can take an xz-compressed image, run coreos-installer, and it will dd that image onto a device. We've also chosen coreos-installer because it has a dracut module that we've been developing, which means we can ship a tiny initrd-based system, based on RHEL of course, that runs coreos-installer, does this flashing for us, and just reboots. We've added encryption support so that it can work with encrypted raw images as well; I'm going to talk about encryption in a later slide. It has a systemd-based framework to integrate with, meaning it runs as a systemd service, so if there's anything we want to plug in around coreos-installer, whether before it or after it, systemd does that for us. And lastly, it has support for growing the root file system. Usually the raw images are on the order of one or two gigs in size, but maybe your device has a 40-gig or even bigger storage device, so after flashing you definitely want to grow the root file system to take up all the space on the device.

This brings us to Anaconda. Again, I'm sure everybody's familiar with this: Anaconda is the installer we're using today for RHEL and Fedora. It has a nice GUI, you can click here and there, configure the system, install it, reboot, and you have your system installed. There is also a way to do that unattended by using kickstarts, where you script your configuration and installation and it runs automatically. But it had some disadvantages for us. I wrote "old code" on the slide, and that's not meant to be offensive, of course, but Anaconda has been around forever, so it carries code that we don't want to bring into the system or the installer itself, because it can be a potential attack surface.
So we're not shipping Anaconda, for that reason. We needed something definitely smaller and leaner that does just what we need, which is flashing a raw image, doing some configuration, and powering off until somebody powers the device back on. We haven't created a new installer, that wasn't the intention; we're just not using Anaconda, and instead we went with coreos-installer. We did that by packaging coreos-installer into an artifact we call the simplified network installer. It's just a tiny initrd-based system, based on RHEL, that has coreos-installer in it. It boots only with UEFI, for the reasons I explained earlier; it won't boot on legacy BIOS systems, for security reasons, and this also allows us to leverage HTTP and HTTPS boot. So as I explained earlier, if you have a thousand devices, you would use the simplified network installer to provision them all at once: just powering on a device makes it reach the server and do the installation and configuration.

So the simplified network installer is this tiny initrd-based system, based on RHEL, with coreos-installer, and it also contains a raw image of the system we want to install, which is RHEL with OSTree, as explained earlier. And it's integrated with FIDO FDO; I'm going to talk about that in a moment, but it's the way we're going to provision and onboard the device onto a management platform and into its configuration. It comes as two artifacts. HTTP boot is the main scenario we're targeting with the simplified network installer, but we also ship the artifact as an ISO, so if you want to test it, you can just flash the ISO onto a USB stick, plug that into your Fitlet2, and run the installation, which is of course unattended, because nobody needs to attend it: it's completely automatic.
And this brings us to FIDO FDO. Yesterday Patrick, my colleague on the team, did a great presentation on FIDO FDO; I'm just skimming through it here to give you an idea of what it is. FDO is a FIDO Alliance IoT specification: FDO stands for FIDO Device Onboard, an automatic onboarding protocol for IoT devices. Device onboarding is the process of installing secrets and configuration data into a device. Again, go watch Patrick's session, because he explained this in deep detail. Device onboarding, as I said, is the process of installing secrets and configuration data into a device so that the device is able to connect and interact securely with, for instance, an IoT platform. The IoT platform is then used by the device owner to manage the device itself: patching security vulnerabilities, installing or updating software, retrieving sensor data, interacting with actuators, and whatnot. What's important for us is that FIDO Device Onboard is an automatic onboarding mechanism, meaning it's invoked autonomously and performs only limited, specific interactions with its environment to complete. That was, of course, one of the key requirements we had to meet. You can see at the bottom there is configuration; that's probably one of the most important things we need to do, because we're going to install the operating system and then we need to configure it to be able to run it.

This is a slide taken from the FIDO page that explains the flow of provisioning and onboarding with FDO. Assuming you can see my cursor: the device is built at the manufacturing plant, and at that point the manufacturer creates what's called a device credential. Then it packs the device up and ships it to a customer, or to a site where it has to be deployed.
The mind-blowing thing for me here is that once it's shipped, with a device credential already on the device itself, if you just power it on, the protocol takes care of registering the device with a device management system and doing the onboarding: you register the device, you ship some initial security updates, you create an SSH key for an administrator account, and whatnot. All of this is taken care of by FDO, FIDO Device Onboard. And then, of course, the device is ready to run at the edge.

As for the architecture of FDO itself, there are some pieces that need to be deployed and some pieces that live on the system itself. For the initial part at the manufacturing plant, there is a manufacturing server, and a client is run from the simplified installer, which asks the server to validate and create those device credentials. Then, on the client side, on the system itself, there is a client that uses those device credentials to authenticate: here there are a rendezvous server and an owner onboarding server on the central device management system, and that's where the device authenticates and receives its configuration, updates, and everything else. There's also a debug tool, just to make sure device credentials are correct in any debug-related situation. There is a link on this slide, which I'll share at the end. As I said earlier, there was a presentation yesterday from Patrick with a great overview of FDO, so I encourage everybody to go and have a look at that.

This is just a breakdown of what FDO does for us.
I think I've explained this already. Provisioning happens at the manufacturing site, when the installer runs: we get the credentials from the manufacturing server and store them either on the file system, which is what we're doing right now, or, in the future, in the TPM for security reasons. Then, at onboarding, those credentials are used to securely onboard the device.

This brings us to another key requirement of running an operating system at the edge, which is encryption, because we don't want the device to be stolen and have somebody rip out the storage and read the data, if there's sensitive data on the device. We don't want that, so what we're going to do is encrypt by default: we're not shipping this new operating system unencrypted. What we did was turn on full disk encryption using LUKS, and, this is not ready yet, but the aim is that at installation time we encrypt the device storage with a Clevis pin. If you're not familiar with Clevis, there are various pins you can use; among the most common are the TPM2 pin, the SSS pin, and the tang pin. The first thing we do at installation is flash a raw image which is encrypted with a null pin, meaning it has encryption, but of course it's open to everybody. Then at onboarding, which is when the configuration happens and the device may start to receive data that we don't want to leak, the device re-encrypts itself using a strong key, storing that in the TPM. For now it's the TPM, but I'm sure we'll come up with things even more secure than that, so that the device is fully encrypted at all times.

A question we often get is: what if the device breaks at some point? Well, that was the reason for doing full disk encryption.
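To make the two-phase flow concrete, here is roughly what it corresponds to in LUKS/Clevis terms. This is an illustration only, not our exact implementation: the device path is made up, and the commands assume root privileges and a TPM2, so they're not meant to be run as-is:

```shell
# Phase 1, at installation: the raw image ships already LUKS-formatted,
# but bound to a well-known null key, so flashing needs no secrets.
# Phase 2, at onboarding: rotate to a strong key and seal it in the TPM.
cryptsetup reencrypt --active-name root /dev/vda3   # rotate the volume key online
clevis luks bind -d /dev/vda3 tpm2 '{}'             # bind a new key to the TPM2 pin
```

The null-key trick is what lets the image be built and flashed generically while still guaranteeing that, by the time the device holds anything sensitive, the disk is protected by a key only that device's TPM can release.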
If it breaks, you can leave it at the top of the mountain; even if somebody steals it, cracking the encryption is going to take a very long time, so it's highly unlikely. And this is again a key difference from the data center scenario, where if your device breaks you can send somebody into the data center to debug it.

The next technology we're leveraging for this new operating system is osbuild-composer. It's where we put all of this together. osbuild-composer has this concept of pipelines, and what we did was create a pipeline that builds an OSTree-based system, commits it to an OSTree commit, deploys that into a raw image, and then puts that raw image onto an ISO together with the simplified network installer, so those artifacts are ready for users or customers to use. This is where everything comes together: the end result is usually an ISO containing the raw image and the simplified network installer, and you're ready to go install and deploy the system at the edge. Another advantage for us is that it's highly configurable in code: we just have to take care of a pipeline, defining packages and the configuration we need, like initial kernel arguments and things like that. So it works; we have an upstream version and our own downstream version, which allows everybody to run it and have their own internal flow for this.

And of course, we're going to use containers. The FIDO servers that we need, and I'm going to demo just one, the manufacturing server, plus the other two, the rendezvous server and the owner onboarding server: those three servers are going to run in containers, with Podman of course. We have plans to run everything in containers; that's a semi-joke, including Bash in the future. We may need to do that at some point.
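As a sketch of what "running the FDO servers in containers" looks like in practice, this is the general shape with Podman. The image name, port, and volume path here are hypothetical placeholders, not the actual artifacts we publish:

```shell
# Illustrative only: start a manufacturing server in a container,
# publishing its port and mounting its key material from the host.
podman run -d --name fdo-manufacturing \
    -p 8080:8080 \
    -v ./keys:/etc/fdo/keys:Z \
    quay.io/example/fdo-manufacturing-server:latest
```

Running each server as a container keeps the host footprint small and makes the servers easy to deploy on whatever infrastructure the manufacturer or owner already has.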
We may want to do that at some point, because we want to lower the attack surface on the device itself: if we can run everything in containers, that's definitely more secure and meets the security requirements we set ourselves earlier.

So that's a tour of the technologies we're using right now. Of course, this is going to evolve; the team has been working on this for over a year, and we'll keep working on it.

Now let's get to the demo. Hopefully nothing is going to break. What I want to show is the initial phase of the workflow I've explained so far. We need to build a simplified network installer; we'll see that it's going to be an ISO. As I explained, the ISO can be unpacked onto an HTTP web server, and then you can have UEFI devices connect to that and install. For the sake of the demo, and I'm sure it would break if I did the full thing, we're using the ISO directly. It still boots with UEFI only, and I'm going to use a virtual machine on my laptop for that.

First of all, as outlined on this slide, the first part happens at the manufacturing plant, where the devices are actually assembled and then provisioned. We need the simplified network installer as the means to put the system on the device itself, and the final piece we need is the manufacturing server. If we go here, and I'm sure this is super tiny and hard to read, what's running at the very bottom is the manufacturing server. It's running in a container, of course, and it's the server that takes care of creating and handling device credentials for a new device. Up here, we're using osbuild-composer to create the simplified network installer. I've created it already, because it takes 15 minutes and we don't have the whole day, but I want to explain how we use osbuild to build the simplified network installer.
It's just a matter of defining a TOML configuration file where we name it. I'm choosing RHEL 9 as the distribution to use, which is also the one I've used for the raw image itself, so for the actual system that gets installed on the device. You can see that the simplified network installer requires very little configuration. The first thing it has to know is the storage device to install to. Since I'm using a virtual machine, the target device I want to write the raw image to is /dev/vda, but this can be anything; on my Fitlet2 it's /dev/sdb, or whatever. For now this is something that's required up front. In the future we may experiment with automatic detection of a storage device or partition, which has turned out to be difficult for various reasons, but perhaps we'll explore that. Suffice it to say that right now you just need the target device to install to. Then at the bottom you can see the most important thing we have to configure: the manufacturing server URL, because the installation will ask this server to provide device credentials, authenticate against the server itself, and write those credentials to the installed system. The last line is a security mechanism to authenticate the manufacturing server. This demo is using HTTP, and we will have, I think, support for HTTPS too, but this hash right here is a way to verify the actual manufacturing server, because the protocol exchanges certificates, so that's just a security measure. You can disable it altogether, but for the demo we're leaving it in so that it's secure.

So I think what we can do now is look at the simplified installer; I'm not sure how well you can see this. I've mounted the simplified installer at /mnt/iso, and I want to show the size of the installer itself: it's 787 megs.
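For reference, the configuration file has roughly this shape. The field names follow osbuild-composer's blueprint format as I recall it and may differ in detail; the values are placeholders standing in for the ones in the demo:

```toml
name = "simplified-installer-demo"
version = "0.0.1"

[customizations]
# the storage device the raw image gets written to (/dev/sdb on a Fitlet2)
installation_device = "/dev/vda"

[customizations.fdo]
# the server the installer contacts to obtain device credentials
manufacturing_server_url = "http://10.0.0.2:8080"
# hash used to verify the manufacturing server's certificate
diun_pub_key_hash = "sha256:..."
```

That really is the whole surface area: a target device plus how to reach and verify the manufacturing server.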
So this fits even on a CD-ROM, though few devices have a CD-ROM drive anymore. Anyway, what I did was unpack it so we can see what's inside. The first thing we all notice is the compressed disk image, the .xz file: that's the actual operating system that gets flashed onto the device. As you can see it's compressed, but as I explained earlier, coreos-installer supports that, so there's no issue there. You can also see the usual directory structure that makes this bootable, and it's UEFI-only, as you can see. Here we have the initrd and the vmlinuz kernel; the initrd is where coreos-installer lives and where the FIDO FDO clients are going to run. The EFI folder contains the usual UEFI structure, and there is a grub.cfg we can look at to check out some of the options for this installation.

Of course, there is the kernel we have to boot. There is networking, which is required to reach the manufacturing server and do the initial device credential exchange. There is this crypt_root=1 argument; this is a stopgap for now, until we land a feature upstream so that the image is encrypted by default, and I'm going to drop it when we install. You can see that through the kernel arguments we also configure the target storage device, and we tell coreos-installer to grab this raw image. In the HTTP boot use case, there is an option which is not an image file but an image URL, and you can point it anywhere on the network to fetch a raw image from. In the HTTP boot case, the manufacturer, or whoever is installing the system, surely has an HTTP server they can point this option to, and the installer will fetch and install the raw image from there. This is all mirrored from the configuration. There is the manufacturing server URL; the insecure option is just for the demo, because all of this is over plain HTTP. And then there is the hash the installer uses for the security check against the manufacturing server.
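Pulling those options together, the kernel command line in that grub.cfg looks roughly like this. The argument names follow coreos-installer's and the FDO client's conventions as I know them, the values are placeholders, and in the HTTP boot case you would swap the image file for an image URL:

```
coreos.inst.crypt_root=1 \
coreos.inst.install_dev=/dev/vda \
coreos.inst.image_file=/run/media/iso/image.raw.xz \
fdo.manufacturing_server_url=http://10.0.0.2:8080 \
fdo.diun_pub_key_insecure=true
# HTTP boot variant: coreos.inst.image_url=http://10.0.0.2/image.raw.xz
```

Everything the unattended installation needs, the target device, the image location, and the manufacturing server, travels on the command line, which is why the blueprint above is all the configuration required.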
I think, yeah, this is the simplified installer, and what I want to show is just how easy it is to install a system. I have my fingers crossed, because it worked just fine ten minutes before this presentation. We're going to run, as I said, the virtual machine, and you can see that I'm using UEFI and the installer. So let's start this one, default on the... I think I can do this. Okay, I hope everybody can see this. I cannot zoom in on this virtual machine, but you can see text scrolling, and around here, where the cursor is, we can see the command line that has just been executed, which is coreos-installer. What this is doing is, as I explained, taking that compressed raw image, uncompressing it, and flashing it to the device. It's at around 50%. What comes next is that we reach out to the manufacturing server. It's running here, and we'll see the installer hit that endpoint so that device credentials are created for this device; in this case it's just this virtual machine, but the same applies to a physical device. And at the very end, hopefully not too fast to see, it just reboots into the installed system. We can also see the root file system growing step: I've given this device 100 gigs of storage, and the raw image is 2 gigs or so uncompressed.

So when we boot into the actual system... this failed. Of course, it happens. Yeah, I haven't dropped the crypt_root argument, so the system came up encrypted. Let me do this one more time; I should have dropped that. You can see this is the normal boot menu; we're going to drop that argument and start the installation again. But hopefully you were able to see that, at installation time, the device went through the manufacturing server; I left some space there, and that's what we're going to watch again. If you see logs there, that means the installation hit that endpoint. And it's rolling again.
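Under the hood, that flashing step boils down to streaming a decompressed image onto the target block device. Here's a toy reproduction of the idea: a plain file stands in for the disk, and gzip stands in for the xz compression the real image uses:

```shell
# Build a fake 4 MiB "raw image", compress it, then "flash" it by
# decompressing straight onto the target, which is effectively what the
# installer does with the real image and /dev/vda.
dd if=/dev/zero of=/tmp/disk.img bs=1M count=4 2>/dev/null
gzip -kf /tmp/disk.img                                  # produce disk.img.gz
gzip -dc /tmp/disk.img.gz | dd of=/tmp/target.img 2>/dev/null
cmp -s /tmp/disk.img /tmp/target.img && echo "images match"
```

Streaming the decompression into dd is also why the installer never needs scratch space for the full uncompressed image, which matters on devices with little storage.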
So let's give it a few more seconds to go through, and then we'll log into the system. I'm going to show the configuration we've baked into the raw image: I've asked osbuild to create this raw image with just a user named admin, with password admin, in group wheel.

Since this is still going, I think I can briefly explain the owner onboarding part of all of this. Again, go watch Patrick's presentation. For the sake of this demo, there also has to be an owner onboarding server, and the device uses those device credentials to authenticate against it. When that happens, the owner onboarding server tells the device to do something. In this case, for test purposes, we've said: if a device connects and you need to onboard it, create an SSH key named test-key for the admin user. Of course, this is just for the demo; in real life the owner onboarding server would provision actual SSH keys. That's something we do the very first time the system boots, and you can see on the right side of my screen that it's booting the installed RHEL 9 just fine. In the encryption case, at this point the client connects to the owner onboarding server and receives its configuration, like adding a user, adding an SSH key for a user, re-encrypting the disk, and then at some point the admin can decide to reboot and the disk is fully encrypted with a strong key stored in the TPM. As explained earlier, at installation time it's encrypted, but encrypted with a null key; that's just something we do in order to provide a way to re-encrypt later and mandate encryption, because that is what we really want.

So this is the installed system. You can see it's RHEL 9, an OSTree-based system.
Let me show the release file. We may not have noticed it because it was too fast, but after flashing the raw image to this virtual machine, the manufacturing process kicked in, it worked, the system rebooted, and you have a fully working RHEL for Edge 9 machine. Let me go back here.

So that's the demo. Fifty minutes is definitely not much to explain and demo everything, but I want to end the session with what's next. On our plate for RHEL for Edge: we want to make the system smaller and smaller by removing unneeded packages, or packages and tools we can run in containers instead. There is a gap we're filling around configuration. As I've explained, the owner onboarding server does some initial configuration, but we need a better way to do that, so that's another area we're exploring; I mentioned Ignition, because that could be a candidate, or just a way to configure the system. And then, what I haven't demoed and what we're also working on is a device management system, where you have a nice UI where you can see all your devices and update them. You can see, hey, this device is at a version with a CVE, so you really need to update it; you can see devices that perhaps died, that aren't working anymore and aren't pinging back to the central device management system.

And this is it. I left some links at the end; those are the main ones, and from there, there are more links to follow. Everything I've explained is covered in those. And I've just hit 50 minutes. Thank you, everybody.