I want to welcome you to my talk about RAUC: evolution, or maybe revolution, of an update framework. Short notes about me: my name is Enrico Jörns. I'm an embedded software architect and developer, and I'm the co-maintainer of the update framework RAUC that I'm going to talk about today. I work as an integrator and system engineer at Pengutronix. For those who don't know Pengutronix, it's a company providing embedded Linux consulting and support for customers from industry, automotive, and so on. We work very closely with mainline projects and the community, and have about 6,000 patches in the Linux kernel.

A short overview of what you're going to hear in the next minutes: a short introduction for those who aren't familiar with RAUC; then I'll talk about the initial bundle format, the update format of RAUC. And yes, it says "initial", because there's also a newer one: the verity bundle format. Based on this, I'll talk about the technologies that have been enabled by this new update format: bundle streaming, adaptive updates, and encryption of updates. At the end, if I don't run out of time, I'll give a short outlook on new features that are planned and a short look at what is happening around RAUC in the community.

So let's start with the overview. The big picture of an update system looks like this: we have our build server that builds our embedded Linux system, and from that embedded Linux system we generate our update artifact. This update artifact is then uploaded somewhere, to a deployment server, somewhere in the cloud. The individual devices in the field then fetch this update from the update server and install it. But there's not only this over-the-air update case; there's still the older, let's say conventional, approach of just using a USB stick, going to the device, and updating it. This is a use case that is still very common in some places.
So RAUC, the tool we're going to talk about today, mainly covers two things. First, it's the service that runs on the embedded Linux target and handles the fail-safe installation of updates. The second part of RAUC is the generation and signing of the update artifacts. This is typically the last step in your embedded Linux build system: when you build your Yocto image, the last thing is, for example, that you generate an update bundle, and RAUC supports you with this too.

A few key facts about RAUC: it's basically an embedded Linux update framework. It is written in C, using GLib as a utility library, OpenSSL for everything crypto-related, and libcurl for the network functionality. It's licensed under LGPL version 2.1, it's hosted on GitHub, and all the community interaction also happens on GitHub. What it basically provides is fail-safe, image-based, atomic updating of A/B systems, including the interaction with the bootloader, such as U-Boot or barebox. It also handles the cryptographic signing on the server side and the cryptographic verification of updates on the target side.

Two things are important to know when we talk about RAUC: the configuration files. There's one configuration file on the target, called the system configuration, here on the left. The system configuration provides basic information for the RAUC update service: which bootloader it has to interact with, and what the redundancy setup looks like, with a concept of update "slots" with individual update targets and classes. So you can say: I want to have an update for a root file system, and then RAUC decides which exact root file system slot should be updated. That's one part; the other part is on the right here: the configuration file in the update itself, called the update manifest.
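To make this concrete, here is a minimal sketch of both files. The device paths, slot names, and the compatible string are made-up examples, and the exact set of options depends on your board and RAUC version:

```ini
# system.conf on the target (sketch; paths and names are examples)
[system]
compatible=my-device
bootloader=uboot

[slot.rootfs.0]
device=/dev/mmcblk0p2
type=ext4
bootname=A

[slot.rootfs.1]
device=/dev/mmcblk0p3
type=ext4
bootname=B
```

```ini
# manifest.raucm inside the bundle (sketch)
[update]
compatible=my-device
version=2024.01

[image.rootfs]
filename=rootfs.ext4
```

The `compatible` string must match between system configuration and manifest, and the `rootfs` slot class in the manifest is what lets RAUC pick the concrete inactive slot at install time.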
The container archive used for updating in RAUC is what we call a bundle. It also carries some basic meta information, like a version, and, most importantly, a description of what the purpose of the images inside the update is. So this just gives a quick idea of what that looks like.

Now, the initial bundle format. First of all, why do we do verification and signing of the bundle at all, and why don't we just use authenticated channels? With authenticated channels, as you would have when using TLS to your server, you can be sure that your update is not modified during transport and that you're talking to the right server, but you cannot be sure what happens with your update artifact on the server, and you're not able to use a USB stick, for example. So we decided to use signed artifacts: you are free to put them on whatever unsafe storage location you like, a USB stick or whichever cloud service provider you might choose.

Bundle generation in RAUC is quite easy: you just call the rauc bundle command with some options. What it basically does is take an input directory with all the artifacts that should go into the bundle, plus the manifest we've just seen, create a SquashFS image of it, sign the entire SquashFS image, and append the signature to it. The signature is based on X.509 cryptography, we use OpenSSL for this, and the container format we use is CMS. And that's basically all; we also append the signature size to later be able to locate the signature. On the target side, when calling rauc install, we use the size field I just mentioned to locate the signature, and then we use the signature to verify the integrity of the entire SquashFS image. Once we've verified it, we can mount it and access the individual files.
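The append-and-locate scheme can be shown in a small sketch. The exact on-disk layout RAUC uses may differ in detail; this only illustrates the idea of a trailing signature plus a size field:

```python
import struct

def append_signature(payload: bytes, sig: bytes) -> bytes:
    # Sketch of the layout: payload (SquashFS) | CMS signature | 8-byte size.
    # The trailing size field lets the installer locate the signature later.
    return payload + sig + struct.pack(">Q", len(sig))

def split_signature(blob: bytes):
    # Read the trailing size field, then slice out signature and payload.
    (size,) = struct.unpack(">Q", blob[-8:])
    return blob[:-8 - size], blob[-8 - size:-8]
```

Because the size sits at a fixed position at the end of the file, the installer never has to scan the payload to find where the signature begins.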
This is a quite straightforward approach, and the benefit of using SquashFS over any archive format is that we can directly mount it and don't have to unpack anything. But a drawback is that we have to verify the entire SquashFS image before we install it, which can be a bit slow. Another issue came up when we were informed by a security research team, I think it was in August 2020, that we had a time-of-check/time-of-use (TOCTOU) vulnerability in RAUC. The short description: we first verify the bundle, then, in a second separate step, we mount it. In between, we closed the file descriptor of the bundle, and for mounting we invoked the mount command. So there was a window in which an attacker could replace the bundle file, and even after mounting, they were potentially able to modify the actual bundle content. This was disclosed in December 2020 and fixed in version 1.5.

We implemented two main mitigations. The first was to not close the file descriptor between the verification and the mounting of the bundle. The second was to ensure that we, as the RAUC service, have exclusive access to the bundle we're dealing with. But this showed the limitations of the old format, and together with these mitigations we also developed a new bundle format, called the verity format. This was more or less a turning point for the project, as it enabled a lot of new developments based on this format. So what we're going to talk about next is the verity format. A short background: almost all storage devices in Linux are abstracted as block devices, and file systems operate on top of block devices, with individual device drivers for the different kinds of storage below. And then there's the kernel device mapper.
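The first mitigation can be sketched in a few lines. By keeping one file descriptor open and addressing the file through /proc/self/fd, later steps operate on the verified inode even if the path is swapped out underneath. This is a Linux-only illustration with invented names, not RAUC's actual code:

```python
import os

def open_pinned(path: str):
    # Open the bundle once; all later operations should go through this
    # descriptor (or the /proc/self/fd path), never through `path` again,
    # so a file swapped in at `path` cannot affect what we verify and mount.
    fd = os.open(path, os.O_RDONLY)
    return fd, f"/proc/self/fd/{fd}"
```

Even if an attacker renames a different file over `path` after verification, reads via the pinned path still see the original inode.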
The kernel device mapper is an abstraction layer on top of a block device that, for the file system using it, again looks like a normal block device, but is able to manipulate data before accessing the underlying block device. You might know it from the logical volume manager, or from dm-crypt, which you probably, or hopefully, use on your laptops for full disk encryption. There are a lot of different device mapper targets available in the Linux kernel. The new format is called the verity format because we use the dm-verity device mapper target.

dm-verity is for integrity checking of read-only block images: we create a hash tree for verifying the block image. This works by splitting the image into blocks. We calculate the hash of each block and store it, and as soon as we have enough hashes to fill a block, we again create a hash over that block of hashes. We repeat this procedure, which is why it generates a tree, until only a single block remains, and the hash we create from it is what we call the root hash. This is then our source of trust for dm-verity. When reading from a dm-verity device, you read a specific block. dm-verity hashes that block and compares it to the hash it has stored. If that matches, it then has to verify the integrity of the block the hash is stored in, so it again hashes and compares. It does this all the way up to the top block, which it hashes and compares against the root hash that we've stored somewhere separately. If everything matches, we can read the block; if there is a mismatch, we get an I/O error. This is the basic concept of dm-verity.

The new verity-based format is again created with the rauc bundle command; the only difference is that the bundle manifest now contains format=verity. The creation process, though, is a bit different.
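The tree construction can be sketched like this. It is simplified: real dm-verity adds a superblock, a salt, and a fixed on-disk layout, all of which this toy version omits:

```python
import hashlib

def verity_root_hash(image: bytes, block_size: int = 4096) -> bytes:
    # Split the image into blocks, hash each block, then repeatedly hash
    # blocks of concatenated hashes until a single root hash remains.
    def pad(data: bytes) -> bytes:
        rem = len(data) % block_size
        return data + b"\x00" * (block_size - rem) if rem else data

    level = pad(image)
    while True:
        hashes = [hashlib.sha256(level[i:i + block_size]).digest()
                  for i in range(0, len(level), block_size)]
        if len(hashes) == 1:
            return hashes[0]  # the root hash, our source of trust
        level = pad(b"".join(hashes))
```

Flipping a single byte anywhere in the image changes some leaf hash and therefore propagates up to a different root hash, which is why storing only the root hash in the signed manifest is enough to authenticate the whole payload.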
We start again with a directory containing the files, then create a SquashFS image. Then we create our dm-verity hash tree and append it to the SquashFS image, take the root hash, and put it into a copy of the manifest taken from the input directory. What we then sign is only the manifest; CMS also allows embedding the signed data inline. So the command is the same, but installing the update is very different. The first thing is that we only verify the inline manifest, which is much quicker than having to verify the entire SquashFS. Then we can directly mount the SquashFS and install our data from it, because with dm-verity the verification of the payload happens on demand: when we read from the bundle, the dm-verity device mapper verifies the data as it is read. This gives us a fast initial verification, and it also prevents us from running into a time-of-check/time-of-use vulnerability again, as we did before.

A logical consequence of this was to implement bundle streaming. What is bundle streaming? Normally, when you install a bundle, you download it to a data partition first, so you need some extra storage the bundle fits into, and then, as a next step after downloading, you install the update to your device. The idea of streaming is to be able to write the update directly from the server to the target partition. Roughly, this is how it is implemented in RAUC. First of all, you don't want the RAUC service itself to do the downloading: it runs as root, and downloading something as root is not such a good idea. So the first thing we do is spawn an unprivileged helper process. This helper provides an interface that acts as a block device on one side, communicates with the server where the bundle is located on the other side, and translates accesses to the block device into HTTP range requests to the server.
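The translation the helper performs can be sketched as mapping a block-device read onto an HTTP Range header. The real helper is more involved, exposing an actual block device and using libcurl; this sketch, with invented names, only shows the offset-to-range mapping:

```python
def block_read_to_range(offset: int, length: int) -> str:
    # A read of `length` bytes at `offset` on the virtual block device
    # becomes a Range request for exactly those bytes (inclusive end).
    return f"bytes={offset}-{offset + length - 1}"

def fetch(remote: bytes, offset: int, length: int):
    # Simulate the helper: one block-device read becomes one range
    # request and yields exactly the requested slice of the bundle.
    header = block_read_to_range(offset, length)
    return header, remote[offset:offset + length]
```

Because every read is an independent range request, the bundle never has to be downloaded as a whole, and dm-verity can still verify each block the moment it arrives.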
This allows us to have random access to the bundle located on the server, and since this access goes through dm-verity, the random access to the remote bundle is also verified. From the installation side it's again very simple: you just call rauc install with the URL you want to install your bundle from. As we use libcurl, this supports basically everything libcurl supports: HTTP/1.1 and HTTP/2, basic authentication, HTTPS, optionally with client certificates, and custom HTTP headers. I don't want to go into details here; you can have a look at the slides.

The next consequence, once you're already able to download things from the internet, is that you want to save download bandwidth, because bandwidth is often quite limited or expensive. A conventional approach to this is delta updates. By delta updates, and you might have heard Stefano's talk earlier today, which had a similar motivation, we mean producing a binary or file-based delta between two specific versions. The drawback of this approach is that it only allows incremental installations: with a delta update you can only move from one specific version to another specific version. We found this a bit inflexible, since we want to be able to move between all possible versions. This is why we came up with a concept called adaptive updates. The main idea is that you take the original bundle and just add, or generate, some metadata that allows an optimized update from the bundle. Keep in mind that we have the ability to perform random access, so even if we pack additional content into the bundle, we don't have to download it unless we use it. In the bundle manifest this looks like a list in adaptive= for each image. This is a proposal of what the update supports; what is actually used from it depends on the RAUC service on the device.
Let's take RAUC 1.8 as an example. It's not released yet, but it will be the next version, and it supports only the block-hash-index adaptive method. So if we have two options in our list and block-hash-index is one of them, it takes the block hash index as its optimization method and everything is fine; and here, for the application file system, it doesn't know the adaptive mechanism, so it just performs a conventional update by copying the image. So we remain fully backward compatible for whatever future adaptive mechanisms we add.

This is the basic concept of how an adaptive block-hash-index update works. We split the update image into several chunks of equal size, create a hash describing the data of each chunk, and put the hashes sequentially into a file, so that in the end we have a file that unambiguously describes the data inside the update. You can do this for the update image itself, for the active partition you are currently running from, and for the inactive slot you want to update. An update with this adaptive method then works by first downloading only this hash index table. RAUC goes sequentially through the table and first searches the local hash indexes: if the hash is found there, it copies the data from the referenced location on the block device to the slot we want to update. Only if it's not available locally do we have to fetch it from the remote bundle, again via random access using the HTTP streaming capabilities. Since these are fixed-size chunks, this works best with data that is aligned to 4K blocks, for example an ext4 file system where all files start at 4K offsets. There it is a very good and quite simple optimization for the download process.
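The chunk-and-reuse logic described above can be sketched as follows. The chunk size, names, and plan format are illustrative; RAUC's actual implementation and on-disk index format differ:

```python
import hashlib

CHUNK = 4096  # fixed chunk size; works best with 4K-aligned filesystems

def build_index(image: bytes):
    # One hash per fixed-size chunk; the sequence of hashes
    # unambiguously describes the image content.
    return [hashlib.sha256(image[i:i + CHUNK]).digest()
            for i in range(0, len(image), CHUNK)]

def plan_update(target_index, local_indexes):
    # Decide per chunk: copy from a local slot if the hash is known there,
    # otherwise fetch it from the remote bundle via a range request.
    known = {}
    for source_name, index in local_indexes.items():
        for pos, h in enumerate(index):
            known.setdefault(h, (source_name, pos))
    return [("copy", *known[h]) if h in known else ("download", pos)
            for pos, h in enumerate(target_index)]
```

In the best case, when an update touches only a small part of the image, nearly every chunk resolves to a local copy and only the changed chunks are downloaded.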
Similar to this, and already part of the outlook because it's not yet implemented, is the rsync-checksum method. It basically does the same but uses per-file checksums. rsync, at least in patched versions, is capable of generating and using checksums of files. You can store these either in extended attributes or in a separate file, and then you can call the rsync command and do just the same as before: you take the active slot as a reference, you take the inactive slot, which is also the update target, as a reference, and you take the remote bundle as a further reference, and then you just perform an rsync update. Files that are available locally are taken from there first, and only the files that are not available locally, in either the active or the inactive partition, have to be fetched from the remote server. We could also combine this with the delta approach from before, which I said is not optimal on its own: for common versions we can add specific deltas to the bundle, and if the versions we're updating from and to match a delta in the bundle, the adaptive method can just use this delta instead of the original image.

The next feature based on the dm-verity bundle format is encryption. Encryption was often requested by our customers and by the community, but it was not easy to implement with the original bundle format. The motivation is quite clear: sometimes you have sensitive data in your updates, a sensitive application, or some sort of intellectual property that you want to protect. Normally the generation of the update artifacts happens on your build server or in CI, while the actual encryption, the handling of the different individual devices, and the deployment happen in a different entity. So the encryption process in RAUC is split into two parts.
The first part, which is what the CI or the build server emits, is the payload encryption; the second is the individual encryption for the individual recipients, based on whatever devices are out in the field. We use another kernel device mapper target for this, this time dm-crypt, the one you know from your laptop, and it's quite simple. For the generation we don't use dm-crypt itself; we just generate a random key, and then it's symmetric cryptography: we split the image into blocks of equal size, encrypt each block individually, and concatenate them again into an encrypted image. For decryption, dm-crypt comes into play: when you read a block from the device mapper, what happens, hidden by dm-crypt, is that it reads the corresponding block from the encrypted image, uses the symmetric key to decrypt it, and provides you the decrypted version. There's not much magic involved.

It's a two-step process. In the manifest, again the only difference is that there is format=crypt. What this does is create a SquashFS as before, but now it also encrypts the SquashFS, and the symmetric encryption key is stored inside the manifest, just as we already did with the root hash for the verity format. With this we have an encrypted payload, but the encryption key is still unprotected at this point. So the next step, when we want to encrypt for the individual recipients, is to call rauc encrypt, where you can have one recipient or ten thousand recipients. What it does is wrap the signed CMS we had before inside an encryption CMS: we basically encrypt the entire signed structure we had before, and with CMS we can then add hundreds or thousands of individual recipients. It's only limited by the fact that at some point your CMS structure grows quite large, but in theory you can have very many different recipients.
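The block-wise payload encryption can be illustrated with a toy sketch. A SHA-256-based XOR keystream stands in for the real cipher here purely for illustration; this is NOT secure and NOT what dm-crypt actually does internally, but it shows why encrypting each block independently preserves random access:

```python
import hashlib

BLOCK = 4096

def crypt_blocks(image: bytes, key: bytes) -> bytes:
    # Encrypt (or, being XOR-based, decrypt) each block independently,
    # deriving the keystream from the key and the block offset, so any
    # single block can be processed without touching its neighbours.
    out = []
    for i in range(0, len(image), BLOCK):
        part = image[i:i + BLOCK]
        stream = b""
        counter = 0
        while len(stream) < len(part):
            stream += hashlib.sha256(key + i.to_bytes(8, "big")
                                     + counter.to_bytes(8, "big")).digest()
            counter += 1
        out.append(bytes(a ^ b for a, b in zip(part, stream)))
    return b"".join(out)
```

Because each block's keystream depends only on the key and the block's offset, a streaming installer can decrypt block N the moment it arrives, without having seen blocks 0 through N-1.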
On the device side there's only one thing needed: in your system configuration file, the one on the target, you have to specify the key and the certificate required to decrypt the update. It can be a file, or it can be a PKCS#11 URI. And as we combine dm-verity with dm-crypt here, we get authenticated encryption of the bundle, and since everything works block-based, it's also compatible with streaming: you can stream an encrypted bundle straight to a partition. The typical use cases: the simple one is a shared encryption key, one to rule them all. This is basic security, I would say: if the key is compromised, it's compromised for all devices out there, but you don't have to secure keys individually. A somewhat better approach is to use group keys: you form different groups of devices and assign each group a different key, so if one key is compromised, only that group of devices is compromised. The ideal case is per-device keys. Then it's possible to revoke keys per device, but if you have individual keys, you also have to take care of securing them individually, by putting them into a Trusted Platform Module, an HSM, a trusted execution environment, or whatever.

Now, as promised, a short outlook on future things that are planned or in progress, and on what is happening in the community. We have some metadata in our update manifest: the compatible is mandatory, but the rest is optional, so you can add a description, a version, a build ID, or something like this. For some use cases this is insufficient, for example when you want to make high-level decisions about whether to use an update, or about what its purpose is. So there was a request for the ability to add additional metadata.
One of the future versions of RAUC will therefore support metadata sections that are not interpreted by RAUC in any way. They are just forwarded via the D-Bus API or via rauc info, and allow the vendor to put additional information in, evaluate it, and do whatever they want with it. Another thing is installation history and event logging. What we have so far in RAUC is a status file, here on the left side, which gives some basic information about what was installed last on the device: what kind of bundle, when it was installed, when it was activated. But there was a request for more, and configurable, logging of what happens inside RAUC, and also for a history of all installations over the lifetime of the update service, or of the device. This is also something that will come in the next release, or the one after, I guess.

Another feature is lifecycle handling. Currently the scope of RAUC is mainly just the current installation: it is good at installing things, and after a reboot you can mark your just-installed system as good, but there's no high-level perspective tracing which update we installed and which update we marked successful after a reboot. It's not a big issue, but it's nice if you have a deployment server that really wants to know whether the update it just installed is the one now running. The idea is to have what we would call lifecycle handling: we generate an ID when starting the installation and then use this ID, persisted to a file or wherever, to trace the entire installation procedure across the reboot of the system, so that we can then emit, via D-Bus for example, a "successful installation" signal upon the next boot. What should also become possible with this is adding a confirmation capability, so that a user has to approve an update or something like this, again via the D-Bus interface.
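Conceptually, the planned lifecycle handling could look something like this sketch. The class, file location, and state names are all made up for illustration; the actual design was still being worked out at the time of the talk:

```python
import json
import uuid

class Lifecycle:
    # Hypothetical sketch: persist a transaction ID across the reboot so
    # the running system can be matched to the installation that produced it.
    def __init__(self, state_file):
        self.state_file = state_file  # a pathlib.Path surviving the reboot

    def begin(self, version: str) -> str:
        # Called when the installation starts; the ID traces the procedure.
        tid = str(uuid.uuid4())
        self.state_file.write_text(json.dumps(
            {"id": tid, "version": version, "state": "awaiting-reboot"}))
        return tid

    def confirm(self, running_version: str) -> dict:
        # Called on the next boot: match the running system against the
        # recorded installation and emit the result (e.g. over D-Bus).
        record = json.loads(self.state_file.read_text())
        record["state"] = ("good" if record["version"] == running_version
                           else "mismatch")
        self.state_file.write_text(json.dumps(record))
        return record
```

A deployment server could then correlate the ID it saw at install time with the confirmation emitted after reboot, which is exactly the gap the feature is meant to close.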
And there are some more things to do: having multiple signers for the update artifact, or M-of-N signatures. This is well supported by both OpenSSL and the CMS structure, but it still has to be implemented, either by the community or project-driven. We've had a long history of requests about application or container updates: "I want to update my application without having to update the root file system each time." We have developed an initial concept for this; it would exceed the scope of this presentation, but for those who are interested, it is available in a GitHub issue. Other nice things planned are streaming upload from the browser, so that you can click in your browser, upload a bundle, and RAUC interacts directly with the browser to download and install it without intermediate storage required, or a simple deployment server. There are things like hawkBit out there, but these are quite complicated; for most use cases a simple deployment server would be sufficient. This is something we would also like to add, not in the RAUC core, but in the RAUC organization.

Speaking of which, for those who want to integrate with hawkBit, which is one of the very few open source deployment servers available, there's the rauc-hawkbit-updater, a project that was initially developed by Prevas and moved into the RAUC organization in 2020. We did a lot of refactoring, fixing, and cleanup, and added new features; the current release is 1.2, and it's a quite handy way of interacting with the hawkBit deployment server via its REST API on one side and with RAUC via its D-Bus API on the other side. Another thing I can highly recommend is the meta-rauc-community layer, which was initiated and is still maintained by Leon Anavi, thanks to him if he can hear us. It's basically a Yocto/BitBake layer collection with example integrations for different platforms.
For example, we have an example integration for a QEMU-based platform, and also for Raspberry Pi, Sunxi, and Tegra. It's just an example of how to use RAUC, but it's a very good starting point; we have also built tutorials on top of the QEMU platform, because there you don't need any hardware to get started. So feel free to use it; the link is at the bottom left of the slides, and contributions for new platforms or extensions of the current support are highly welcome.

And a look into the community: for an open source project it's always hard to know where your software is actually used. I have to say we were quite a bit proud to hear that RAUC is the update service used on Valve's Steam Deck. There it's used together with desync, which, for those who know casync, is the Go variant of casync, the content chunking tool from the systemd universe that RAUC also has support for. Collabora, who did the integration of RAUC and desync for the Steam Deck, also contributed these patches back; I think they are merged now, or finally being merged, and will be in the next release. Other projects that we know use RAUC are the Home Assistant Operating System, a Buildroot-based system where RAUC handles the updates, and the Oniro distributed platform project, which uses RAUC as part of its SysOTA update mechanism. I wonder why I am on time, because I never managed to make it in 40 minutes before. So thank you for your attention; I think we still have time for questions, and for those who can't ask anything right now, there's also the RAUC IRC or Matrix channel, where we are happy to discuss with you. Thank you.

All clear, fine? Or too tired? One problem with delta updates is the original image.
It's hard to know whether the original image you're updating from is broken. Normally, when you do an update, you're sure that the new thing you wrote is good, but with a delta update you rely on the original still being okay and not having been overwritten. Here, because you're actually checking the hashes of the original, you're still sure that what you write is 100% okay. So I just want to point out that this is a very good way of making sure delta updates remain reliable.

Yes, the hash data I was talking about is generated on the server side, so we can assume that the bundle on the server side is okay, and what is written on the target side can still be understood as a full image update. That's what you mentioned, yes. Okay.

I maybe have too many questions. One thing I noticed with your system, it looks really cool, but the bundle is encrypted or has dm-verity, right? But once you install it on the target, it's not encrypted and you don't have dm-verity. You would then have to encrypt the partition, or have a verity blob or whatever, on top, inside the bundle.

So the bundle is encrypted, and it uses the dm-verity implementation of the target.

But once you install it on the flash, it's no longer encrypted.

No, it's not the intention to keep it encrypted; that's a different problem to solve. If you want authenticated boot or an encrypted partition, that is a problem orthogonal to what we do during transportation. This is only for the transport part. You can still use dm-verity on your target: then you just have a file system, or a block image with the hash tree appended, and you install it as a normal image; this is transparent to RAUC. And the same applies when using encryption: it then depends on which type of encryption you use, whether it's done at the file system layer or the block layer.
Okay, all right, thanks.

With the delta updates being calculated at the block level, so beneath mksquashfs and all the compression: practically speaking, what savings do you see? Typically, if just one file changes by one byte, with all the compression going on it's going to be more than one byte that changes. So what are the real numbers you see in practice when few files change, and how does that affect the amount of data the delta update has to transfer?

We have not tested this massively. We have tested it in some cases where we changed only parts of the file system. It's not as ideal as, for example, casync is with its content-defined chunking, but if you tune your SquashFS and your ext4 file system and so on correctly, then the saving is quite notable, I would say, not to say massive.

Thank you. Yeah, we're done. Thanks a lot.