Okay, welcome, everybody. Let's start. I put up this picture. It's actually not advertising; otherwise I would have made it much more to the point. This is our legacy system. These devices are, I don't know exactly, I think six or seven years old. They are ARM5. This particular one has one gigabyte of RAM and 64 megabytes of flash, so they're rather limited devices. We have a kernel that was never changed since these devices were produced, so it's probably not a surprise that they have security issues, and nowadays we want to put them on the internet 24/7. So this is definitely a problem for us. We had to come up with a solution to provide a kernel quickly and, to some extent, also think about how we could upgrade in a safe way.

One more thing: as you can see, this is a fuse box like the one you have at home, and these devices are not easily accessible. As an end user it's not easy to swap one device for another, and if one of the devices fails, we cannot just ask end users to do this themselves. So this is an illustration of a physical barrier that we have, and some of the decisions that we present are based on that: we don't want to lose a device.

Okay, so this is prior work. I just realized that there are a lot of talks about software upgrading as well, and I think there will be other talks concerned with it, but this is basically how I think software updates have been done, and it's also a little bit based on real events. The first approach: you have an image, and the image contains a kernel and an init RAM disk, and at the end you have the actual rootfs. You download it from within your application image and then you just kexec into the update. Within the update you don't need your actual rootfs anymore.
It's unmounted; you are just running from RAM. The initrd then takes the payload, the new image, and flashes it over the old one. This is quite an elegant way to do it, but of course you run the risk that if you lose power, your old rootfs is gone, and because your RAM disk is gone as well, you have basically bricked your device. So there is a fallback mechanism: you either push a button, or in case there is no rootfs available, it will automatically look for one on a USB stick.

Second solution. This one is what we actually use to this day: rolling updates. It's what you know from regular desktop distributions like Debian testing or unstable, where you never really reinstall, you just keep going. It has the advantage that you don't have to erase the rootfs and write it from scratch. You only ever switch small parts of it. If you compare it to a car, it's like you change the oil, change the tires, change the fuel pump. There is a problem if you go on for a long time and you ever create some kind of legacy, like when you move the location of files. You have to carry this along with you all the time, because you never know where your end users really start off. This also means there are a lot of upgrade paths that you have to consider, and in fact you probably cannot ever test them all. It gets worse if you do more fundamental, underlying changes. This is very hard to do if you go with package-based updates. Take opkg, for instance. It's probably not meant to be used the way we use it, and we had lots of trouble with it. So dpkg might be the better option, but to switch from opkg to dpkg in a rolling system, well, I wouldn't do that.

The next one is what I've seen elsewhere; maybe on Android it's done that way. You have a special U-Boot, you download your image into a spare partition, and then through some mechanism from Linux you tell U-Boot that there is a rootfs and that it should actually upgrade it. So this is a pretty elegant one.
The downside is that you're dependent on U-Boot. You might be able to do this with boot scripts, if you're familiar with boot scripts, so that this is not actually part of U-Boot itself but sits outside of it, but it might be that in the end you have to change U-Boot anyway. One more thing: yes, it has happened that the rootfs you downloaded into the partition gets corrupted. It's a rather extreme case, but nowadays you have flash translation layers, and it might be that when you write the rootfs a few blocks get reshuffled, and they might corrupt your rootfs. The chance might be remote, but it's not under your control. We actually had a device that was returned to us, and it came back with the kernel corrupted. And nobody is ever touching the kernel; as I said, we are very conservative in that. So either some user corrupted the kernel, which I don't think, or it's really a flash translation layer issue that happened. So if you lose your rootfs, then you are left with only U-Boot and you have no way to get a new upgrade. Maybe you can do a recovery with a USB stick. Well, in that particular picture that I showed you, I can't insert a USB stick either; you would have to have a special form factor, it's not a standard one.

So what is the danger when you actually update? For us it's power failure. If you lose power in the process of writing and you don't complete it, but you have already destroyed the old content, that's the dangerous part. We cannot control this. It's something we are aware of, but even if the user takes care, it might happen at some time. For instance, where I live they never announce service windows. Maybe they announce them for hot water, but they never announce them for phone or for electricity.
So if they have to do it, they just do it, and that might simply happen from the outside. And of course the risk is bigger if your update takes longer. A good example is U-Boot: U-Boot is very small and it takes only a split second to write it, so the chance that you actually break your device through a U-Boot upgrade is, I would think, rather small. Although people are typically very afraid of it, it's probably something that almost never fails. On the other hand, if your rootfs contains megabytes or tens of megabytes, then it takes seconds, so the risk is just higher, because it increases linearly with the writing time. The same holds if you do a whole lot of updates: if you update your U-Boot on every system upgrade, well, then of course it might happen as well.

So what we present here is not revolutionary: we take an initramfs and we use kexec on top of it. Why kexec? We come to this later. But we want to have an init RAM disk because we don't want to work on U-Boot. We don't want to have our modifications in there; we don't want to modify the underlying U-Boot, for several reasons. One of them is that if we want to change something, we need to change U-Boot as well, which is always a single point of failure. The second thing is that we have different platforms and none of them has the same U-Boot, and it would be silly of us to believe that whenever we get a new platform we will have the chance to rebase U-Boot so that they are all in sync, or to change it at all. So we rather try to abstract from U-Boot and add a layer on top, so we don't have to work on it. The next point is the logic to perform upgrades.
Next we come to our platforms, and some of them are very limited when it comes to doing an upgrade, so the logic to do the upgrade is beyond the scope of what you'd like to do in U-Boot, and it would take a long time to implement there. So we rather go into a POSIX environment where we have all the tools that you know of. And the last thing: yes, it's safely upgradable itself. That will also be a topic later.

So these are the platforms that we have. The first one, about 10 years back, was ARM5 with 64 of RAM. The first one used to have an SD card, but that didn't turn out to be such a good decision: the SD cards started failing on us. So we basically assume that these devices only have the 256 megabytes of internal flash. Then there was the second try, which was an improvement on it: we went to a bigger internal flash. And the last one is the i.MX6, where you basically don't have any limit by today's standards; I mean, these are just PCs without a keyboard. The summary is that, because of the partitioning, we don't have a partition that we can download into in advance. We would have to repartition, so we had to come up with something else.

So we go with the smallest common denominator, which is the first platform. The first one has 256 megabytes. We are not able to download the rootfs to storage at all, so we have to download it on the fly and flash it on the fly. It's not even able to hold it in RAM, because even the RAM is too small. So we came up with this kind of pipeline, which is shell scripting on steroids, but it actually works. It proves that this concept of small modules, like at the very beginning of Unix, still works today, and I suppose it's rather easy to implement this in a short time. So I guess I don't have to go through this.
I think you're familiar with pipelines, and I think I can demo it, although the network just broke down on me before, so let's hope. If you want to issue a reflash request, we just place this kind of file onto the rootfs in a special folder. The folder is just a Unix path, /var/lib/rescue-system. So we just move the file down there and reboot.

I should keep talking, just so that it's not silent. Yeah, we're missing [unclear]; we turned it off in BusyBox. This is the boot script that you see, and now there should be the prompt to enter the rescue system. I interrupted the normal boot-up to get more output. So now we try the network, because that was the problem before. I don't know, it's the demo effect: waiting to get in, and I start to get nervous. Okay, okay.

So then I restart. This is the normal thing. If you get in, this would be the command to run, and if I enter it, it will mount the rootfs, and it says that it found the reflash request on the rootfs, it has seen the two links, and it will download these images. The first one was the signature, as you see, and now we start the first stream, where we do the verification. Why do we verify? Well, we haven't seen the image before. It might be that the server broke down or that the signature is not valid, so we cannot just start flashing immediately. Another issue is ubiupdatevol: you actually need to know the size of the image, because if you pipe into the burn command, ubiupdatevol, you already need to know the size of it.
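A minimal sketch of the two-run idea described here, written in Python rather than as the shell pipeline the project uses. It assumes a SHA-256 checksum stands in for the GPG signature check, and the names `first_run`, `second_run`, `reflash`, `open_stream` and `write` are hypothetical. The first run streams the image only to verify it and to count its size; the second run streams it again, writes it out, and checks it once more.

```python
import hashlib
import io

CHUNK = 64 * 1024  # read in small pieces; the image never has to fit in RAM


def first_run(stream, expected_sha256):
    """Stream the image once without storing it.

    Returns the image size if the checksum matches, otherwise None.
    The size is what the flashing tool needs to know up front.
    """
    digest = hashlib.sha256()
    size = 0
    while True:
        chunk = stream.read(CHUNK)
        if not chunk:
            break
        digest.update(chunk)
        size += len(chunk)
    return size if digest.hexdigest() == expected_sha256 else None


def second_run(stream, size, expected_sha256, write):
    """Stream the image again, write every chunk, and re-check it."""
    digest = hashlib.sha256()
    written = 0
    while True:
        chunk = stream.read(CHUNK)
        if not chunk:
            break
        digest.update(chunk)
        write(chunk)  # from here on the old rootfs is being overwritten
        written += len(chunk)
    return written == size and digest.hexdigest() == expected_sha256


def reflash(open_stream, expected_sha256, write):
    """Return 'rejected' (nothing written), 'flashed', or 'retry'."""
    size = first_run(open_stream(), expected_sha256)
    if size is None:
        return "rejected"  # bad image detected before anything was destroyed
    if second_run(open_stream(), size, expected_sha256, write):
        return "flashed"
    return "retry"  # image changed between runs; try again on next boot
```

The point of the split is that a bad image is caught in the first run, before the old rootfs has been touched; a mismatch in the second run can only be answered by trying again.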
So we do the word count in the pipe beforehand. And then, well, we see a progress bar, and it's probably doing something. So I think we leave it there, go on, and then check back later.

We get to the boot order. We decided to always boot into the rescue system first, and that was done for a simple reason: our devices are running for days and weeks, we don't turn them off, so they might run for a very long time. If you only ever enabled the rescue system when you do an upgrade, and not on every reboot, then it might not be well tested. So we actually force people to use it all the time. That means we force it on ourselves, on our testers, and even on customers, so if there is a problem, we will see it. It's merely a design decision. You could very well disable it by removing the boot script, and then U-Boot will mount the first partition, will not find the boot script, and will roll over to the second one. So you could enable it just whenever needed. But here we enable it all the time.

This also comes back later when we do an upgrade. If we want to upgrade the rescue system, we remove the boot script first, then we replace all the files, and once we have made sure that all the files are good and in place, we put it back. That way we can make a safe upgrade of the rescue system itself, if we later come and want to add more features, and we do want that. The device can always boot with just the rootfs itself, because it was used that way from the first day.

And this is the logic that you've seen. Yes, it was resizing, so it's coming up. So, what you've seen: when it comes up, it mounts the rootfs and checks the special folder for a reflash request. If there is one, it goes reflashing. If there is none, it will select the kernel.
This is the second way we use it: we use it as a bootloader itself, to do kernel upgrades without doing rootfs upgrades. And if there is no kernel present, well, then we have a default link that is built into the rescue system, and it will just reflash the rootfs from there. And if any of that fails, it will reboot, and the whole thing starts over and over again until eventually we fix the server and the link becomes valid. So that way we hope this will not be breakable.

So this is the upgrade path that I wanted to present. There is not much to say other than that U-Boot is able to start either way. If you currently have no rescue system, you can go into the rootfs and then fix the rescue system from the rootfs. If you don't have the rootfs, you can go into the rescue system, and the rescue system will download another rootfs. So it's enough to have just one. If I lost one, I still have the other one; the system is not lost. When you download the rootfs, the first thing the rootfs does is update the rescue system, so rescue system and rootfs are back at the same point, and you know that they both have the latest version that we ship. I probably will not demo that. I could, we still have time, but it might not be that interesting. Let's do it later if there is still time left.

Why kexec? As I told you, we had the problem that our kernel was too old, and we needed to update it as soon as possible. We're still on rolling updates, so we cannot do it in one step.
So we did an intermediate step We just shipped the new kernel using a package So for those interested we have packages for Yachto and we tried once to actually upstream it So if there is still interest even from from other people and maybe from Yachto itself, we'll try again to actually provide those so What it does it will mount the root of s and we go to a special folder called slash boot slash entries and Figure out if there are some kernels that and pick the one which the highest priority We did not want to use it as a standard in it already we actually do a switch or their people route and Keep the kernel that you have in the rescue system. We wanted to keep those like separate So it's like it's artificially you could still do it and maybe for some systems It makes sense But to the point now we try to keep this as separate as possible actually Shrink the kernel in the rescue even more and just make it dedicated to this single purpose Even though it is Linux and even though it could also run the root of s. 
It's just something we decided for now And Okay The the kernel upgrades that's again something we have taken from somewhere else This is we're taking this from the bootloader specification where they use dropping files to do kernel upgrades So that you Well, you don't want to have a big file where you just enter and modify it with set an arc You wanted this to be generated So there is such the folder you know what I just so The form is a very simplified to the original bootloader specification is much more complicated But we don't need that not all that flexibility So we just went with two simplifications first one priority is actually kernel version higher is better We always want the highest one the second one is what you need to know is where is the kernel and where is the DTB and both is checksum so If it in a drama fast it just lists that it checks if the file so there it compares to checksum If any of these things actually go wrong It will just roll over to the next one and hopefully there will always be a next one if there is none Then we reflash it. That's actually the same thing that already showed Okay, I Thought it will take much more time Well, maybe it's a good time to for somebody to ask questions or I still have 28 minutes So is anybody does anybody have a question? Yes, there is a question. Yes, that's that's the hard part like yeah Hard areas are easy if you know, yes, it doesn't know it doesn't that's easy the hard problems are like This intermediate thing. Oh, sorry. The question was like What are you doing if your root of s actually is it's a valid x3? You can actually mount it But when you try, you know, you hand it over to kx hack you start booting during the boot up process Some of the init scripts actually hang fail for some reason you cannot then SSH into it and fix the problem from remote Are you doing that? 
Yeah, so we have been thinking about this, and there is a roadmap. And yes, it might be good that when you select a kernel, or overall when you try to boot the rootfs, you touch a file in /var/lib/rescue-system saying that you already tried this once before, and then during the init process the rootfs should say, okay, this boot is confirmed, so that you are not trapped in a loop trying to boot the same thing forever and ever. The same goes for kernels, although this is a rather extreme case. You don't need this if you do rootfs upgrades with one kernel in it, because then you know the rootfs together with the kernel is good. It's hardly ever the case that you will add a new kernel without also changing the whole rootfs. This part is great if you do small changes: say you want to add a new feature, you add a new kernel for it, you make a little mistake, and then the kernel can't actually boot. And you did this remotely and you don't have access to the console. Then it would be great to record that you tried this boot configuration and it didn't boot, so you will not try this configuration a second time. Does this answer your question?

Yes, please. Okay, sorry, I didn't hear you. Okay. Yeah, that's the weak spot; I never talked about the rootfs and the configuration. That's actually the reason why we haven't switched to rootfs upgrades. Okay, the question here was: where do you store the device-specific configuration, like user configuration, all the important data? You have things like statistics. We collect a lot of statistics about how noisy your power line is, so that we know whether we have to switch to different modulation modes. So there's a lot of data, but I don't handle this in here.
I treat this as part of the rootfs. The rootfs has to make periodic backups, and it is the rootfs that actually triggers a reflash, so it is in full control. The fact is that there is this partition that we have, and we store it there, and then during the first boot-up you restore all that configuration. Okay.

Any other? Yes, back there. So, to repeat the question: what if something on the server has changed between verification and the actual burning, for instance somebody changed your rootfs? What would we do? Well, let me ask back: what would you do in that case? There is little you can do. I mean, if somebody takes over the server, what you definitely will not do is carry on; we'll abort flashing. We will have corrupted the rootfs, and for now it's a bricked device, but whenever you apply power it will continue to check whether the server now has a valid update or not. And if an attacker is able to actually fake the signature, well, then you have lost anyway.

Yes. Yes, the first run might download a completely different image, that's correct. Yes, and the second run gets another one. Yes, but we download the signature only once. Yes, that's right. But you see that GPG also runs in the second run, and we download the signature only once, and the same goes for the checksum. So the second run has to match the GPG signature and the SHA sum. So if something strange happens on the server, we will detect it and we invalidate the rootfs. That's the safest thing you can do, and then you just keep trying and trying. Eventually we will figure it out and fix the server.

Okay, any other question? Otherwise I'll continue with the unit tests, because most people think these are the least important thing.
They actually are important in this project. It's very annoying to develop otherwise, because you have to build the initramfs, you have to copy it down to the system, then you have to boot, and since it's only an initramfs, any change you make is lost on the next boot. So it is very time-consuming, and having good unit tests really speeds up the programming. I can show them to you. Where is my mouse? So, yeah, it's a lot of unit tests. I just want to show that we really take this seriously. We actually start by writing the test and then we start to add the functionality, because it's too cumbersome to develop on the target itself.

The way the testing works is this. Since it's only shell script and not really a C program (and even with a C program it might be difficult), we use a root tree that we mock up. The boot program itself does very little; it does everything through standard Unix commands underneath. The most important of them are wget and mount. If you're able to mock those up, then your actual boot logic can run on top of them and it will not notice. And what all of these mocked-up versions do is just emit the commands they were called with, including all the flags, and then you capture those and keep them in a safe place. That is what you expect a good run to look like. Every test that you've seen before does exactly this: it creates this kind of fake file system, goes into this folder, and then runs our boot system command that you've seen before. It produces a run. We take that run and compare it to the expected one, and if it's the same, then probably nothing has changed and you didn't break anything serious, and that holds most of the time.
Well, actually, I can't remember a case where we actually failed on it on the real system. What you still have to watch out for is that you're using BusyBox, so it might be that you need to enable special commands, and not all the flags are supported as you would expect on a desktop. But other than that, this was very convenient to do.

I could go on with that; I still have 20 minutes. Well, okay. [unclear] Here we go. So then I will just continue, and of course it should produce an error. There it is. And then I can go in and fix it. That's actually how development happens, and it's very convenient. You know it like the inside of your pocket: it's just shell scripting, that's what you do all the time. And to my mind it's more comfortable to do this than to extend U-Boot. You're much faster, too. It's just shell script. Is there any question on this?

No more questions? Yes. In the second run you destroy the rootfs. So if the signature doesn't match, it's better to find that out in the first run, because then you haven't destroyed the rootfs. And the minor reason is that something like ubiupdatevol needs to know the size of the image, so you need to download it to actually know how big it is, so that you can pass the size to ubiupdatevol. So I think the first run is very much needed. Also because you want to rule out taking an image that is corrupted, because that way someone could do exactly what you fear: they could destroy a device just by somehow tricking the server. Even if the image is wrong, they could take the service away from the user, which is an attack in itself. Okay?

So, a short break. Okay, so that was the demo. These are the resources. Currently there is not much documentation.
So it's probably mostly this demo, and we have a wiki about how to use it, and of course there is the repository that we have, and there are also Yocto recipes showing how we use it, how we wrap up the whole thing. You might also want to look at the rootfs recipes; there is one for the rescue system as well, with which you can generate those tools easily.

Okay, so this is what we'd like to have for next time. The first thing that we try to avoid is having two runs. The two runs are for the smallest system that we have, which doesn't even have enough RAM to keep the image in memory. Already the second one, which is UBI, is on the brink of being able to keep it in RAM. So we want to extend the pipeline so that it also keeps the image in a RAM disk, and if that still verifies at the end of the first run, we rather take that than do the second run. And later we can extend it so that if the rootfs already is big enough to hold the update, we take it from there, copy it into a RAM disk, and start from there, so there is no streaming necessary at all. So there is a dynamic rollover between the different platforms.

Then there is the handshake that has been asked about before. This is exactly for things that are only half broken, not fully broken. But it's also a question of how you want to define that something really worked. Is it that SSH is working, or that you have HTTP? What defines that it really works?

And yes, there is space optimization. I haven't shown it, but I think it's about two or three megabytes. The biggest part is actually libc, and then it's GPG. So three megabytes is the initramfs and the kernel. We didn't tune the kernel; the kernel is actually much bigger than the initramfs itself. And now we can go into the rescue system, then I can show the biggest issue there. Any more questions? Yeah. Let me just hit the [unclear]. No, that will not work. Okay. You have a question? What do you mean?
Okay. That's a very good point. Yes, you got me there. No, we don't, that's true. It's a good question. Are you giving a talk later on? No? Okay.

Yes, it's true. If you go back to this slide here, you see that we do the GPG check on what we take from wget. But consider that something goes wrong down here in ubiupdatevol. If it goes wrong with an error, that's perfect, because then we have the error code telling us that something went wrong. But what we don't do is read the data back, maybe cat it and cut it off so that you have the right size, and then feed that back to GPG to verify that what you wrote is what you wanted. No, we don't do that. Good catch. This comes back to the question of when we verify that the boot actually works. If it doesn't, we would drop back into the shell the next time and try again. Well, it makes sense to try again; there is very little else you can do. If you expand on it, at some point you end up with the halting problem, which is unsolvable.

Oh, checksums over the whole device. Yes, I know Tripwire. On embedded it will probably delay the boot process. Okay, you mean you have a list of good checksums for every binary you have on the rootfs, and you compare whether those really match? Hmm, and how would that be different from taking the raw device, reading back the raw device, and checksumming that? It's probably the same. Okay. We may expand on that later, but we are already going into development details. Is there anybody else who wants to ask a question? Otherwise I will finish with the roadmap.

Yeah, the roadmap. Yes, we still have patches for Yocto. Is anybody from Yocto in here? Nobody, okay. We tried to add multi-kernel support to Yocto, and it seems like in Yocto, well, not at the core, but maybe it's just an assumption.
There is only ever one kernel inside. So a lot of paths for the kernel source are actually hard-coded, and I had a hard time extending it to compile two kernels in Yocto. I have these patches and I tried to mainline them, but it seems like there is not really that much interest in having two kernels produced with Yocto, maybe because it's such a remote feature. Yes, please. Oh, you would be interested. Okay.

What else? Yeah, communication: at the moment it's a black box. If the GPG check failed or if the checksum failed, you probably want to give some kind of feedback to the rootfs, so that it can make a nice user interface out of it. We're working on that, but it could definitely be better. And now I still have 10 minutes. I thought there was something more, but...

Pardon? I'll get closer. Okay, you're asking about the unit tests. Yes, I looked at Rocket, which is using KVM and QEMU as backends for it. Yes, it's something nice to have, but how would you actually make a fake mount or a failed wget? At some point you have to force it into errors. It's nice to have something like KVM or QEMU; it's definitely better than this. Well, this is just chroot. It's even worse than that: we're not even using chroot, because we don't copy the whole desktop over. We just prepend to the path; we override the PATH so that our utilities are taken first and then the system utilities. So KVM would make it more real, closer to what you actually have. But I was thinking about how I could force it into all these error cases, like wget fails on the first try, fails on the second, fails on the third, so that we actually go through the whole thing. It doesn't help me with that.

One second, for those who didn't hear: the question was, have you ever planned to extend the unit tests so that you use a real QEMU or KVM, so that you really run in an emulation? And my answer you heard.
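The PATH-prepending approach to testing can be sketched like this. It is an illustration in Python under stated assumptions, not the project's test suite: `run_with_mocks`, the `MOCK_LOG` variable, the example URL and the device names are all hypothetical, and a POSIX `sh` is assumed to be available. Each mocked tool only records how it was called, and the recorded run is compared with a stored known-good run.

```python
import os
import subprocess
import tempfile

# A mock tool does nothing except append its name and arguments to a log.
MOCK_TEMPLATE = '#!/bin/sh\necho "{name} $*" >> "$MOCK_LOG"\n'


def run_with_mocks(script, tool_names):
    """Run a shell snippet with mocked tools first on PATH.

    Returns the list of recorded command lines, in call order.
    """
    with tempfile.TemporaryDirectory() as tmp:
        bindir = os.path.join(tmp, "bin")
        os.mkdir(bindir)
        log = os.path.join(tmp, "calls.log")
        for name in tool_names:
            path = os.path.join(bindir, name)
            with open(path, "w") as handle:
                handle.write(MOCK_TEMPLATE.format(name=name))
            os.chmod(path, 0o755)
        # Our utilities are found first, the system utilities afterwards.
        env = dict(os.environ)
        env["PATH"] = bindir + os.pathsep + env.get("PATH", "")
        env["MOCK_LOG"] = log
        subprocess.run(["sh", "-c", script], env=env, check=True)
        if not os.path.exists(log):
            return []
        with open(log) as handle:
            return handle.read().splitlines()


# Stand-in for the boot logic under test: it only calls standard tools.
BOOT_LOGIC = """
wget -O - http://example.invalid/rootfs.img
mount /dev/ubi0_0 /mnt
"""

# The stored known-good run that every later run is compared against.
GOLDEN_RUN = [
    "wget -O - http://example.invalid/rootfs.img",
    "mount /dev/ubi0_0 /mnt",
]
```

A test then reduces to `run_with_mocks(BOOT_LOGIC, ["wget", "mount"]) == GOLDEN_RUN`. Forcing error cases, such as a wget that fails on the first attempts, would need mocks that return a non-zero exit status, which this sketch does not cover.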
Okay, so your next question. Okay, yes. So the point is that I could also inject errors at the KVM layer, but I'm not sure how I could do that. It sounds to me like I would start to write kernel modules just to do the testing. So is there some support in QEMU to automate this, or is there already a mechanism in place that I can use? Yes, so you're familiar with it. So maybe we can have a chat later on. I was thinking about going to QEMU and KVM, but then I felt that it's a lot of effort, and in the end I didn't really see the benefit of doing it, because this is just shell script. I created a very, very crazy pipeline, but other than that it doesn't give me any advantage. So I rather stick with this, which is really simple and easy for everybody to understand, work with, and set up, because you don't have to set anything up. You just go in there, run the scripts, there you go.

Okay, any more questions? Yeah, over there. Okay, so can you repeat the question? Oh, do I also encrypt the rootfs? Ooh, I think I know where you're heading, towards key management. So the question was: do you also encrypt the rootfs? And no, we don't. The rootfs is unencrypted. Well, it is signed; the image is signed. Yes, well, if an attacker ever gets root, he can do what he wants anyway. But you mean that he can modify the rootfs so that he even survives a reboot. Well, no, we don't do it. I haven't done a real security assessment, and I'm not sure that this is the most probable attack. Because if you are able to make modifications to the rootfs, then as you've seen it's also very easy to modify the rescue system, so you can hide yourself inside the rescue system and do things from there.
So even if you verified it, the attacker can turn off the verification. How would you protect against that? You would have to start from the chip. Then you'd have a secure bootloader, and then you'd have to go into a secure rootfs, so that you can actually rule out that anybody who ever got onto the rootfs has modified anything along the path. Yeah, but that means, okay, every component needs to be signed, so that you can pass the trust from the first component to the second. Well, you have to start with the chip: when the chip comes up, it has to start in secure mode, and then U-Boot would have to be secure, and then the rescue system has to be secure. It's not something we expect, nor is it a goal to really lock it down that much. That would be a lot of effort, especially since we have several platforms; we'd have to do it on each of them individually. Yes, we actually do GPG upgrades.

Oh, a question. Okay, you mean on the fly, transparently? Okay, there was a remark. What was the name of it? Okay. It's not a question but a remark: there is an integrity measurement infrastructure in Linux which does it on the fly. So it seems that whenever it sees a file for the first time, it computes a checksum, and whenever that changes, it will raise a flag that something went wrong or something changed, for example if one of your executables changed. Okay, okay. So, advertising for your talk. When is your talk? Tomorrow at 14:00. So there is a talk about transparent security measurements, okay.

And I think we can close here, unless somebody else wants to ask something. Yes? The file system? That's probably an independent fight. There's no other reason than that we're used to it. What would you suggest? Not with ext3. But as I told you before, we had the problem with the corruption on the flash.
So we have had problems, but not with ext3; then again, we haven't been using that platform for so long. So that might be another topic, but with this system it's actually easy to swap file systems. As you've seen, the first platform still uses JFFS2, which, together with the limited RAM, is very bad. We should actually switch over to UBI, and with this we can do so. So if you have ext3 today, tomorrow we can have Reiser; well, not Reiser, but Btrfs probably. Or what would you suggest? Btrfs.

Any other question? No? I could go into our security. We have the key that is inside the rescue system. There is key rotation; you need to think about how to do a key rotation, and it's actually a tricky thing. If you go back to this kind of trust model, it's a dangerous thing to do, and we actually sidestep that. I don't have it with me, but we're using a secure token card. We plan to; we don't actually have it yet, but we are putting it in place, so that you sign with a secure card. That should avoid having too many key rotations. And you're still able to do a key rotation, because we can switch the rescue system.

Okay, so thanks for your attention.