Welcome back to this session. This time I'll be talking about transactional updates with Btrfs. My name is Ignaz, I work for SUSE as a research engineer, and I'm the maintainer of a tool called transactional-update, which implements basically what I'll be talking about now. So, why do we need a tool or a mechanism like that at all? I mean, no distribution would claim that its update mechanism is unreliable. But in reality we see that things can still go wrong. Why? Because packages may fail during installation. Just take the example of an RPM package with a post-install script: the script fails, and you're in a situation where you don't know what state your system is currently in, because the package has modified something, but you don't know how far it got or what it actually changed. That leads to people being afraid of distribution updates. We often hear "we don't want to update, keep the system running as it is, because things may break, and we don't want that." That's especially true if you're thinking about a desktop system: how do you update your complete desktop environment live, in the currently running session? You're updating the very system you're currently using, while you're using it. That's also not an ideal way of doing updates. And if a package provides a service, you'll often get restarts of that specific service, which may not be what you want on a production system, because then the package controls when your services get restarted.
And of course, if you're doing that in a maintenance window, you have quite long downtimes until the update has finally finished all of its operations, which may also not be ideal. If you go one step further and think about atomic or transactional systems, and we've seen a few of them during this conference already, then things are even worse, because you want to automate as much as possible. You don't want long downtimes, and automation really starts getting difficult if you can't know what state your system is currently in: you don't know how to get out of a mess once the mess has started. One more point I wrote down: on embedded systems, you can't really afford to break the user's system. If you break it, then you really have a problem. So, that's why transactional updates exist. Transactional updates are not a concept we invented; across various distributions, a transactional update can be defined as follows. It is an update that is atomic, meaning it is either fully applied or not applied at all, and it must not influence the currently running system. And the second point: there has to be an option to roll it back. If you realize it didn't do what it was expected to do, you have to be able to roll it back and restore your previous state. This slide contains a lot of logos, because those are basically all the systems that implement something like that, and most of them have some things in common. First of all, they have those transactional updates I'll be talking about, but you can see some more similarities: most of them share a read-only root file system, so you can't actually write to the root file system.
Most of them are intended to be used in cloud or image-based environments, or in embedded systems. They usually consist of a minimal base system, some of them do automatic updates and reboots, and some also provide integrity protection. Those are the basic concepts they have in common, but the implementations differ completely between distributions. What I'll show now is the way we in openSUSE have implemented it, because that's one way of doing it, and it uses Btrfs as one of the main components. I guess most of you know SUSE is one of those distributions using Btrfs by default. One question for the audience: who knows how openSUSE works, who is using SUSE or openSUSE systems? Okay, only one of you. So let me give a brief introduction to how these systems work. For package management we have zypper; that's basically apt-get, or DNF in Fedora, and it does all the packaging work. We have Snapper, a snapshot management tool. Snapshots are one of the really cool features of Btrfs, and Snapper automates them: whenever you update a package, you also get a snapshot before and after that update, so you can just roll back if anything goes wrong. I'll go into detail in a few slides. And of course, one of the most important ingredients of this variant of transactional updates is a file system that is able to do snapshots; in our case that's Btrfs. In theory the same mechanism could be adapted to other file systems, but there are a few caveats if you really wanted to do that. So, how does that actually look if you install an openSUSE distribution and just accept all the default values during installation?
You get a system installed, and whenever you update, as I said, Snapper will create a pre-snapshot, then do the update within the currently running system, and then create a post-snapshot, so you can roll back if you managed to destroy the system while it was running. One of the problems, as I said, is that all of those operations are done in the currently running system. If anything goes wrong, you're stuck; you have to roll back to one of those snapshots, and dealing with how to get back from those snapshots is usually not what you want. If you look at the definition of what we wanted to achieve with transactional updates, we do have the option to roll back if anything goes wrong, but the atomicity we wanted is not given with this mechanism. (That slide is not supposed to be in here.) With transactional updates, we change that behavior. We still have our currently running system, which is the blue snapshot you can see here, and we simply create another snapshot next to it, based on the running system. The update itself is performed in that new snapshot; the currently running system won't even notice that an update is being performed somewhere else. That has a huge advantage: all your system services just continue to run as they are, and the update runs in the background. If anything goes wrong, the new snapshot is simply deleted in the background again; the system stays where it is and everything's fine. That's the basic concept. And what happens if the update actually is successful? Then the new snapshot is set as the new default Btrfs snapshot.
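The flow can be sketched with a toy simulation. To be clear, this is not the real implementation: transactional-update works with Btrfs subvolume snapshots and `btrfs subvolume set-default`, while the sketch below stands in with plain directory copies and a symlink so it runs anywhere. All paths, file contents, and the success check are purely illustrative.

```shell
#!/bin/sh
# Toy model of the transactional-update flow.
# Real system: btrfs snapshots; here: directory copies and a "default" symlink.
set -e

# The currently running system, snapshot 1, is the current default.
mkdir -p snapshots/1
echo "pkg 1.0" > snapshots/1/state
ln -sfn snapshots/1 default

# Step 1: branch a new snapshot off the running system.
cp -R snapshots/1 snapshots/2

# Step 2: perform the "update" only inside the new snapshot.
echo "pkg 2.0" > snapshots/2/state

# Step 3: on success, make the new snapshot the default for the next boot;
# on failure, delete it, and the running system never notices anything.
if grep -q "2.0" snapshots/2/state; then
    ln -sfn snapshots/2 default
else
    rm -rf snapshots/2
fi

cat snapshots/1/state   # the running system is untouched: still "pkg 1.0"
cat default/state       # what the next boot will see: "pkg 2.0"
```

On the real system the three steps are roughly `btrfs subvolume snapshot`, a zypper run confined to the new snapshot, and `btrfs subvolume set-default`, all wrapped by the transactional-update script.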
So whenever your system reboots, it boots into that new snapshot and uses the updated system. That reboot is one of the basic elements of atomic systems, because you can't guarantee atomicity in a live, running system. That's also one of the things the other atomic systems have in common: the reboot part. We also have a helper tool called health-checker. If the system has actually rebooted into the new snapshot and anything goes wrong there, it tries to detect that. For example, you can check for services that are not running as you want them to be, or for files that should be there; you can just plug in a script that checks whether everything is running as it's supposed to. If it isn't, it sets the default snapshot back to the previous one and reboots the system again, and you're back in your previous working snapshot without any manual intervention. That's the whole idea of it. We also have a GRUB 2 snippet; it would be better if this were solved by the bootloader itself, but there we have to do it manually. sd-boot from systemd would be able to do that automatically, but we're not using it. If the system doesn't even get past the initrd, we detect that too and do the rollback from GRUB 2 itself. So, if you want a quick look at how that actually works, I'll be using MicroOS. It's part of the Kubic project, which does all this on the openSUSE side. You'll see that it has a read-only root file system, but even with a read-only root file system you have to be able to write to a few directories: mainly /var, because it obviously contains variable data, and /etc, because you probably don't want to use the default configuration forever and want to change a few bits occasionally.
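A health-checker plugin is, at its core, just a script that exits non-zero when something is wrong. The sketch below is a hypothetical plugin in that spirit, not code from the actual project: the `SYSTEMCTL` override and the unit names are my own illustration, added so the check can be exercised without a running systemd.

```shell
#!/bin/sh
# Hypothetical health-checker-style check: succeed only if every listed
# systemd unit is active. A non-zero exit after the reboot is what would
# trigger the rollback to the previous snapshot.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

check_units() {
    for unit in "$@"; do
        if ! "$SYSTEMCTL" is-active --quiet "$unit"; then
            echo "health check failed: $unit is not running" >&2
            return 1
        fi
    done
    return 0
}
```

You would call something like `check_units sshd.service crond.service` from the plugin's entry point; anything scriptable works the same way, including checks for files that must exist.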
I don't have too much time, so I decided to record a few things and play them as a video. What you can see here (yes, it's working, perfect) is a default MicroOS system that just booted up. I just increased the font size; I hope you can read it, even from the back. Is it visible somehow? A few people nodding, perfect. We can see our zypper command. As I said, it's the default packaging command, and I've just listed the default repositories. You can see those are the default Tumbleweed repositories; Tumbleweed is the rolling-release distribution of openSUSE. We can use zypper search to search for some packages. But you can't use zypper anymore as soon as the actual file system would be modified: in this example I'm trying to install something, and that obviously breaks. We get a warning that this is a transactional system and that we have to use transactional-update to modify it. We can also confirm that by just touching a random file, which simply fails with the message "Read-only file system". So let's disassemble that system a bit and see how a system with this Btrfs file system layout actually works. I just typed findmnt here, findmnt filtered for btrfs and overlay, and we can see a lot of subvolumes. (Can I highlight this somehow? No, I can't.) First of all, at the top here, we see that the root file system is read-only, as intended. We have several more subvolumes here, which are all mounted read-write and contain the variable data we need. And we can see that there is a snapshots directory, visible in the Btrfs output here, which contains the actual snapshots. Someone apparently decided to modify the system quite a few times: we already have ten snapshots, and the snapshots directory contains the previous system snapshots.
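For reference, the inspection commands from this part of the demo. These exist on any Btrfs-plus-Snapper system, though the exact output will of course differ:

```
findmnt -t btrfs,overlay        # list the subvolumes and the /etc overlay mount
btrfs subvolume list /          # raw list of subvolumes, including /.snapshots
btrfs subvolume get-default /   # the snapshot the system will boot into next
snapper list                    # readable history of the snapshots
```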
Typing snapper list, I get a list of all the changes, and we can see a bit of what actually happened. We could also read that by interpreting the Btrfs data directly, but Snapper just gives a better, more readable overview of it. And we can see, over here again, that we got some snapshots. It seems something went wrong with snapshot number seven, because it was never used as a base; someone seems to have rolled it back. We also have snapshot number eight twice, and we are currently in snapshot number ten, marked by that star. Talking about the default snapshot: if you look at a conventional root file system, you usually have / as the root. That's not the case in our setup, because the root file system is basically one of those snapshots, just to make that clear. Now let's install a package. I'll try to install tcpdump again, and we see some output from transactional-update itself, followed by the regular zypper output. So we don't have a special dedicated mechanism like most other distributions, which have their own packaging formats; we just use the default RPM packages that are part of the openSUSE Tumbleweed distribution. (I'm sorry, I guess I shouldn't step on that.) Now it's installing; the network was quite slow on that day. And it's finished. Now we try to execute tcpdump and notice we can't: it's not there. Why? Of course, because the installation was done in the new snapshot, which is not visible yet. If we call snapper list again, we can see that a new snapshot was created, which will be the default on the next boot. We'll just simulate a reboot now, fast-forwarding it, and we can see tcpdump is working now. (It's a virtual machine, so we don't get any traffic here.) transactional-update has lots of options.
Basically, it replaces all the writable parts of the zypper command: we have dup, we have up, we have all those package commands, we can roll back as we want, and there are a few special commands that also require modification of the root file system. Now, the interesting question is: can we debug something, and what if I have more complex things to do than just installing a basic package? For that we have the transactional-update shell command, and I've combined it with an installation command again. In this case we'll get iotop installed, and the execution will then stop in a transactional-update shell. We'll see that in a minute. Now we're in a transactional-update shell, and we can use iotop immediately. Why? Because the shell just chrooted into that new snapshot. So you as an administrator, for example, are able to check that the update did everything it was supposed to do. Now imagine you have more complex update procedures: you would just open a transactional-update shell and do the update in there. You can check all the things you have to do, and only when you're finished do you exit the shell, and only then do you reboot and get all the updates applied at once, by simply rebooting, without any downtime in between. (It's starting to flash; we can still see something.) Now the interesting thing: what if I install a package that doesn't work? It's not always as obvious as here, where it's literally called "breakme"; usually things start to break more subtly, like that flashing thing here. In that case the post-script, or even the pre-script, of that package failed. We just get the message that zypper was not able to install it successfully, with exit code 9, which here means a manual abort; if a script has failed you would get exit code 108 or something like that.
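Collected in one place, the subcommands just mentioned look roughly like this. This list is reconstructed from the transactional-update man page as I remember it, so verify it against `transactional-update --help` on your version:

```
transactional-update up              # update installed packages (zypper up)
transactional-update dup             # distribution upgrade (zypper dup)
transactional-update pkg install X   # install package X into the new snapshot
transactional-update pkg remove X    # remove package X in the new snapshot
transactional-update shell           # open a shell inside the new snapshot
transactional-update rollback [N]    # make snapshot N (or the previous one) the default
transactional-update cleanup         # drop snapshots that are no longer needed
transactional-update grub.cfg        # regenerate the bootloader configuration
transactional-update initrd          # rebuild the initrd
```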
If a script has failed, in any case the snapshot is just removed again and the system is exactly as before; we won't get a new snapshot. If we look at our list of snapshots, nothing new appeared here. So, that's just a cheat sheet in case you want to see all the commands at once. And that's basically how such a transactional system works. Seems easy, right? I know some of you are distribution developers, also developing atomic distributions, so yes, it's not as easy as it looks. Let's have a deeper look at what's actually happening in the background. First of all, we have that ominous /var handling. With the /var directory we have one problem: namely, it can't be part of the snapshot. If you snapshotted that directory, it would be static; assume you have your database in there, then all the data written between that snapshot and the next reboot would just be gone. That's not possible. You also usually store your Docker images in a separate subvolume, which is also not something you want to snapshot. So /var is a special case in that it can't be rolled back. It also mustn't be mounted into the chroot environment where you're updating the system. Why? Because then you would modify the current state of the system, which you also don't want. So how do you deal with that? The solution is pretty simple: you create systemd services (or SysV init scripts, if you're not using systemd) that update or migrate things on the next boot if something has to be modified. Imagine a database update that brings a new database schema: the database software itself is updated during the transactional-update run, but the migration of the data only runs during the next system boot. So yes, there's a slight inconsistency here, because not all of the data can be rolled back.
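Such a boot-time migration service could look roughly like the following unit. This is a hypothetical example, not something shipped by any package: `ConditionNeedsUpdate=` is a real systemd condition that is true on the first boot after /usr became newer than the stamp file in the given directory, which is exactly the "first boot after an offline update" case, but the service name and the `example-migrate-db` path are made up for illustration.

```
[Unit]
Description=Migrate the example database schema after an update
# True on the first boot after /usr changed, i.e. after a transactional update.
ConditionNeedsUpdate=/var
Before=example-db.service

[Service]
Type=oneshot
ExecStart=/usr/libexec/example-migrate-db

[Install]
WantedBy=multi-user.target
```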
You have to keep that production data on a separate partition and take care of it yourself, basically. Then we have the other case: /etc. Usually you as an administrator, or your configuration management software, have to be able to modify files in /etc. For that reason, /etc is mounted as an overlay file system in openSUSE. The data of the overlay file system is again stored in /var, which is another reason why we can't make snapshots of it. And those overlays are stacked: if you're updating your system all the time, you get several overlay directories in /var, named /var/lib/overlay/<snapshot number>. So assume your configuration management software changed something, but the update also changed something: then the newest overlay wins, in that case the overlay of the package update, because it gets the newer snapshot number. In case you also visited the /etc talk this morning: that's one of the reasons why we want a separation between package data in /etc and data the administrator actually changed. Then you wouldn't get conflicts in that case, or at least not so often. But usually it just works fine: the file in the newest overlay wins, and we're in a state that is consistent in itself. In my opinion that's better than, for example, the three-way merges that CoreOS is doing, because we always have a self-consistent state. But that can probably be debated. We also have a few other directories, namely /opt, which contains optional data that's not part of our root file system.
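The stacked /etc overlay described above shows up as an ordinary overlayfs mount whose layers live under /var. An fstab entry for it looks roughly like this; the snapshot numbers 72 and 73 are illustrative, so check `findmnt /etc` on a real MicroOS system for the exact form:

```
overlay  /etc  overlay  defaults,lowerdir=/sysroot/var/lib/overlay/72/etc:/sysroot/etc,upperdir=/sysroot/var/lib/overlay/73/etc,workdir=/sysroot/var/lib/overlay/73/work  0  0
```

The lowerdir chain is how the stacking works: older overlays and the snapshot's own /etc sit below, and the newest overlay is the writable upper layer, which is why its files win.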
If you install something in there, it's just installed immediately. We have /var/log, because you probably still want to log what happens during the update. And we have /boot/grub2, because on UEFI or BIOS systems you have to put the boot sector somewhere, and you can't roll that back once you've updated it. Just to be aware: apart from those few directories, nothing is mounted into that update environment, as I'd call it, including /srv, because /srv shouldn't contain any user data or any data installed by packages anyway. So you basically have your core system, which you're updating, and then you can build on top of that system, for example with Docker containers or whatever. But you have that core system, which will always reliably be able to update itself. Another tool we introduced is called rebootmgr. With rebootmgr we want as much automation as possible: if an update was successful, rebootmgr is triggered to reboot the system. Now imagine you have a whole cluster of different nodes. Usually you don't want all the nodes going down at the same time just because the update happened to be triggered at the same time. So rebootmgr has several strategies, for example synchronizing via etcd, to coordinate the reboots of the systems. And that's basically it. There are a few pitfalls, current limitations we're trying to solve. The next snapshot will always be branched off the currently running system. You can't go back and say: "I've just rolled back, but I still want to use that newer snapshot as a base." That's currently simply not possible.
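The rebootmgr strategies mentioned above are selected through its rebootmgrctl client. The commands below are from memory of the tool's documentation, so double-check them against `rebootmgrctl --help`; the window values are illustrative:

```
rebootmgrctl set-strategy etcd-lock            # reboot only while holding an etcd lock
rebootmgrctl set-strategy maintenance-window   # reboot only inside a configured window
rebootmgrctl set-window "03:30" "1h"           # e.g. a one-hour window starting at 03:30
rebootmgrctl reboot                            # request a reboot under the active strategy
rebootmgrctl status                            # show the current strategy and state
```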
And you should be aware, if you should ever be using such an openSUSE system, which I can really recommend, that transactional-update is the only mechanism to update a read-only root file system. You may not need that, but if you do, transactional-update is the only way to do it. (Why are there slides? That shouldn't be in here. Okay.) So, availability: as I said, you can use openSUSE Tumbleweed as a base, or openSUSE Leap 15 or 15.1. Those have a transactional server role; basically what you get is a conventional openSUSE system, but with a read-only root file system. We have openSUSE MicroOS and openSUSE Kubic; those are operating systems intended for container use, or, in the case of MicroOS, for one single purpose, so you can take the base and build on it whatever you want. On the commercial side we have the SUSE CaaS Platform, which used to use transactional-update. It currently doesn't, as they rebased it on a new stack, but it will be supporting transactional updates again. And we have SUSE Linux Enterprise Server 15.1, which is basically the same as openSUSE Leap: you just get a transactional server role there. If you're interested, all the documentation and all the links to the sources are on kubic.opensuse.org. And if you're interested in the technology in general and may want to adapt it to your distribution, you'll also find the necessary links there, or you can just talk to me, or ask in our IRC channel. And that should be it. Do you have any... yes, lots of questions? Thank you.

Several short questions. First of all: does this approach need support from the initramfs?

Not really.
We do have an initramfs module to get a list of files that would be conflicting, because, as I said, not everything is mounted into the update environment, and /etc may have conflicting files. So we compare those files whose timestamps indicate a conflict, and we just print warnings to the syslog. But apart from that, no, there's no requirement for an initrd component.

Okay. Next thing: you're talking about a GRUB 2 configuration. Do you have integration with other bootloaders?

No. What the bootloader has to do is be able to boot from the correct Btrfs subvolume, namely the default subvolume. If it's able to do that, which will probably be the default behavior anyway, then you're basically fine; everything just works out of the box.

But you use a subvolume for the root file system, right?

We are using subvolumes, yes, but we set the correct subvolume as the default Btrfs subvolume. So if you just mount the Btrfs file system with mount, for example, you get that subvolume anyway, and if the bootloader handles it in a similar way to GRUB, it will work out of the box. I have no idea about sd-boot, to be honest; I don't even know how its Btrfs integration works, or whether it would need something like that.

Okay. And the last question: what is the criterion for a successful update?

The criterion for a successful update is that zypper didn't return an error. In our case, that's before the reboot.

I mean, if you reboot into the updated system and something goes wrong?

Then we have health-checker, which tries to detect the cases where the system did not come up correctly.
For example, your database is not online. Or, now that I'm talking about databases, you can check the conversion scripts: if a database migration had to be done, you can add a check in health-checker and see whether the database was migrated correctly. If it wasn't, health-checker automatically rolls back to the previous state. If health-checker itself doesn't run, or if Btrfs itself has an error, then we have a problem: those are the two components that must not break. And to some extent the part of GRUB that is installed to the MBR, or to the boot sector, also must not break, because then we have nothing that can boot anymore.

As a developer, is there a way I can test my new packages without rebooting the machine? Say I want to test my database in the new container beforehand, because spending a couple of minutes rebooting is really tough.

If you're using containers anyway, you don't have to reboot at all: you just maintain your containers.

Sorry, when I said container, I meant the new system image snapshot.

You can use the transactional-update shell, because there you are basically inside your new system. But you won't have your /var subvolume mounted there, so if you want to test something that depends on /var, then no, you can't. If you don't, then you can use that environment for testing everything else you want.

Yeah, hi. We have an embedded system, and we have something like this, but not at the file system level. There we have a network API that has to be called from the outside, and if it's not called within a specified time, the system reboots. Do you have anything like that here?

Do you mean: you did the update already, then booted into the new system and realized the network isn't coming up anymore?
Yes, for some reason the outside can't reach you anymore.

So the question is whether you can do that with health-checker, and I would say yes. There is a command for checking whether the network is online already, nm-online from NetworkManager, or you could just have a loop checking for the network. If it doesn't come up, you return a non-zero exit code and health-checker rolls back the system. Basically, anything that is scriptable can be put in there. Or you could also write your own service, whatever you're using, and do the rollback manually; that is, call transactional-update rollback manually.

I have another question about health-checker. What dependencies does it have? Is it agnostic enough to be usable on another atomic system?

Yes, but don't look at the source code, please. There are some really crude hacks in there to get GRUB 2 to be able to do that rollback mechanism. That's why I was having a look at sd-boot from systemd, because it already has support for that and doesn't need those hacks. health-checker itself is just a shell script, or rather a collection of shell scripts: for GRUB, for systemd, and for the initrd. That's basically it.

A clarification: when you're rebooting and you have some migration to perform on /var, do you also create a snapshot there?

No. Everything that is done on /var has to be taken care of by the system administrator.

Why not? Because, let's say I have migrations to perform on two databases: one is successful, one fails. What happens then?
We're not creating a separate snapshot for /var so that you could roll back if something goes wrong at that phase. That's a good idea from that perspective, yes, I agree, but we're not doing it automatically, because I don't think we can detect whether we need such a snapshot: we don't know what is going to happen during that boot. We just know it is the first boot after a system update, but we don't know what will happen. I mean, we could just create a snapshot on boot, and if health-checker was successful, delete that snapshot again. That's a good idea, actually. Anybody else interested in that?

No, I have a different question, which is: the mechanism you show does a client-side compose. Every single machine is downloading packages, applying them, snapshotting, and all that. There is another approach where you do the compose somewhere else, remotely, and all the nodes just download a premade snapshot, let's say. Did you experiment with or try that?

That's basically what OSTree on the Fedora side is doing, but we don't like it at all, to be honest, because it's a completely new packaging system: you have to package all your stuff for OSTree. With our system we can just use the RPMs that are there anyway. So I think we're more flexible.

We don't have time anymore, so thank you, Ignaz. Thanks a lot.