Welcome to the FOSDEM 2017 Distributions Dev Room. We have Thorsten Kukuk here talking to us about transactional updates with Btrfs.

Okay, thank you. So, about two years ago, there was an article in a German newspaper. I think most of you don't understand German, so here is a rough translation: "The scoreboard did not work." And that was the reason why a basketball team was relegated to a lower division, due to a Windows update. Before the match could start, they had problems with the Windows PC that was driving the scoreboard and taking the time. So they decided to reboot the machine. And when the machine came up, all the pending Windows updates were applied. The game started about 20 minutes late. They won the game, but the officials later decided that they lost it: they got penalty points, and so they were relegated to a lower division. When I read this, it was clear to me: I never want to see a Linux system or a Linux update be the reason that a team is relegated, or that somebody else has problems like that.

So, why did I look at updates at all? What were the problems I was facing? Normally, if you have enterprise customers, they are always telling you that the most important thing is that the machine has to be up. An uptime of 99.99999% is still not good enough; the best is to never reboot the machine. But there are also other customers who say: if I do an update and my application is interrupted during the update, that is much worse for me than having to do a scheduled reboot at a fixed point in time. Imagine you have a cluster of web servers with a web shop running, your customer wants to press the buy button, and right then the web server restarts. That's bad for them. Android has something similar: if you update an application, an APK package, while you are currently using it, it's gone; it just disappears.
So, that's something customers don't want: that their own customers have problems. So the question came up: can we apply updates in a way that the servers are not interrupted, knowing that at some point in time they have to reboot the machine to activate the changes? And the update should be either fully applied or not applied at all. So if your update fails for some reason, say a post-install script doesn't work, there shouldn't be an undefined state of the system, like one RPM installed twice in two different versions. It should always be a defined, clear state.

When I got these requirements, I also got a request from our openSUSE folks: can we not use that for openSUSE Tumbleweed, too? What are the problems with rolling distributions? If you listened to Richard's presentation this morning: openSUSE spends a lot of time on testing with openQA and so on to make sure that the updates themselves are fine and working. If you follow the opensuse-factory list, you will see: if you log out, log in as root on the console, apply the patches, log out, and log in again as a user, everything is fine. But most of the time today, people click on a button and the update is applied. Now imagine you are using your desktop, and a new major version is released and updated at the same time. Most likely your running desktop will not survive the update and will crash, and you have a system in an undefined state. If you have luck, you can start the update a second time, it finishes, and everything works. If you have bad luck, you need to reinstall or do a rollback either way. So you need a way to apply intrusive updates without any impact on the user who is currently using the system. That's why we came up with transactional updates. The definition is more or less the same one that Debian, and now Ubuntu, use for their transactional update wording.
So, a transactional update is something that should be atomic: either it's fully applied or it's not applied at all. And the update should not influence your running system. If I install a new Apache or database, the currently running database should continue. Just think about the upgrade from Postgres 7 to 8: during this update the database is converted, and there is a long time where you cannot access your database. In some scenarios you don't want that at all. And of course, if something goes wrong, if you get a new kernel and the new kernel does not work with your old hardware or whatever, you want an easy way to roll back to the old state, without needing a rescue system or anything like that.

Now, I know there are already solutions. Some solutions work with different partitions, where you update one partition and then jump to the next one with the next boot. Others use special package formats; there is OSTree, for example. But all these solutions have one major drawback: you need a redesign of your system or your tools, especially if you have a new package format. If I look at our openSUSE infrastructure, or the SUSE infrastructure for the enterprise, or at our customers, they all know how RPMs work. They have their infrastructure to build RPMs, to verify RPMs. So I was looking for something where we can stay with what we already have. And in the end, it's quite easy: if you have a current openSUSE, it just works. All you need is Btrfs with snapshots and rollback enabled by default, snapper to manage the snapshots, zypper to update the system, and of course the btrfs utilities to make some modifications to the file system. If you don't want to use RPMs and these tools, you can also use other package managers by adjusting the script. It's really generic, and it works with everything that can do snapshots and rollback. But in this case, I tried to use what our distributions already provide.
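As a small aside (not part of the talk itself): a quick, hedged way to check whether a machine has the pieces just listed, namely snapper, zypper, the btrfs utilities, and a Btrfs root file system. The script only inspects the system; it changes nothing.

```shell
#!/bin/sh
# Illustrative prerequisite check for transactional updates.
# It only reports what is present; nothing is modified.

check_setup() {
    # The three tools the talk relies on.
    for tool in snapper zypper btrfs; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
    # The root file system type; it should be "btrfs" for
    # snapshots and rollback to work as described.
    findmnt -n -o FSTYPE / 2>/dev/null || echo "unknown root fs type"
}

check_setup
```

On a stock openSUSE installation with the default Btrfs layout, all three tools should report "found" and the last line should read "btrfs".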
Who of you knows how snapshots and rollback work, either with Btrfs or something similar? Okay, not many.

So, in the end, you always have one root file system. The file system should only contain your applications; it should not contain data. If it contains data and you do a rollback later because you broke something, the changes to the data are lost. Coming back to the web shop: if a customer enters a big order and you do a rollback, the order is gone from your database. That's why you should always separate applications and data if you use snapshots and rollback. That is independent of whether you use Btrfs, LVM, OSTree, or whatever; it's a general rule.

Every time you make a modification to your root file system, you first create a new snapshot. Over time, you have one root and a lot of snapshots. Of course, there are tools to clean up the snapshots you don't need anymore. So now you want to update your system: at first we create a new snapshot, and then we run zypper up in the current root, which means the difference between that snapshot and the current root is only what zypper up modified. It's not a one-to-one copy, so you don't waste a lot of disk space on all these copies; the difference is really only what was modified between the snapshots. That has a nice effect: if you want to know what your update changed, you can run a diff between two snapshots, and snapper diff will list you exactly the differences between them. So you can see what was really changed during the update.

But now, why always update the current root and keep all these snapshots around, which I hopefully never need, only for a rollback in the worst case? Why not do it the other way around? Instead of snapshot, snapshot, and so on until I reach my current root, I create a new snapshot of my current root file system. It's a read-only clone by default with Btrfs.
I need to change it to read-write; that's quite simple with the btrfs tools. Then I run zypper up in this fresh root environment: you can call zypper with the -R (--root) argument and give it the path to the snapshot, so only the data inside the snapshot will be updated. Then I change it back to read-only. And now I do a rollback, as I would normally do in a recovery case. Which means: I still have my current system, I have a snapshot with all the changes, and with the next reboot that snapshot is my new root file system. If something is not working, I can return to my old root file system. If everything is working, over time I can delete the old snapshots because I don't need them anymore. In the end, it's the same as doing snapshots and rollback, only that I change the order: I'm not modifying the current system, but the new snapshot.

How do we do the transactional updates in the script? It's quite simple. With snapper, I create a new snapshot and save the ID. With btrfs, I set it to read-write so that I can write into it. I call zypper and tell it where it can find the snapshot to update. I do a snapper rollback so that the new snapshot will be the active one. And at a good time, when it fits, I reboot. After the reboot I'm done: my patches are applied, and I still have the old snapshot.

So, this all sounds great. But every time you have something helpful, you also have drawbacks. There are some prerequisites you need to look at and take care of. These prerequisites are not only for snapper and Btrfs: if you look at the documentation for OSTree, for example, or for distributions using several partitions, they all have the same requirements for this to work, and they are more or less well documented. Something I learned from this: the best is really to have a read-only root file system, because then it's pretty clear what a snapshot is.
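The sequence the talk describes (create a snapshot, make it writable, update it with zypper, seal it, roll forward) can be sketched roughly as below. The tools are the real ones, but the exact flags, paths, and the placeholder snapshot ID are illustrative; the actual transactional-update script adds error handling and cleanup. DRY_RUN defaults to 1, so running this only prints the commands.

```shell
#!/bin/sh
# Sketch of the transactional-update flow (illustrative, not the real script).
: "${DRY_RUN:=1}"

run() {
    # Print instead of execute unless DRY_RUN=0.
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

transactional_update() {
    # 1. Create a new snapshot of the current root and remember its ID.
    if [ "$DRY_RUN" = 1 ]; then
        SNAP_ID=42   # placeholder ID; a real run gets it from snapper
        run snapper create --print-number -d "transactional update"
    else
        SNAP_ID=$(snapper create --print-number -d "transactional update")
    fi
    SNAP_DIR=/.snapshots/$SNAP_ID/snapshot

    # 2. The snapshot is read-only by default; make it writable.
    run btrfs property set "$SNAP_DIR" ro false

    # 3. Update only the snapshot; the running system is untouched.
    run zypper --root "$SNAP_DIR" up

    # 4. Seal the snapshot again.
    run btrfs property set "$SNAP_DIR" ro true

    # 5. Make it the default root for the next boot.
    run snapper rollback "$SNAP_ID"
    # 6. Reboot at a convenient time; until then the old root keeps running.
}

transactional_update
```

Afterwards, snapper status or snapper diff between the pre- and post-update snapshots shows exactly what the update changed.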
You cannot make modifications after the snapshot: if you have a read-write file system, create a snapshot, and then make modifications to your old root file system, then with the next reboot into the new snapshot all those changes are, of course, gone. So you would need to reapply them. This is the reason why you should either use a read-only root file system or do the reboot pretty quickly after you made the update.

The most important point is to strictly separate data from applications in your file systems or subvolumes. /srv is in this regard a real nightmare, because you have PHP, Tomcat, user data, databases, sometimes everything even in the same directory, and you either roll back everything or nothing. Which means: if your PHP or Tomcat update is broken and you do a rollback, your database, your data, is rolled back too. That's a real nightmare for people who want to use snapshots and rollback.

Also, you need to take care of what RPMs are doing in their pre- and post-install sections. New package formats specially designed for transactional updates or snapshots solve this quite simply: they don't allow pre- and post-install sections where you can modify something. If you continue to use your old RPMs, over time you should audit the pre- and post-install sections and make sure they don't do stupid things in this regard. So: don't modify data; most of the time you don't have access to it anyway. You should not convert the data during the update, but convert your database at the first reboot, or when the new version of the application is started, or something similar: when you are sure the new version is the one running and being used, and no older version is still accessing the database. And of course you should not create directories or files outside of your root file system or the default snapshot subvolume. Don't fiddle around with running processes.
If you try to restart yourself while you are updating a snapshot, then afterwards the old process will be started again, not the new one, because it's still the one in the default search path. And you should be able to cope with different data formats. This is a big problem, especially for desktops. If you are in a network and your home directory is shared via NFS, and on one machine you have an old version of your desktop and on a new machine a new version, and you switch between the two machines with the same NFS home directory: I'm pretty sure you have seen more than once that after you logged in on the new machine and went back to the old one, your desktop looks pretty broken, because the new desktop converted all the data at the first login, and the old version cannot cope with it anymore.

And then there are some directories you would normally exclude from snapshots, like /var/spool, because you don't want to lose your email, and /var/log, because it's always good to have the log files, including the old ones.

Is this already in use? Yes.
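As an aside on those excluded directories: on openSUSE they are typically separate Btrfs subvolumes, so root file system snapshots never include them. A rough illustration of such a layout (subvolume names follow the default openSUSE scheme; the UUID is a placeholder):

```text
# Illustrative /etc/fstab excerpt: each mount below is its own subvolume,
# so it is not part of a root snapshot and survives a rollback.
UUID=xxxx  /           btrfs  defaults            0 0
UUID=xxxx  /home       btrfs  subvol=@/home       0 0
UUID=xxxx  /srv        btrfs  subvol=@/srv        0 0
UUID=xxxx  /var/log    btrfs  subvol=@/var/log    0 0
UUID=xxxx  /var/spool  btrfs  subvol=@/var/spool  0 0
```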
We use it already in some of our products. The SUSE Container as a Service Platform is a new product we announced last year, and currently it's in development. Transactional updates are the only way to do updates on this product, because it uses a read-only root file system: a traditional zypper up will not work there, since the root is read-only. It will be a rolling release, so we will see big, intrusive updates, and we want to avoid the system breaking during an update. With transactional updates we found a way to update the system safely, being able to go forward and backward.

Then we have openSUSE Tumbleweed. If you looked, I think two days ago, at how many RPMs we updated in one night: I think it was 367 for the default installation alone, and I don't know what else was updated. It contained things like a systemd update and KDE 5 updates. Doing that in a running system, even if you tested everything, is a little bit tricky and critical. With transactional updates you can update the snapshot, your current system continues to run, and after the reboot everything is active. If it's not working, you can go back to the old state. The script itself has been part of Tumbleweed since some weeks now, but on some test systems it has been running for over three months without any problems. All Tumbleweed updates during the last three months were applied as transactional updates on those machines, and they rebooted automatically: if there was an update and the reboot window was reached, they rebooted themselves. And there was no problem, no error message, nothing; it worked. The code for it, the script, is on GitHub. And I'll now take questions.

First question: at the beginning of the talk you said that one of the advantages of transactional updates on the file system was that it would work with the existing package manager and existing software. A bit later you said that using the system puts constraints on the package manager, which should not
do certain things. How can you solve that?

So, if you look at Tumbleweed, for example, the default packages are fine; they are working. If you look at special packages, maybe you need to adjust them. But you can start already today; you don't need to convert 6,000 packages overnight to be able to start. You can start with the default set. I don't know how many packages are currently in the default Tumbleweed installation, but the default installation is fine, and over time more and more packages will be adjusted. And if a package makes trouble, it does not break your system: if there is an RPM doing bad things in its pre- or post-install section, the script will say, okay, something is not correct, the update did not work; it deletes the snapshot and does not reboot, and your system is still working. Then you can go and fix that package.

Okay, so you mean you can make the change in an incremental way, instead of changing your whole package set at the same time? Yes.

Here is my second question: how do Btrfs and its specific features interact with the POSIX API for the file system? I guess the transactional primitives are not POSIX, so how does that work? Do you need special calls, or do you need nothing?

No, you need nothing of that. Btrfs is a copy-on-write file system, which means: you have one file, you make a modification to the file, and Btrfs creates a second copy for you, while the old one is still there. If there is a reference from an old snapshot, the old file will not be deleted, and it remains visible in the old snapshot. If there is no reference anymore, the old file is deleted. So this all happens inside the file system, invisible to POSIX and the applications. If you want to do snapshots, you need special kernel ioctls to call them; of course these are not POSIX-conformant, because POSIX does not know about them. But all other file system operations are POSIX-conformant. So it's all normal file system operations, except for the ioctls: there is an ioctl with which you
can tell Btrfs that this subvolume should now be cloned, or that it should now be read-only or read-write, or that this subvolume should be the new root file system with the next reboot. These are the things you can set with these ioctls.

Richard, you were next. I just wanted to add to the question about changing packages to conform to this: coincidentally, the requirements you listed there pretty much identically match the packaging guidelines which openSUSE has had for a really long time. So in our case, if a package is not working with this, it would have broken our own guidelines anyway.

Yes, and these requirements are not only valid for transactional updates: if you really want to take advantage of snapshots and rollback, you should follow them too.

So, more questions? I think two minutes more. Yes: do you prevent the packages from actually messing with running processes, by using namespaces or something like that? If an update script attempted to restart a process, would it fail?
So, it depends. systemd, for example, always tries to connect and restart its services, and fails, because the sockets systemd uses for communication are not available in this environment. Otherwise, at least for the openSUSE RPMs: if an RPM uses the macro to restart a daemon, the macro detects that it is in a situation where it should not restart the daemon and does not even try. It really depends on how this restart is implemented in the post-install section whether it succeeds or not. There are some scripts that look at the process list with ps, kill everything that has the same name as themselves, and start anew; that is something you cannot prevent from the transactional update script. But if a package uses our standard openSUSE macros for restarting a daemon, it works and will not influence the running processes, because the macro itself has had support for this for I don't know how many years. Does that answer your question?

Okay, more questions. I think we have half a minute. Yes?

So, I cannot say much about containers, but you have an operating system running the containers, and for this you need some kind of update mechanism. The Container as a Service Platform is exactly such an operating system running containers, and you need to update the core system; for this we introduced transactional updates. As I said, I cannot say much about the containers themselves; that's not my area.

Yes, Wenz? This is always an interesting question: can I combine it with live patching? You can only combine it with live patching if you can do live patching of all running processes and libraries. Live patching is there to avoid a reboot; transactional updates require a reboot. So it's conflicting, it doesn't fit together, it doesn't make much sense.

So, I think time is over already. Then I want to thank you for coming.