Good morning. My name is Stephen Gallagher, as some of you probably know, and today I'm going to be talking about "How do we reset Fedora to factory defaults? Dealing with gremlins in the packaging guidelines."

I'd like to tell this as a bit of a story, a play in three acts. Act one: the rise of the RPM package manager.

Once upon a time, a group of brilliant designers, engineers, and innovators developed a tool called RPM. They built it as a system to do both the building and the delivery of software onto a system. It was a very ambitious project. It did a lot of things, and really, it was that hammer right there. And of course, when you have a hammer, how does that expression go again? Oh, yeah: everything looks like a nail. So as we progressed, RPM grew a lot of additional features. It became more and more complicated and, like I said, it pretty much ended up like that hammer.

Once RPM was developed, we started to build Red Hat Linux, and then later Fedora, using RPMs. It made sense. It did a lot of cool things. It had the ability to put just about anything on the system that you wanted. It had the ability to configure it when it finished installing (we'll get back to that). And generally, it did a lot that other packaging formats of the era really couldn't do. As it grew, and as we built Red Hat Linux and Fedora Linux, things started to get more complicated.
We started to add a few more packages. We started to build an entire distribution around this thing. A lot of people were getting involved, they were excited, and suddenly we had divergence starting to happen: because RPM is so powerful and can do things in so many different ways, people were doing things in many different ways. So we had to solve that problem.

So we built the Fedora Packaging Committee and set up a series of guidelines for how you can do things. One could even say rules. We can't expect everyone to do exactly the same good job, so let's examine the set of packages we've got, figure out which ones are doing things the right way, make them into a set of guidelines, and then publish those so people can ignore them and continue to do things their own way. Wait, no: so that we have a series of checks on these things. When new packages come in, we verify that they have been written according to these guidelines. And it was good.

For many, many years, Fedora has actually been built and released. This is a thing that happened. Contrary to all realistic expectation, we have actually built something that is good and that people like, and we want to maintain this level of excellence.

So what happened next? Next, we started running into a few of the places that RPM was not particularly designed for. For one: exponential expansion of the number of RPMs in Fedora. In Fedora, we have grown by leaps and bounds. As of this morning, there are 19,470 packages in the Fedora Project. I'm going to repeat that number, because it threw me back a little bit: there are 19,470 source packages in the Fedora Project today. Yeah, binaries I'm not even bothering to count, especially since half of them are TeX Live. And more of them are being reviewed every day. On average, for the last four Fedora releases,
we have added 500 new packages to the project. That's 500 new packages in the last two years, and amazingly enough, thanks to all those guidelines we built, most of them are high quality. The ones that aren't are, a lot of the time, ones that have essentially bit-rotted since they were reviewed. There have been other talks and other proposals on how to deal with that bit rot, but for the most part we've done pretty well, considering the fact that this curve has just continued to grow at a faster rate every year. It's kind of ridiculous, but it's still mostly working. So that's not terrible.

But then the next big thing happened. Oh, wait: now we have virtualization. And while we were adapting to virtualization, containers arrived, because we really hadn't completed that pivot into virtualization in Fedora. We were a little behind the curve on that; we hadn't entirely figured out how to play in that space. But the real big change there was that we are no longer thinking entirely of an operating system running on a single piece of bare iron. We're talking about density in data centers. Instead of the history where you would have a couple of really beefy machines that you would care for and feed, and make sure they had everything they wanted, now all of a sudden you've got virtualization, and you've got 300 disposable machines that are all doing the same task. If one of them malfunctions, you yank it out of the rack, or effectively the virtual rack, and you throw up another one. And containers increase this even further, because you may have dozens of containers running on any given virtual host.

Now we have an entirely different problem.
We have machines that we want to deploy fast. We have machines whose life cycle we don't really need to care so much about, because they're disposable: if you want to change something, you pull up another one and you shut this one down. What we really need to do there is focus on figuring out: okay, so how do we make reusable images? How do we create these virtual machines and these containers in such a way that we can rapidly throw a hundred of them out there all at once?

The way people would do this with virtual machines is, you know, they'd stand up a VM on a desktop in VMware or KVM or VirtualBox or what have you, and one person would go through and install absolutely everything they needed. They'd run their tests, effectively a private QA environment or a staging environment. Then they would, manually (or later, we added some tools to do this), go through and rip out anything that was specific to that machine: things like the host ID, various places that had UUIDs, various places that
had created SSL certificates automatically to talk between services. And they would have to go and rip these out, keeping notes on everything they did and writing scripts so that when they deployed the new machine, they could put all those things back in.

So what we were basically doing is passing all of that effort on to the users. We were giving them a whole operating system, but they didn't want a whole operating system. They wanted a template for an operating system, and pushing those things onto the user wasn't a particularly friendly thing to do.

One of the examples I like to use a lot, mostly because it comes up a lot (I think when Will did his research, it came up at least 16 times in the set of packages in RHEL, and I can only imagine how many packages it shows up in in Fedora), is the creation of SSL certificates in an RPM spec file. Let me repeat that: people are creating self-signed certificates in RPM %post.

That's not good, for a variety of reasons. For one, if you're trying to generate a gold image for something like OSTree, or just push out a qcow for a cloud image and things like that, you're installing the RPM once on a provisioning or compose machine, and then its results just get pushed out to every machine. Now suddenly you've got 5,000 machines in a public cloud somewhere that all have the exact same SSL certificate, one that doesn't match their host name. That's not going to work. Furthermore, it's additional effort. As any of you who went to Will's talk two days ago know, it's a whole bunch of additional file operations and fsyncs and things in the process that simply don't need to be there and slow everything down.

So how do we deal with the problem? How do we make those things go away?
First, we have to eradicate the scripts in RPM %post, and %postun, and wherever. And frankly, if I can be so bold, I think we actually should release a version of RPM that stops parsing scriptlets. We take that out of the system entirely.

What we can do, now that we have systemd, is create new systemd service units and mark them as required in order to start other known services. So I'll take the Apache example or the Cockpit example, where they need to generate an SSL certificate (though those are bad examples, because both of those have been fixed). Instead of generating an SSL certificate when you install the package, we drop a systemd snippet into the system, so that when the httpd service is activated, either by socket activation or service activation, it first invokes a systemd unit file that will check: hey, do those files already exist? If they do, I succeed and you get on with your life. If they do not, you go through a script and generate them for the first time.

This means that we can push off the creation of any of those things that are happening in RPM scriptlets to either the first boot, or the first boot after you run virt-sysprep. We remove those things from the disk, and we do a systemd unit file that does ConditionPathExists= with "not" (that is, "!") on something like /etc/myservice/ssl.cert. So we can generate those at first boot, or on subsequent restarts of the service if you ever have any reason to do that. And they're out of the compose process. They're in a space in the installed system where they can be constrained by something like SELinux, so we can do a lot more protection of the end user's system. Because one of the things that a lot of customers and
users of Fedora and RHEL are concerned about with RPM scriptlets is that, especially when they're running in Anaconda, they basically have no restrictions. They operate as root, and you have to trust that somebody has been keeping an eye on what has gone into those scriptlets. If we move those into the system with SELinux protections, we can ensure that even if somebody managed to slip a little something extra in there, SELinux is probably going to step in and say, "No, no, your video game does not have permission to read and write the Apache directory," and things like that. So we get more security out of this, we avoid issues with container and VM generation, and generally just improve the situation all around.

So what we need to do next is analyze all of the packages in Fedora and find out which ones need to be adapted to use this new approach. As I mentioned, Will has done a really good job of going through this on RHEL. I would like to believe that he is also planning to do this on Fedora, but if he's not, I would like to get his scripts and get to work on it as well. Yes, we should absolutely team up. Well, thank you. Will would very much prefer to team up because, as he phrases it, his liver would not survive it, and I can see how that could be the case.

We do expect, in the next few weeks to a couple of months, to have a pretty clear picture of what the affected packages will be, and I will probably go through the Fedora mass bug filing process, ideally submitting patches. So that should allow us to fix those fairly easily.

So, in summary: Keep your RPMs out of the light. Do not get your RPMs wet. Do not feed your RPMs after midnight. Please stop working on RPMs when you're tired. That one is only partially a joke: stop working on RPMs when you're tired.
That is how bit rot starts.

As a bit of an anecdote, I was going through a package I was working with the other day, and there were actually comments in the spec file where I had to go back through the Git history to find out why they were there, because one just said, "I removed this section because I realized that when I had written it, I was too drunk and too tired to have made any sense out of it." That comment remained, but not the context. And when I did go back and look at it, yes, this person was definitely too drunk and too tired to have written that.

But in all seriousness, the packaging guidelines have been written and updated to support this. If you have any packages that do any kind of system-specific initialization in a scriptlet, please review the document here, which describes exactly how to write that service unit I was talking about, in many cases in exhaustive detail. So it really should be an easy task. And if you can get to that before we get to you, I will have a gold star for you. I don't have any right now, and I don't actually expect to give any out, but I will go and buy them and mail one to you if you beat us to this.

I rattled through that a bit faster than I had intended, so I've got a full ten minutes for questions.

Okay, so the question is: is there an impact on startup time from having these new units, and are they removed after they've fired or not?
So first of all, no, they will not be removed after they fire, because the whole point is that we do need to be able to tolerate removal of the files that are affected: you may be in the process of generating an image, wipe them, and copy the image around, and then we want to make sure the units continue to run. The startup impact when using the systemd conditionals has not been measured at scale, but on a few packages it has been measured in nanoseconds of additional time, not even microseconds, because systemd's conditional checks are pretty efficient. And honestly, if that became an issue, that would be the optimization point if we needed one; it would be more valuable than removing the units. Further questions?

Okay, so the question is: do we have to tell people, while they're creating the gold masters, not to start these services before they create the master? And the answer is no. They can and should, because they're going to need to test that things work. What we have is tools like virt-sysprep, whose job it is to know which things are potentially system-specific and to wipe them. So you would run virt-sysprep, shut the machine down, and clone the image. We need to make sure that we coordinate with the virt-sysprep people, and if there are other tools like that out there (I mean, that's the best supported one), we need to have documentation that says: here is the list of known things that we want to be able to remove.
I had a conversation with Will yesterday about how we may want to find a way to put language in the RPM metadata that allows us to interrogate that, rather than having to maintain a separate list. That's an opportunity for enhancement here. Right now, virt-sysprep is pretty good. It knows an awful lot of these things, including a whole bunch that are in proprietary third-party code out there, you know, common customer applications and things like that. So we're in pretty good shape for that right now, and until we have that kind of introspection data, we'd probably keep maintaining the whitelist and have to volunteer as part of this process. Does that answer your question? Okay. You still look like you were about to ask another.

Right, so the question was, in terms of creating a gold master: what other settings could interfere? What if you wrote certain settings in your kickstart and whatnot? We probably can't solve that perfectly. The reality is you probably shouldn't be doing that in kickstart if you can help it. You shouldn't be doing anything in kickstart that you don't expect to be usable for all parts of your deployment. If it's system-specific, then if it's auto-generated, it should be done with this mechanism; if it's not auto-generated, it should probably be put in by configuration management software. We don't really need to solve that; that's a solved problem. So I would just say that our statement is: don't do that in kickstart. It's going to bite you. I think people have figured that one out on their own by now, for the most part.

Mike again. The question is: are we trying to get rid of RPM
scriptlets entirely, or just the system-specific stuff? The purpose of this specific talk is to get rid of the system-specific stuff. This is one step towards a future in which we can eliminate scriptlets entirely, because really, they are a technology that was powerful and useful when they were invented, and they have introduced so many problems that we need to find a migration strategy away from them. If that wasn't clear, the answer is yes: we should get rid of scriptlets entirely. We do not need to do that as a single action, but we need to migrate each one.

Again, Will's talk earlier in the week (if you haven't seen it, I suggest you watch the recording once it's available) talks a bit more about how to eliminate some of the others. He found six specific things that covered 99% of all scriptlet usage, and at least five of them would be really easy not to do in scriptlets. The last one, the miscellaneous category, is a bit more effort. Actually, I don't think that would be that hard. I think all of those are possible to pull out of scriptlets relatively easily and relatively soon: in the next few Fedora releases, not, like, the next few epochs.

Will is reminding me to make it clear that things like caches and catalogs are complicated and will probably require a great deal of thought, and I completely agree. That's part of why it's not part of this talk. But no, I am firmly in the camp that RPM scriptlets need to go the way of, let's say, Linux? Let's say Unix. Sure. Let's say init.d. We'll go with that.

Are other distributions trying to get rid of their RPM scriptlets? I don't have that information at my fingertips. Debian, at least, seems to be still pretty bent on maintaining debconf, for whatever reason. I can't speak to the other distributions.
I suspect that the CoreOSes and the other minimized, container-derived distros are probably looking at this as well. As a counterpoint, Gentoo's entire package management system is a collection of scripts, I am pretty sure. Honestly, what the other distributions are doing is less interesting to me than whether or not they are coming up with clever ways to do the things I'm suggesting. I will keep my eyes open, but from what little I've seen of this, Mandrake, for example, has basically just copied this page into their guidelines. So I think the other distributions are looking to us to solve this, and will probably adopt it once we've proven that it doesn't break everything. Which is pretty much par for the Fedora course, actually.

Got three more minutes. So at the rate we're currently going, I have time for one more question, or else I can let you go to lunch three minutes early. Thank you all for participating.
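
The first-start initialization unit described in the talk (a one-shot systemd unit that the real service requires, guarded by a "file does not exist" condition) can be sketched roughly as follows. This is an illustration only, not the text of the Fedora packaging guidelines: the unit names myservice.service and myservice-init.service, the certificate path, and the helper script generate-cert are all hypothetical stand-ins.

```ini
# myservice-init.service (hypothetical name)
# Runs only while the certificate is missing; otherwise the condition
# fails, the unit is skipped, and startup carries on.
[Unit]
Description=Generate a self-signed certificate for myservice if none exists
ConditionPathExists=!/etc/myservice/ssl.cert

[Service]
Type=oneshot
RemainAfterExit=no
# Hypothetical helper script shipped by the package; it creates the
# certificate that the condition above checks for.
ExecStart=/usr/libexec/myservice/generate-cert

# Excerpt from myservice.service (hypothetical name): pull in the init
# unit and order it first, so the certificate exists before the daemon
# starts.
[Unit]
Requires=myservice-init.service
After=myservice-init.service
```

Because the init unit is never removed and only checks for the file, deleting /etc/myservice/ssl.cert from a gold image (for example with a tool such as virt-sysprep) is enough to have each clone generate its own certificate the next time the service starts. A skipped condition does not count as a failure, so the Requires= dependency is still satisfied on later boots.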