So give him a big hand and welcome him in for his talk.

Okay, hi. So today, can you actually hear me? So today I'm going to speak about systemd. We always prefer interactive talks over talks where I just talk and you listen, so you're very welcome and encouraged to interrupt me if you have questions, and I would much prefer it if we could steer this talk in the direction you want it to go, instead of me just picking things I can talk about, because I can talk about this for hours and I would always find it interesting; what matters is that you find it interesting.

So let's jump right in. This is the first part of the description of what systemd is, which I copied from the systemd website. It's a long paragraph, and we'll try to parse what it actually says. The first part is easy to understand: systemd is a system and session manager for Linux. I think everybody has a bit of an idea what that could mean: a system manager is something that manages the system, a session manager is something that manages login sessions. It is compatible with SysV and LSB init scripts. I guess everybody has seen SysV and LSB init scripts; at least if you have ever administered a Linux system, you have probably come into contact with them. It provides aggressive parallelization capabilities; that basically means it's very good at parallelizing the boot. It uses socket and D-Bus activation. What that means is not so easy to understand without a little background, and we'll hopefully discuss later what precisely it means here. It offers on-demand starting of daemons. Some of you might have an idea what that could mean; it's actually pretty simple: we start daemons at exactly the moment we need them, instead of already at boot. It keeps track of processes using Linux cgroups. If you know what a Linux cgroup is, you might understand that too; if you don't, let's hope we can cover that later in the talk. It supports snapshotting and restoring of system state. That's an interesting feature, and if you have been doing databases or something similar, you have probably seen similar ideas, where you can save the system state and return to it later. It maintains mount and automount points. Mount points, I think everybody knows what those are; an automount point is something similar, but the actual mounting is delayed to the moment it is accessed. It implements an elaborate transactional, dependency-based service control logic. That's a half sentence that is really difficult to understand, I guess. "Transactional" most people have probably heard in the context of databases, "dependency-based" in the context of package managers, and "service control logic" basically means how to control a service. It's kind of difficult to parse, but I hope we'll shed some light on it later on. And it can work as a drop-in replacement for sysvinit, sysvinit being the classical System V implementation of init, and init being the first process that is started by the kernel when the machine comes up.

So this is the really short description of what systemd is that we put up on the website, and my talk in the following is basically just trying to figure out what that actually means and shed some light on all the details. Yeah, about the LSB init scripts?
Which distributions are they used in? Because, like, on Debian, or on Gentoo, are those the same?

Everybody basically uses them. There are very few distributions that do not use System V init scripts. I think Gentoo uses something that is related to System V init scripts but is not actually...

No, I didn't mean the System V ones, those have been there forever. I mean the LSB ones.

Oh, LSB and System V are basically used synonymously here. System V introduced the concept, and then LSB standardized it and extended it a little bit; it added a comment section at the top and defined what the exit codes should be. So I tend to use the terms synonymously, and most people probably should too. I think all distributions which have adopted System V init scripts have also adopted the LSB semantics for them.

All right, thanks.

Okay, so the first slide is about init(8). Init is the first process that is started when the system comes up. The kernel boots, initializes everything, and then forks off the first process. It has PID 1, and that is init. And sysvinit is one implementation of a process that can be run as process one. Process one is magic, magic in various different ways. For example, every process that runs on the system is a child of process one if it's not a child of something else; if the parent of a process dies, the process gets reassigned to process one. It has some other magic properties too. If PID 1 dies, the machine dies; the kernel oopses, end of story. If you press Ctrl-Alt-Del, the kernel will inform process number one. And there are a couple of other things where process number one is special and has special powers over all of the system. It's basically the place where the system, where user space, is maintained.

So let's jump right into the features that systemd offers you. One of the amazing things we did in systemd is provide parallelization in much more detail, and to a much higher degree, than any previous solution on Linux did. So what does that actually mean? We have these nice graphics here that hopefully help explain a little bit what parallelization means in the context of systemd and why it goes a couple of steps further than the previous solutions on Linux did.

So let's have a look at traditional System V or LSB booting. Up to Fedora 14, this is how Fedora booted, and with Fedora 15, since systemd is being adopted, this will all change. So we have Avahi and Bluetooth, Bluetooth being the BlueZ daemon, and both need the D-Bus daemon, so Avahi and Bluetooth both need to be started after D-Bus is. D-Bus itself uses syslog, so it needs to be started after syslog. Avahi and Bluetooth also use syslog, so they also need to run after syslog. So the effect is that on Fedora 14 the order in which these things are started is basically this one: syslog first, then D-Bus, then Avahi, then Bluetooth. Of course, there's no actual dependency between Avahi and Bluetooth, but since classic System V does not parallelize anything, everything is started linearly, and that basically means we just chose alphabetical order in this case and picked Avahi first and Bluetooth second. We could have done it the other way around too. So then people looked at that and noticed: oh my god, Avahi and Bluetooth, we start one after the other, that delays our boot. Wouldn't it be awesome if we could parallelize that?
And they did the first step towards parallelization, and they came up with this. SUSE implemented it this way, Ubuntu implemented it this way, and a couple of the smaller distributions too. What they did, basically: D-Bus still requires syslog, so we need to start D-Bus after syslog, and Avahi and Bluetooth still require both D-Bus and syslog, so they need to be started after those two services. But since there's no dependency between Avahi and Bluetooth, those two are started at the same time. It's an improvement; the overall time from the start of the boot, which is where the green mark up there is, to the point where everything is up is a little bit shorter, but there's not that much parallelization.

In systemd we go one step further: now we suddenly start everything in parallel. And that is kind of impressive, because how can that be? If Avahi and Bluetooth use D-Bus and syslog, how can we actually manage to start them all in parallel? That is one of the really interesting features we have. We actually didn't come up with it ourselves; it's something Apple came up with in something called launchd, which they ship as a core part of Mac OS X, and which is actually quite a great piece of engineering. What they basically did is look at what it really is that makes Avahi require D-Bus. Why exactly is it that we need to start D-Bus first and Avahi second? Why can't we start them in parallel? They looked at that and found that it basically comes down to the fact that the sockets Avahi uses to communicate with D-Bus are established by D-Bus. And then they thought: okay, if it's just about the sockets, just about the fact that the D-Bus process has bound the socket called /var/run/dbus/system_bus_socket, if the startup of Avahi really only needs to be delayed to that point, can we do something about it? Maybe, a, start Avahi at the very moment this binding is complete and not wait any further until D-Bus has finished all its remaining initialization; and, b, maybe even move the binding of the socket out of the D-Bus daemon and do it one step earlier.

And that is the idea that launchd uses. It basically looks at all the daemons on the system, rips the socket binding out of all of those daemons, and does it at the system level, in launchd itself, in one big step. So if you boot up a system with launchd, one of the first things it does is bind every single socket, every single communication socket there is on the system, be it the syslog socket, be it the D-Bus socket, be it anything else. And then it goes on and, in one big step, starts everything in parallel. This is really interesting, because suddenly you can parallelize everything: you just spawn processes in one tight loop, everything starts at once, it uses the maximum of the CPU available, and you get the quickest startup possible. It also has a couple of other advantages. Because all the sockets are already there when the first daemon code is executed, you get rid of any kind of explicit configuration of dependencies. Suddenly there's no need to tell the init system in any way that Avahi requires D-Bus, and you don't have to tell the init system that Avahi requires syslog, because what Avahi can do is just access the socket.
And if it's there, and it will be there because it got established early on by the init system itself, it can connect to it. It also has a couple of other advantages. For example, Bluetooth is not actually needed in many cases, because you're in flight mode and have your Bluetooth hardware disabled, or you don't even have Bluetooth, or whatever. So you can also use this kind of socket activation, which is the name this all goes by, to start services on demand. You basically just install all the sockets, and instead of then going on and also starting all the daemons at the same time, you just don't do it; you leave that out and don't start the daemons. But the moment somebody actually connects to a socket, you then start the actual daemon, and the client won't even notice. Because that is the really nice thing: as soon as all the sockets are established, all the clients can just go and connect to them, and the sockets will be there. If there's nothing behind a socket because the daemon wasn't started yet, the clients won't notice that; the init system will then start the service. It will take a little bit longer, but it's not visible to the client. Or if the providing service is still in the process of starting up, the client is delayed until that has finished, but again, it's not visible to the client. You have a question?

So the question was whether all the services need to be modified to make this work. The answer is yes and no. In the general case, yes, because basically what you need to do is look at the code that installs the sockets, which is basically calling the socket() system call, then the bind() system call, then the listen() system call, and you usually rip those three calls out. There's a very, very simple interface by which systemd and those daemons communicate: they just get the sockets passed, and they basically don't need to do anything; the code actually becomes much simpler if they do that. In that case you patch the daemon. We did that, for example, for rsyslog, and we did that for D-Bus.

However, in some cases you don't actually need to do any patching, and the reason is that this idea is not completely new on Linux, because there's inetd, which has been part of UNIX basically since time began, thirty years ago. inetd was one of those classic daemons that started a service the moment the first connection came in, and then spawned one instance per connection for a specific service, for example for sshd, or back then probably more telnetd than sshd. So you basically end up with one instance per connection. So this already exists, but there are two big differences between inetd and this kind of socket activation. The first one: inetd was almost always used in a way where you had one instance of the daemon per connection, whereas this is designed so that when the first connection comes in, you actually hand over the real socket, the listening socket, instead of the connection socket, so that all further connections are handled by the daemon itself; you have one daemon for all the connections instead of one daemon per connection. The other thing inetd did differently is that inetd was focused on only doing things on demand, while we not only do things on demand; we can do that too, but it's actually not that interesting, and we do it mostly to be able to parallelize things and start everything at once.

So, coming back to your question: some daemons you should patch. You need to patch all those where you want one daemon dealing with all the connections. But other daemons, like sshd for example, you don't need to patch, because they already support inetd-style socket activation, and that works with systemd too. Also, Mac OS X has been using the same scheme for ages, ages like, I don't know, since Mac OS X 10.2 or something like that. So a lot of the code is already out there. It's not directly compatible, you need to make minor modifications, but quite a bit of established UNIX software has support for launchd, and then it just takes a couple of changes to make it work with systemd as well.

So if you have code which is proprietary, you can't retrofit it, and it was never written for use with inetd, could you pre-wrap it with LD_PRELOAD or similar to make it work with systemd?

Well, we initially thought about, or actually we spent quite some time on, instead of doing this in user space and patching all the applications, having the kernel somehow hand over the sockets for us. Basically, we would create the sockets, we called them ghost sockets or something like that, and then, when the actual daemon starts up, it would just take those already existing sockets and work with them, so that the applications would not need to be patched. We investigated it in quite some detail. It's not impossible to do, but it's really, really complex, because sockets these days have a gazillion socket options, they can be members of multicast groups, and so on. So at the moment the daemon starts up, you would suddenly have to exchange one socket for the other, and because at the time the socket is created by the application you don't really know what kind of socket it's going to be, or which port it's going to bind to, you have the problem that at the moment of the bind() in the application you would need to find the right pre-created socket and then exchange the two, or copy over all the socket options. It's madness. It's doable, there's no doubt about that, but given that the daemons are really easy to patch, we went with the patching. It's more urgent anyway right now.

At this point in time, quite a few daemons are actually patched. As mentioned, D-Bus is patched, syslog is patched, Avahi is patched; Bluetooth is not patched but doesn't really need much patching. There's even Dovecot; even IMAP servers are patched nowadays. Of course, not everything is patched, there's still a lot of stuff to go, but since we announced systemd, quite a large amount of code has been patched.
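To give a rough idea of what that per-daemon patching looks like, here is a minimal sketch using the sd_listen_fds() helper from systemd's sd-daemon interface (nowadays part of libsystemd). This is an illustrative sketch only, not the actual rsyslog or D-Bus patch; the fallback branch and the handle_connection() placeholder are made up:

    /* Minimal sketch of a daemon adapted for socket activation. */
    #include <systemd/sd-daemon.h>   /* sd_listen_fds(), SD_LISTEN_FDS_START */
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
            int listen_fd;
            int n = sd_listen_fds(0);        /* how many sockets did the init system pass us? */

            if (n >= 1) {
                    /* Socket-activated: fd 3 is the already-bound listening socket. */
                    listen_fd = SD_LISTEN_FDS_START;
            } else {
                    /* Started the old way: create, bind and listen ourselves. */
                    listen_fd = socket(AF_UNIX, SOCK_STREAM, 0);
                    /* ... bind() and listen() as before ... */
            }

            for (;;) {
                    int conn = accept(listen_fd, NULL, NULL);
                    if (conn < 0)
                            continue;
                    /* handle_connection(conn);  daemon-specific work, omitted here */
                    close(conn);
            }
    }

The point is simply that the socket(), bind(), listen() sequence moves out of the daemon and into the init system; the daemon only has to accept connections on a file descriptor that is already listening.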
Do you support having a daemon sort of die if it's idle for a bit, or if it, you know, core dumps or something? Will systemd notice the death of syslog and then restart it on demand?

Yes. That's actually one really amazing feature of the socket activation, because syslog is now socket activated. So it's systemd which creates the socket and listens on it, and then eventually syslog starts up. Now if syslog crashes, for some weird reason, and syslog implementations nowadays tend to be really complex pieces of code, so they have every reason to crash, if it crashes, it will go away, and systemd will notice that. If rsyslog, or whatever syslog service has been configured, is set up for it, it will automatically be restarted, it gets the original socket again, and it will continue processing the incoming log messages from exactly the spot where it left off, minus, of course, the one message syslog was actually crashing on. So you suddenly have a really robust system where you can take syslog away, you can just kill it, and you can put it back in; nothing will notice, not a single message will be lost, and that makes for a really nice, robust system. Suddenly you can do restarting of daemons, you can do upgrading of daemons: you install, I don't know, your web server, then you upgrade your web server, you shut the old one down, you start the new one, and because the socket listening is actually done by systemd, and systemd will always retain a duplicate of the original socket, you can restart everything and it will just work. The user will not notice it, and there will not be a single moment where it is visible to the user that the daemon went away and came back.

This all works for internet sockets too, but the focus is definitely UNIX sockets, which is also a big difference from inetd, by the way, because classic inetd, as given already by the name, was focusing on internet sockets, not so much on AF_UNIX sockets. But for syslog, for example, and in the D-Bus case, the focus is clearly on AF_UNIX.

One of the nicest things about this is, as already mentioned, that all this dependency configuration goes away. In the classic parallelized setups, System V the way SUSE was using it, or the way Ubuntu uses it, you had to declare all the ordering of how things are executed. The nice thing with systemd is that the kernel will order the execution for us. Because if you look at it, take syslog: in the classic way you first have to wait until syslog is up, and then you start D-Bus, because D-Bus uses syslog. In the design where we use this completely parallelized scheme, where the socket is already established by the init system, then if D-Bus wants to log something at a point in time where syslog isn't up yet, it just writes that message to the syslog socket. It will simply be queued by the kernel; it just sits there, and it doesn't really matter. And eventually, when syslog has actually caught up with execution, it will go to the socket, pull out all the stuff that was buffered in there, and process it, but the whole thing is completely asynchronous. The D-Bus daemon never has to wait for it, except when the socket buffer actually runs over; then it will block until syslog has caught up. So the idea in this case is that there's no need to order execution or anything from user space anymore, because the kernel will do it, simply by making sure that a client which logs too much will eventually block until the socket buffer has been drained again from the other side.
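Just to make the setup concrete: the syslog socket in this scheme is described by a small socket unit that systemd binds at boot, very roughly along these lines. This is a simplified sketch; the unit systemd actually ships differs in its details:

    # syslog.socket, simplified sketch
    [Unit]
    Description=Syslog Socket

    [Socket]
    ListenDatagram=/dev/log

    [Install]
    WantedBy=sockets.target

The matching syslog service then receives this already-bound datagram socket when it starts, exactly as in the C sketch above.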
Of course, syslog is a very simple case here, because syslog is one-way communication: the messages go from the applications, from the daemons, to the syslog daemon, and never the other way around. If you look at D-Bus, it is of course more complex, because suddenly you have two-way communication; when somebody connects to D-Bus, he needs to authenticate himself and so on, so there's always a back and forth. But still, in this case, when you use this kind of complete parallelization with socket activation, you wait exactly as much as you really need to, because you basically send your request to the socket, and then only the one thread of the daemon, if it's multi-threaded, that really needs the reply waits for it, and it waits exactly as long as the D-Bus daemon needs to start up. Anyway, it's a really nice design. There's a question, the mic.

Don't you run the risk of having it deadlock? Is there an increased risk of that or not?

The question was whether there's a risk of running into deadlocks. Well, there is the same amount of risk of running into deadlocks as there always is, because we don't really change anything. You don't have to configure the dependencies anymore, but the dependencies are still there in the end; they're just implemented implicitly. So if your current system doesn't have deadlocks, if there's no dependency cycle between, say, syslog and D-Bus and D-Bus and syslog, then it won't have any in this scheme either. But if it has one, then yes: this design has no influence whatsoever on the dependency situation. If there are cyclic dependency loops or something like that, which usually meant a deadlock before, you will still have them with this scheme, and they will still mean some kind of deadlock. Does that sort of answer your question?

So this is socket activation. As mentioned, the Apple engineers came up with it and implemented it in launchd. launchd is actually really nice for all the ideas they implemented in it, and I think it's one of the most capable init systems that has existed. However, I'm not sure it would really be the right thing to run on Linux, and it also still doesn't use any of the other stuff we can use on Linux. How much time do I actually have? Twenty minutes. Okay.

So this was socket activation. There's also bus-based activation, bus referring to D-Bus. Nowadays, on a default Fedora install, we actually install more D-Bus daemons than daemons that listen on sockets, because we install all this stuff like udisks and PolicyKit and whatnot. And you can extend the same scheme that we are now doing for socket-based activation to bus-based activation, meaning you install the bus name, a bus service, much earlier, before you actually start the service. This has been implemented in some form in D-Bus for quite a while; system bus activation has been around, but in that case it was only D-Bus itself that would actually start those services. With systemd we extend that: you can activate a service not only by socket activation, but also by bus activation, and in a couple of other ways of activation, for example hardware activation, when devices are plugged in, and things like that.
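For a rough idea of how the bus side fits together: a D-Bus system service is announced by a small activation file, and with systemd integration that file can point at a systemd unit, so the activation request is handed over to systemd. A hypothetical sketch; the bus name, path and unit name here are made up:

    # /usr/share/dbus-1/system-services/org.example.Foo.service (hypothetical)
    [D-BUS Service]
    Name=org.example.Foo
    Exec=/usr/bin/foo-daemon
    User=root
    SystemdService=foo.service

With the SystemdService= line, a client that asks the bus for org.example.Foo ends up starting foo.service through systemd, much like connecting to a socket does in the socket activation case.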
But there is also the idea of on-demand loading. Avahi, for example, is, I hope many people know it, a service discovery service for the network. It's kind of useful, but it's completely useless if you don't even have a network. So systemd is designed to start services like this at the time they're actually needed, meaning at the moment there's a network interface around, because then you want to announce your local host name and so on on the local network, or when a local application uses it. And we can do that easily, because we use hardware-based activation to start Avahi in addition to D-Bus-based activation, meaning that when somebody asks via D-Bus to browse for these services, Avahi is started in that moment, or also via socket-based activation if somebody resolves a host name with NSS.

Does that mean that when your hardware interface goes down, suppose I shut off my wireless, it will automatically kill Avahi?

That's a very good question. In systemd we were discussing exactly these problems: should we automatically reference-count every service we bring up, and if the reason why we brought it up is not there anymore, should we shut it down again? Basically, we can do that, but we don't do it by default. The idea behind that is that we want to minimize our work. We think that if Avahi has already started up, it probably makes more sense just to leave it there, because it's probably going to be swapped out anyway and just hangs in a select loop, so it won't really eat that many resources. And it's really difficult to figure out when Avahi is truly idle, when the right time would be to shut it down, because Avahi would have to decide that itself; for example, if there's still somebody using one of Avahi's D-Bus interfaces, you probably want to delay that until they stop using it. So yes, we do support it: you can say, bind the runtime of this service to specific hardware or whatever, and it will be started and shut down at the right moments, but by default we don't do that.

The whole idea of socket activation, that you first install the socket and the service eventually follows, you can actually extend to file systems. Meaning that you can establish a mount point via an automounter so that the path is there and everybody can use it, but the backing file system is actually made available much, much later. This is implemented in systemd: you basically can have an automount point, which plays the role the socket plays in the socket activation case, and which is then replaced by the real file system when somebody accesses that mount point. We actually use this by default on systemd installations. We use it for all these kinds of exotic virtual file systems that most distributions tend to compile in but not always load, for example binfmt_misc. binfmt_misc is a kernel module, and you configure it; it's this somewhat weird thing that lets you execute, say, Java binaries as if they were native Linux binaries, and Mono binaries too. So binfmt_misc is a kernel module, normally not used unless you install Mono or something like that, and its configuration happens via a file system that is mounted at /proc/sys/fs/binfmt_misc, or something like that. In systemd we nowadays, by default, put an automount point on that directory, and at the moment when an application actually uses it, we mount the actual file system, and that pulls in the kernel module to actually back it.
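For illustration, such an automount point is expressed as a pair of tiny units, roughly along the lines of what systemd ships for binfmt_misc. This is a simplified sketch; the units actually shipped contain a bit more:

    # proc-sys-fs-binfmt_misc.automount, simplified
    [Automount]
    Where=/proc/sys/fs/binfmt_misc

    # proc-sys-fs-binfmt_misc.mount, simplified
    [Mount]
    What=binfmt_misc
    Where=/proc/sys/fs/binfmt_misc
    Type=binfmt_misc

The automount unit is what gets set up early at boot; the mount unit, and with it the kernel module, only comes into play on first access.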
The effect of this is that all those init scripts that previously did a modprobe binfmt_misc so they could then configure something don't have to do that anymore; they can just access binfmt_misc and do the configuration just like that, because it's always there. So the one entry point for this thing is the path in the file system hierarchy, and nothing else anymore. We do this not only for binfmt_misc, we also do it for a couple of other things, like securityfs and debugfs and so on. And this can also be used for any kind of directory, by the way.

Sorry, stupid question: does this also apply to real file systems as well, e.g. ext3 or NFS or something like that? You're effectively replacing the automounter, but can you use it for...

Well, there have always been automount daemons; we only implement a subset of that, we only implement direct mounts. But you can use it for everything you want; you can back it with an NFS file system or whatever. The idea of this is also that if you boot up with multiple file systems, like /home on a separate partition, you can run fsck at a point in time where you are already booting up the rest of the system. In the /home case, where you have /home on a separate file system, you could say: okay, I install my automount point on /home, continue booting, and then Samba and GDM and whatever else wants access to /home will just see the automount point. If they access it, they will sleep until it is backed; at the same time, the fsck for /home will still be running, and eventually, when that's finished, the real file system will be put in place there. So, and that's what this slide actually focuses on, it parallelizes boot-up. You can basically continue booting, pretending you already had all the file systems mounted that are needed for the boot, while you actually haven't, while the fsck is actually still in flight, or while you're still waiting for that really slow NFS server you have somewhere on the internet, and it will just work. It is, of course, kind of surprising that an init system nowadays has an automount implementation in it, but it makes a lot of sense, because you really can boot everything in parallel, and you can extend this kind of activation, this parallelization that socket activation and bus activation allow you, to file systems. It also simplifies things greatly, because suddenly all those kernel API file systems are just established, and if somebody wants to use them, they can use them; the kernel modules are not automatically loaded, they're only loaded when somebody uses them. So much about file systems. How much time do I have left? Thirty minutes? Well, okay.

In systemd we try to avoid shell, shell scripting. Usually, when administrators hear that, they say: oh my god, I love shell, shell is what I learned and what I always used, how can you say shell is evil? We say shell is evil because it's just evil. It's amazingly slow, because what it basically does is, for every single operation you do, you spawn a process. For example, to copy a file from this place to that place, you fork a process called cp. And if you look at the usual shell scripts, they use all kinds of stuff: they use awk, they use Perl, they use whatever, they build big pipelines, and every single part of that usually forks a process. Forking a process on Linux is faster than on any other operating system, but it's still awfully slow, especially if all you do is grep for some text, or run sed, or whatever.

So our intention with systemd is to get rid of the shell scripting. It's not about getting rid of the shell; the shell is always going to be part of what Linux is. It's more about de-emphasizing it during boot. Currently, every single thing that is started during boot is done via a shell script, basically: you start services with shell scripts, traditionally you set the host name with a shell script, and all these kinds of things. We looked at that and said: well, if there's so much shell involved, we start a gazillion processes during boot. And that's actually the case: on Fedora 14 or so, after you have booted up, when you open a terminal for the first time and type echo $$, which basically tells you the current PID of the shell, and because the shell you just started was probably the last process that was started, it's kind of an indication of how many processes got spawned during boot. Traditionally that was something like 2500, and on some distributions even worse; on SUSE, for example, they used to do a lot of shell scripting for the weirdest things and the number got even bigger. Nowadays, with systemd, we're down to something like 500 or even lower than that, of which 200 or so are kernel threads. While removing one process from the boot doesn't make much difference, making them all go away is quite a difference. On a modern systemd boot we actually managed to remove the entire use of shell from the entire boot process; it's all gone now.

Now the question is, of course, what happened to all the shell scripts? How can we get rid of them? I mean, there was so much shell code involved in booting, how can we do that? We looked at the individual problems, and then, for example, if you look at the shell scripts that usually make up a SysV init script, you notice that most of it is just a copy of the skeleton LSB script, and a shitload of it, it's really long, is doing basically nothing; it's always doing the same thing. It loads a configuration file first that sets a couple of environment variables, blah blah blah, then there's a switch that compares a couple of verbs, and stuff like that. You look at that and say: well, it's always the same thing, so as a programmer I should write this in one place and never have to copy it into a couple of additional shell scripts. And that's basically what we did. In systemd, services are not started via shell scripts; they are started via simple, ini-file-like service files, and in the ideal case you basically just need to write two lines: one is [Service], the other one is ExecStart= with the path of the binary that is actually spawned, and that's basically it.
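As a concrete, made-up example, a minimal service file could look like this; the unit name and binary path are invented for illustration:

    # /etc/systemd/system/foobar.service, hypothetical minimal example
    [Service]
    ExecStart=/usr/sbin/foobar-daemon

In practice you would usually also add a [Unit] section with a description and dependencies, and an [Install] section so the service can be enabled, but those two lines really are enough for systemd to start and supervise the process.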
You can configure a lot of other stuff, but you don't have to. But yeah, that's basically how we managed to get rid of all the shell scripts. A couple of things are still there; some shell scripts are probably going to stay, or the shell will stay, for quite a while. For example, the NFS user space is a horrible mess of shell scripts; it uses so many shell scripts that if you use NFS, you'll probably continue using them. But if you don't use that, you get boot times of about four seconds or something like that on a modern systemd system. Yeah, it's one of the designs we had in mind when we did systemd, and we actually came very far with it.

Of course, the boot is not just SysV init scripts; there's also a lot of other stuff. For example, the host name is traditionally configured via a shell script. We looked at all these things and tried to find better places where things like configuring the host name could be implemented, in C, where it makes sense to move them. The host name, for example, is now configured by systemd, from PID 1, as one of the first things it does when it boots up; so basically, before the first process starts, we have already set the host name. And there's a lot of other stuff, like modprobe and all these kinds of things, where we always try to find better places to do it. Very often, for example, for many System V services, the shell scripts around them remove PID files and stuff like that, and we thought: well, wouldn't it be much nicer if the daemon itself were able to remove a leftover PID file from before? And we patched all this kind of stuff, to the effect that nowadays there's no shell involved in any of this anymore; some things we moved into the kernel, some things we moved into udev.

A complaint I've heard very often, and I'm surprised nobody here has raised it yet, about removing the shell from the standard boot is: the shell is something I know as an administrator, and I use it for debugging; I just put a set -x or something in there and I see what actually happens. And that's a valid argument, but we say that the shell is not a debugger. The shell is a shell; it's something for scripting the execution of other processes. If you want debugging, then use proper debugging facilities, and we hence try to provide a couple of them in systemd. For example, we have relatively elaborate tracing: you can just enable it during boot and you get long output of what is actually happening. We even provide graphical tools where you can graph all the dependencies between the services. You can get an interactive boot where you are asked a question for basically every service that is spawned, if you actually really want to do that. So we try to look at all these claimed issues, that this is really hard to debug, and sort out: okay, is this what you want to do? Then we'll add a proper debugging facility for you, and not this shell thing, which is not a debugger even if you think it is.

And one other thing: one of the foremost things systemd is supposed to do is supervise services, supervise processes. So one thing we want to do with systemd is make it the best process babysitter thinkable. For this we use something called control groups. Control groups, or cgroups for short, are a kernel feature in Linux that has been there for quite a while, but was usually under the radar of most people. We use it to actually supervise services. So what are control groups? Control groups are basically something where you can group processes into a hierarchical tree, and these groups can be labeled. You basically mount a special file system, the cgroup file system, and then with mkdir you create a group, and by echoing a PID into a special file in that directory you can add a process to a group, and so on.
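Roughly, what that looks like at the file system level is something like the following. This is a simplified sketch in cgroup-v1 style; mount points and file names vary between distributions and cgroup versions:

    mkdir -p /sys/fs/cgroup/demo
    mount -t cgroup -o none,name=demo cgroup /sys/fs/cgroup/demo    # a named hierarchy, no controllers attached
    mkdir /sys/fs/cgroup/demo/apache                                 # mkdir creates a group
    echo $$ > /sys/fs/cgroup/demo/apache/cgroup.procs                # add this shell to the group
    cat /proc/self/cgroup                                            # show which groups a process belongs to

Everything forked from that shell stays in the same group, which is exactly the property systemd relies on for tracking services.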
The original intention of cgroups was to make it possible to apply special rules, special resource limits and similar things, to sets of processes. The background was originally something like containers: you have a couple of containers running on a system, and you want to say, this container of this customer can use so much memory, and this other container of the other customer can use this much, well, not disk space, but let's say CPU. However, control groups are completely abstract; they exist in the kernel without any kind of resource limitation necessarily applied to them, which is really useful for us, because we can use them to label processes. The nice effect of this is that if a process that is a member of a cgroup forks, and forks a couple of children, all these children will be members of the same cgroup, and they will still carry that label. This enables us to do a lot of amazing things. For example, if you have Apache, and Apache spawns a shitload of CGI scripts, they will all be part of the original Apache cgroup, and we can actually trace them back to it, which is a real problem with classic UNIX systems. On a classic UNIX system, if you start Apache, and a CGI script of some customer or whatever forks a couple of times, and then the middle children die, and the processes rename themselves and stuff like that, you have a really hard time figuring out: did this process actually originally belong to Apache, did it get started by Apache? With systemd, all these problems go away, because we start Apache in its own cgroup, all the CGI scripts are children in that cgroup, and if you type ps with some extra parameters, -eo something cgroup or whatever, then you can actually see it, because the kernel will maintain the information that this process originally belongs to Apache, for all the children created.

It also allows us to do really fancy things. For the first time, systemd actually allows you to kill services properly. Most administrators would probably say: if I want to kill a service, and let's say it's cron, then I type killall cron. Some of you will probably notice that's not the best way to do it, because if some user happens to have called a process of theirs cron, then it's going to be killed as well. So killall is not the nice way. Then they say: okay, then I read the PID file and use that. But that's still not really correct, because you suddenly have the problem, like with Apache again, that there might be a gazillion CGI scripts spawned and you can't kill them that way. The control group support in the kernel, however, gives us for the first time the power to kill every single member of that cgroup, and we can keep killing as long as there are members, until eventually they are all gone.
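For instance (the exact ps invocation isn't spelled out in the talk, so treat this as a hedged example, and crond.service is just an example unit name), you can show the control group of every process, and with current systemd versions even signal a whole service at once:

    ps xawf -eo pid,user,cgroup,args     # show the cgroup, and thus the service, each process belongs to
    systemctl kill crond.service         # current systemd: signal every process in that service's cgroup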
So I could probably talk for hours, and I have a lot of additional slides, but we should probably come to an end here, and if we still have room for a couple of questions, then please ask them now, or even later.

You said that with systemd we can start a lot of stuff, but you didn't say how we can stop things, or how we can check their status, because you said there are no more init scripts. How do we do that?

So there are very many ways you can supervise processes with systemd. For example, one simple thing is you can actually use ps, because, as mentioned, ps can show you which service a process belongs to; so if you want to know whether Apache is still running, or whether there's anything of Apache left, you can just use ps. However, we also of course provide our own tools. The most important tool with systemd is called systemctl. You can type systemctl status apache.service, which basically means: give me the information about Apache. It will tell you whether it's running, whether it's currently in the process of being started, or whether it's stopped, what the processes are that belong to Apache, and which is the main process. It will also give you a lot of information that was previously not available. For example, systemd records, if a process dies, if Apache dies, what the exit code was that it died with, which is something that is completely lost right now. If Apache segfaults, traditionally on UNIX nobody would ever notice: it would just go away, the init system would eat up the return value, and that's it. With systemd, this is actually always stored along with the service information, and if you type systemctl status you can see it; you see the exit code and everything. And eventually we probably want to link that up with ABRT, by the way, the crash reporting system we have in Fedora, and then you can actually just click on it if you see that it crashed, and it will get you even further than that.

We can do one last question; this will have to be the last one.

So, this sounds great; problems, none, everyone seems to agree. But Debian has one init system, Ubuntu has another one, and this is the third one. I don't really care who was first, it's not important, but the point is that the init scripts are going to be written in different ways for different distros. So what's the solution to this?

The solution is that, the way things look right now for systemd, everything is very, very bright, and with the exception of Ubuntu, we have right now already convinced every major distribution. Fedora is going to switch with Fedora 15; openSUSE probably one iteration later, and they already have it in the distribution; Mandriva announced a couple of weeks back that they're doing the switch for the next release; MeeGo has decided, in that discussion, that they are going to switch; it's already in Debian; it's in Gentoo, where they have difficulties adopting it as the default because they have huge development cycles; but it is in all the distributions, and all the distributions that are capable of making decisions have decided to go for systemd. The exception is Upstart, which is still going to be used on Ubuntu for a while. Well, I'm not going to discuss this in public too much, but we have hope that that's going to change eventually too. Scott Remnant, the Canonical employee who created Upstart, recently left the company and went to Google. He still claims that he wants to push it in Ubuntu, but we actually made quite some inroads into Ubuntu to convince them to go with systemd as well. At this point in time, systemd does everything that Upstart does, and a lot of additional stuff, so the only reason they're still staying with Upstart is basically that they have invested a lot of time into it and they just don't want to get rid of it right away. But, I mean, given that Canonical is not really into doing too much development, they basically just take what other people do for them, I think eventually they will notice that it's not worth it trying to continue with Upstart and always trying to keep up, because right now we are leading and they are following. So yeah, I don't expect them to switch in the next year or something, but the way I see it, I kind of assume that in a year or so, in the version that comes out then, they will do the switch.

Thank you very much, Lennart, and put your hands together for what was an excellent talk. For those of you not aware, he also plays around a lot with PulseAudio, and he'd probably love to talk to you about that too. And in the meantime, there's a nice little bowl here for you to put your little trinkets in, in appreciation.