Hi, I'm Lennart Poettering. I'm one of the systemd developers. Today I'm going to talk about one specific facet of the systemd project. systemd is this big suite of different components that you need to build an operating system from, and we'll focus on one little component of it, which is incredibly useful but very little known, and everybody should know about it. I've been frequently doing talks about systemd, and usually I try to do the big-picture talks. This time I'll just talk about one specific little tool out of it, because I think everybody should know about it: particularly administrators, people who build software, people who test software, I guess everybody who nowadays is called a DevOps person, because it's actually incredibly useful if you deal with containers, if you play around with building container images to then ship, and things like that. But also if you're a developer and want to quickly build something for a different distribution, or test something on a different distribution. Usually I talk relatively fast. If you have any problem with that, please speak up and I'll try to slow down. If you have any questions, please interrupt me right away. I much prefer if we could make this an active session rather than something where I just keep talking and everybody asks questions at the end. So yeah, interrupt me. I assume that everybody here knows the chroot command on Unix? Who knows it? Sure, it's actually the first time I've ever asked people a question like that in a talk. So let me rather ask the other way around: does anybody not know chroot? OK, everybody knows it. So I guess I'm at the right Linux conference here. So yeah, systemd-nspawn is basically the same thing as chroot, it's just so much nicer. So let's start by discussing why it is actually part of systemd. systemd, as I mentioned, started out as an init system, and most people probably consider it one and that's the end of the story. It's actually much more than that. It is for us basically this suite that we build an operating system from, which I already mentioned. It contains everything that we think is absolutely essential for building most operating systems, regardless of whether it's for embedded, for servers, or for desktop machines: just a couple of components that we think deserve to be nicely integrated, make use of what the Linux kernel provides, and work well together. While doing this, of course, we look at the various facilities of the Linux kernel that already exist and try to figure out whether they're in any way relevant to systemd or not, or whether they're some specialist feature. In some cases, especially in the container case here, we saw that all the container stuff is actually not that hard to use from what the kernel exposes, and we wanted to integrate it more with systemd. However, when we started looking into it, there wasn't really such a nice tool to play with it. Container stuff on Linux usually means LXC, or means libvirt-lxc, which by the way are completely different projects, and maybe OpenVZ and these kinds of tools. All of those are written primarily, I guess, for deployment on servers. They are relatively comprehensive. They expect a relatively high degree of understanding of what's actually going on.
They are, in a way, very different from good old chroot, because chroot, everybody can use that. Every administrator on his home machine, if the machine doesn't boot and he boots up with some USB disk and then wants to get back onto his home machine, is totally capable of using chroot; he just knows how it works. These other tools are certainly not like that. So because we actually develop an operating system, we thought, OK, we need a tool that just makes use of these kernel facilities and makes them really that easy to use, just for our own testing purposes in the beginning, because we wanted to make sure that the entirety of systemd works equally well on a bare-metal system, in a virtual machine, and in a container, and just works. Previously, if you ran systemd inside of LXC, especially as its PID 1, you had to make quite a few changes, like disabling the gettys and these kinds of things. Because we wanted to make sure that systemd ran nicely inside of containers, we were looking for a tool to actually test that. And so we started playing around. OK, LXC: kind of nice, but very complex. For example, it exposes all the cgroup attributes and all that complexity to the user, even though in many, many cases it doesn't make much sense. For example, if you want to grant or take away a container's access to a device node, you specify that by major and minor number. But the thing is, major and minor numbers are not stable in the current kernel, so you cannot actually write a configuration file that will survive a reboot, because the assignment of the major and minor numbers is not defined; on every boot it might change. For a couple of devices, like /dev/null and /dev/zero and these kinds of things, it's stable, but it's not for hard disks, for example. It really depends on the order in which things get probed by the kernel. So yeah, all these tools appeared a little bit too low-level, and knowing the back end of it, the actual kernel interfaces, we thought, OK, we can just write a tool that does exactly what LXC does at the lower level, but is as easy to use as chroot. And that tool is what we call systemd-nspawn. The ns stands for namespace, because it makes use of the Linux kernel namespace feature, and spawn means, yeah, because it can spawn something. So that's a little bit of the background story of why we added it to systemd. Now people could assume that it's kind of a competitor to LXC or libvirt-lxc, and we clearly want to tell everybody it's absolutely not. We created nspawn for our users, and we try to get people to use it for testing, for debugging, for profiling, and for building things. If you actually ship something on a server, you should not use this. It's not the tool that you want to use there, because it doesn't have any management features and stuff like that. The same way you would not, or hopefully not, ship something on a server that just wraps the /usr/bin/chroot command, you shouldn't ship something on a server that runs nspawn. So in that regard, it is just another tool in the powerful toolset that you have nowadays to run containers on Linux.
Yeah, just a little bit of background: if you want to know more about LXC, libvirt-lxc, and those kinds of things, the other conference running in parallel is probably more interesting. But just a high-level overview, in case you do not know this yet: people always get confused by the fact that LXC and libvirt-lxc have nothing to do with each other; they just use the same kernel interfaces. And since nspawn also uses the same kernel interfaces and does not share code with either LXC or libvirt-lxc, the relationship between LXC and libvirt-lxc is basically the same as between LXC and nspawn, or libvirt-lxc and nspawn: there is no relationship, except that they use the same underlying kernel APIs. I think LWN has actually published a couple of articles about the underlying kernel stuff, so if you want to know what's actually below LXC, libvirt-lxc, and nspawn, you can read those. So again, it's not about shipping something on your server. It's not about using it in production. It's about building stuff, testing stuff, debugging stuff, and profiling stuff. And there are actually a couple of distributions which nowadays use it for setting up the build environments for packages, because it was basically very little work: you set up your container, you boot it up, and it just works. The entire deal of nspawn is that there is no configuration, the same way as there is no configuration for chroot. If you want to start your chroot, you don't configure anything; you just invoke chroot with the directory, and it will chroot into that. That's the same idea as with systemd-nspawn: there is no configuration, it's the thing that just works. Well, in theory, at least. The thing, though, is that to make it just work there are a couple of bugs currently left, at least on the Fedora 19, or actually the Fedora 20 pre-release, that I'm using here, and we'll actually run into two bugs that are still a bit nasty, but I will tell you how to work around them. And I assure you that F21 at least will have all of them fixed, and the final version of F20 will have most of them fixed. So yeah, telling you that it requires no configuration and just works is not quite true yet. Again, it's also not for deploying things; you'd use LXC or libvirt-lxc or something for that. So the essence, basically everything I want to talk about in this talk, are these two commands: the upper one and the lower one. The upper one doesn't even have anything to do with systemd-nspawn. It's just the command line that you need on Fedora machines to create a minimal Fedora installation inside a subtree. It uses yum. It uses -y to say it should always answer yes. You pick a release version. You say we don't want signature checking, because I was too stupid to figure out how the signing works when you build a container this way. You say where you want it installed; this is the root directory for the container. You basically disable every repository except for Fedora itself. And then you install the base packages, and that's it. After you've done that, you have a directory there, and it looks exactly like a normal Fedora installation. It doesn't have a kernel, though, right? These are just the dependencies of the packages you installed, not a kernel. And that's good that way, because for a container you don't need a kernel.
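For reference, the yum invocation I'm describing looks roughly like this (close to the example in the systemd-nspawn man page of that era; the release version and target directory are placeholders you'd adapt, and it has to run as root):

    # create a minimal Fedora tree to use as a container root
    yum -y --releasever=19 --nogpgcheck --installroot=/srv/fedora-tree \
        --disablerepo='*' --enablerepo=fedora \
        install systemd passwd yum fedora-release vim-minimal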
And then, yeah, of course, this is the Fedora-specific command, which uses yum. The equivalent exists for Debian, where it's debootstrap, and there is something for Arch Linux, and there's something for the other distributions. The man page of systemd-nspawn will actually help you with that and will show you the one line that you have to type. So if you're looking for the counterpart of this for your specific distribution, just check the man page; it's all in there, to make it really super duper easy for you. So after you type that, you have the image, and with this command you can actually start it. It looks a lot like chroot: you specify the path to it. The -D option specifies the path; if you don't specify -D, it will start the operating system in the current working directory. And the lowercase -b means that the machine should be booted up rather than just giving you a shell. If you use the traditional chroot command, it will just give you a shell; it will not boot up the machine. So the basic difference at this point is mostly that this can be used to boot up the full container, as if it were a real machine. So now that we have seen these commands, let's actually try this. Because I didn't want to rely on the network here, I already did the first part, so you have to trust me on that. One of the bugs that we still have is actually in Fedora, where this command will break something with /var and /run. But that's going to be fixed really soon. Anyway, the goal is really that this will just work. So let's do this. I have prepared, as mentioned, this yum command to get the directory set up. I hope everybody can see the font, I think so at least. I prepared that in my home directory, in the fedora-tree directory; if you go in there, you see that, yeah, it looks like a real Linux system in there. It's just sitting in my home directory, created by yum. I also have the same thing here for Debian, because I need that for testing: in the debian-tree directory I have the equivalent, created with debootstrap. Since debootstrap is actually packaged for Fedora, I could do that trivially on Fedora. As I mentioned, there's the man page: if you go to the man page and scroll to the end, you see the examples. The upper one here is the yum command that I just showed you, unfortunately truncated here, but we can scroll to the right and then you can see it. Here you see the equivalent commands for Debian, then you have the equivalent command for Arch Linux, and the other one is something different. But anyway, just to show that. So I've already prepared that. So then let's do this: let's actually invoke nspawn on this directory. Since I installed into a different directory, I have to slightly modify the command and specify the real directory. This has to be done as root, the same way as chroot has to be done as root, because this functionality of the kernel is not available if you're not root. And there we go, I'm already logged in. If you look at this, I actually started the Debian machine, sorry, because I specified the debian-tree directory. So sorry for the confusion. But anyway, you see that suddenly you get the normal output that you would get if you boot Debian nowadays.
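What I just ran boils down to roughly this (the directory names are specific to my setup; both lines match the examples in the systemd-nspawn man page of that era, and both run as root):

    # create a minimal Debian tree, then boot it as a container
    debootstrap --arch=amd64 unstable ~/debian-tree/
    systemd-nspawn -bD ~/debian-tree/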
You get, like, "Welcome to Debian GNU/Linux", as it says. It's a relatively old version of systemd running inside of the container, in this case systemd 44. It's ages old. But there you go, you have your virtual machine. We can power it off again. And the thing that I actually wanted to do is start the Fedora tree, which you do like this. And if we scroll up to see what actually happened, it's a little bit cleaner here: you get the "Welcome to Fedora 19" message that you'd see on a normal bare-metal machine as well. And yeah, because we didn't specify anything, it tries to boot into graphical mode, but that doesn't really matter, because there is no GDM installed. And it wouldn't even work with GDM installed in there, because there's no hardware available in the container. But anyway, then you see the normal boot process of Fedora, and then you can log in, and you get a shell. Now you are inside the container here, so if you type ps, you see the processes here. And you don't see the host processes; you only see systemd here, journald, D-Bus, systemd-logind, and your actual login shell. Otherwise there's nothing installed, so there's nothing else running. Since this is a full container, you can install software inside of it and do whatever you like. Yeah, it's a full-blown machine. I can power it off. OK, this is actually one of the bugs; it's currently in the kernel. Sorry. [Audience: Can you say more about this bug?] About this bug? OK, it's a little bit off-topic, because this bug shouldn't exist, but it's there. It's a bug in the kernel: in some cases we don't get the notification from the kernel that a container has really terminated. And because we don't get that notification, the container basically stays around all the time. In systemd, every container that is running, regardless of whether it's an nspawn container or a libvirt-lxc container, carries a name. That name is attached to all the processes of it, so that you can see it with ps and things like that. And that name is never freed again. If you do not specify what I'm adding here now, -M and some random word, then the name for the container is generated from the directory that you specified, which in this case would be fedora-tree. But since the fedora-tree container didn't get cleaned up, because the kernel didn't send us the notification, we just pick a completely random bullshit name here, which in this case is fffff, and then we can boot it up again, right? So yeah, it's unfortunate that there's this bug, and I need to isolate it. I've talked to Tejun, who maintains the cgroup code where this problem lies, and he's kind of waiting for an isolated test case so that we can actually get it fixed.
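For the record, the workaround I just typed amounts to something like this (the machine name is an arbitrary throwaway; run as root):

    # pick a fresh machine name to sidestep the stale registration
    systemd-nspawn -D ~/fedora-tree/ -b -M fffff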
Let's look at the process tree. Let's first do that inside the container. OK, we don't have pstree in the container, and I don't have network, so I can't install it, sorry. But let's do it outside of the container. I hope I have it installed here, yeah? So there we go. It's a bit difficult to see here; let's pipe that through less. So there you see it: you have GNOME Terminal, Bash, and, where is it, there it says systemd-nspawn, and you see that these are child processes of that. I can fully see inside. Yes, it's basically like any other process tree. I mean, that's not a particular property of nspawn; that's a property of all the containers that we have on Linux. Basically, everything that we run in a container appears to the outside as if it were a normal child process. Of course, what you will notice, I mean, you don't see it here, but the PIDs will be different, because this process here is PID 1 of the container, so everything appears differently inside. If we type ps on the host and pipe that into less, you will first see the systemd of the host. But then if you go down, there, you see the other systemd, and this one has a PID that is not 1. But if you do it inside the container, which we can totally do as well, then you see it's actually the same process: the one that is PID 1 inside is something different from the outside. But, since we looked at this pstree output: pstree doesn't really show you that this thing is a container. We offer an alternative to pstree, which will not show you the hierarchy of the processes, but will show you the hierarchy of cgroups. I'm not sure if you guys are aware of the cgroups feature. If you are aware what that is, then I'm sorry, because I will give you a little bit of an introduction to it. cgroups is a much-discussed feature of recent Linux kernels. It's basically something where you can take a couple of processes and make a group out of them, a cgroup, a control group, which is basically just this group of processes. You can attach a label to them, like a name. You can organize them in a hierarchical fashion, so you can have a tree of groups. And then you can optionally apply resource limits to them. It is one of the basic building blocks that Linux containers are built from: you take the processes of the container, you put a label on them, you possibly add resource management. That's basically what cgroups are. cgroups, as mentioned, are organized in a hierarchy, and we can easily show that. systemd comes with a tool called systemd-cgls, meaning "cgroup list". If we type that, then we see the cgroup hierarchy of the system. It's an incredibly useful format; it looks a little bit like pstree, except that it totally does not. You basically see all the cgroups here. In this case, it's a slice. I'm not going to go into detail about what a slice is, because it's very systemd-specific. Let's just say every user gets a slice, and every login session of the user gets a scope inside of it. So: user 1000, that's me, and then I've got a session, and these are all the processes that are running inside of it. And here you see the machine scope: that's actually the thing that we created. There you have basically the subtree, from there to there; that is the entire thing, just for the container, if you follow what I mean. Because I'm logged in once, I have one user slice, and I have one session inside of that. Then under system you find everything that is a system service, daemons and stuff like that. For every daemon, you have one cgroup, and inside each of those cgroups you have the processes of that daemon. In this case that's pretty boring, because it's just one process each. But anyway, the takeaway of this is: you can type systemd-cgls.
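To give a rough idea of the shape of the output (a trimmed, approximate sketch, not a verbatim capture; the exact slice and scope names depend on your systemd version):

    systemd-cgls

    ├─user.slice
    │ └─user-1000.slice
    │   └─session-2.scope
    │     ├─1071 gnome-terminal
    │     └─1100 bash
    ├─machine.slice
    │ └─machine-fedora\x2dtree.scope
    │   ├─2915 /usr/lib/systemd/systemd
    │   └─...
    └─system.slice
      └─dbus.service
        └─ 801 /usr/bin/dbus-daemon --system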
And it will actually show you the hierarchy of the system, how it consists of login sessions, and machines, and services, and then you can hierarchically go down into the individual containers. systemd-cgls, again, is in no way tied to systemd-nspawn. It works for any kind of cgroup setup. But it's particularly interesting if you have containers, because then you can go further down into them. It's an incredibly useful tool. There's a similar tool called systemd-cgtop. cgls just gives you a plain list of all the cgroups with what's contained in them; systemd-cgtop will actually show you the resource usage of them. In this case, unfortunately, because the window is too small, you can't really see the stuff that's inside of it. Anyway, I think this goes a little bit too much into a different topic, which is resource management, because it will actually show you the resource usage of these services. Because we don't have any resource usage in the container right now, because nothing is actively running, you don't see it in the CPU column there. So it's kind of a stupid demo that I just did, because you don't actually see the container there. But anyway, let's ignore that for now and just focus on cgls and how awesome it is, showing you everything recursively. Yeah, let's go back. OK, now I have to terminate the other container, this one. OK, I first have to power off the container. By the way, you can even reboot and all these kinds of things; it will just work, it will appear just like everything else. Oh, something that nspawn, by the way, is very good at is that it automatically inherits the network configuration of the host, which is fundamentally different from how libvirt-lxc and these kinds of things are usually set up. Basically, we copy the /etc/resolv.conf of the host automatically into the container so that everything just works. We do a couple of other things to make things just work: for example, the timezone is automatically synchronized from the host into the container, so that the timestamps look nice in the container. You don't have to set anything up. The idea, again, is that with that yum command you set up the container, and with systemd-nspawn you just boot it and it just works. Let's go back to the, OK, now I have to do this. There we go. Yeah, again, this is like the most important slide of them all, because it actually tells you how to do this. If you're ever wondering how, just remember systemd-nspawn, go to the man page, and you'll find this line and this line. By the way, again, if you guys have any questions at any point in time, totally interrupt me. OK, the next command I would like to talk about is machinectl. machinectl is, again, something that is incredibly useful in the context of systemd-nspawn, but is not bound to systemd-nspawn; it's supported by libvirt-lxc the same way. Basically, in systemd, we wanted to make sure that containers are nicely integrated with the rest of the OS. The basic idea there is that on Solaris you have this concept of Solaris zones, which are relatively nicely integrated into the rest of the Unix commands.
Like, for example, you can list the services on your local machine, and then you can recursively go down and also see, with the same command, the services of all the zones that are running on the same host. We are interested in providing the same kind of integration on Linux. So far, the container world and the rest of the operating system were pretty isolated; we wanted to make sure they're nicely integrated. Nicely integrated means, for example, that if you type ps, where you see all the processes, the name of the process, PIDs and things like that, and you can show all kinds of attributes, we wanted to make sure that you can also show the container name that a process belongs to. Because, after all, as we have seen with pstree, you have all these processes there on the host, this gigantic tree of things coming from all the containers, and what you really would like to know is what each of them actually belongs to. For that, we created a little mini-daemon, a mini-service that is only activated if people need it. All it does is make sure that if some container manager, like systemd-nspawn or libvirt-lxc, runs a container, it tells the system; the system remembers that; and it provides this functionality for ps and things like that to actually show you those columns. All of that works already, at least if you run Fedora 20, so if you type ps, you can actually get a column showing which container each process belongs to. Again, this works for libvirt-lxc containers as well; however, I'll show you how it works with nspawn, because that's so much easier to do. If we type machinectl right now, because of the bug that I mentioned earlier, the machines will not have been cleaned up, so you actually still see them listed there. The reason why we call this machinectl, by the way, probably deserves a little bit of an explanation. We wanted to make something that is capable of recognizing processes which belong to containers the same way as recognizing processes which belong to virtual machines, like qemu or KVM or something like that. So we tried to come up with a name generic enough to cover any kind of virtualization technology, regardless of whether it's more like virtual machines, running a kernel inside of the thing, or more like containers, running things on the shared kernel. And the name we came up with after looking for a long time is "machine"; it's basically the generic term for that. Yeah, in this case, you see the list. As mentioned, they're actually dead, but because of the kernel bug we didn't learn that they're dead. So the original one was fedora-tree, and then it was renamed and started another time, with fffff still "running". This column here says whether it's a container or a VM; since we did not start a VM, everything says container. And then the service column is just a little bit of information about what actually started it. So machinectl can give you a list of the machines currently registered.
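The listing I'm scrolling through looks roughly like this (an approximate sketch; the exact column headers vary between systemd versions):

    machinectl

    MACHINE        CLASS      SERVICE
    fedora-tree    container  nspawn
    fffff          container  nspawn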
And if you type ps, actually, the problem is that I would like to show you how that looks in ps, but I don't know the precise command line. So yeah, let's see, it's a machine column. Let's try that; I hope this will work. So then let's start the container again, and let's give it another name. So this is this thing running now, and now let's try ps. See, OK, it doesn't look that nice, because I manually put together a couple of columns here, but this is standard ps: we say we want the list of the processes, with the PID in the first column, the arguments in the second column, and the machine name in the third column. And then we see, OK, these are the ones that actually belong to the machine. If we run a couple of containers at the same time, this of course gets more interesting, as we will have more processes with something listed in the third column. If there's nothing listed in the third column, this means either that the process is not assigned to a container, it's actually of the host, or that the software in question doesn't tell systemd about machines. libvirt-lxc tells systemd about machines; the non-libvirt LXC, the project that's just called LXC, doesn't do that, to my knowledge. I guess, in a way, yeah, we work very closely together with the libvirt-lxc guys; the LXC people are a little bit more distant, so the updates don't happen that quickly. Anyway, machinectl, let's continue with that. If I type machinectl right now, we see the three machines. Actually, only one of them is currently running; again, the other ones are there because the kernel didn't tell us that they stopped. You can actually run a couple of commands on them. For example, we can do machinectl status on this third machine; because we're running it, it's the most interesting one. If you run that, you get a lot more information. For example, you see the PID of the leader, the PID 1 of the machine: in this case it's systemd, because the container internally also runs systemd. Then you see that it was nspawn that registered it, and you know that it's a container and not a VM. You have some information about what the root directory actually was. And you have the cgroup contents, basically, with everything that's running inside of it. All of this is also available in a lower-level form via machinectl show.
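The two invocations from this part of the demo amount to roughly this (the machine output column for ps needs a procps built with systemd support, so treat that as an assumption about your version; the machine name is the throwaway one from earlier):

    # show PID, command line, and owning machine for every process
    ps -eo pid,args,machine

    # detailed information about one registered machine
    machinectl status fffff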
So yeah, the stuff that is exposed to you there: machinectl is, I guess, in a way, nothing but a test case for us, because we provide APIs for how people can enumerate the machines that are currently running, the containers and virtual machines, and this tool is basically just the front end for that. There are also a couple of D-Bus interfaces; it's basically a generic way for you to list, regardless of which machine manager is used, which containers are there, and to perform a couple of operations on them. So, that's machinectl for you. Well, I'm not logged in to this one, but if I log in to this one, and, shoot, now I go to the shell again, and I do the, where did this command go? Anyway, let's do that again. So, I actually see the scope there. It's just about logging in, actually. Oh, it is. This is a slice and this is a scope. OK, so what might be misleading is the fact that here it doesn't have a slice; that's because it's a slightly old systemd, Fedora 19 and not Fedora 20. Yeah, machinectl is a very powerful, no, actually, if you use it directly, fairly useless tool, but it exposes a subsystem to other programs that is incredibly useful. The same information is actually also available in GNOME System Monitor, this ps-like tool that GNOME has. So if you ever feel like browsing your containers with GNOME tools, you can totally do that, and you'll see the same information. Any questions at this point? [Audience question] Yeah, it's basically a tiny D-Bus service that just keeps track of what nspawn and libvirt-lxc, and potentially in the future LXC, tell us, so that we can then hand it out to ps and all the other tools. Our goal, with something that we're probably going to add very soon actually, is that systemctl, you know, systemctl is the primary interface for starting and stopping services, learns about containers the same way. So basically you can pass systemctl -M with the machine name, and then you can issue commands on a specific container of your choice instead of on the host. And also the same thing that I mentioned earlier with Solaris zones, where you can get a recursive list of what's running on the host and what's running in the individual containers: that's the same thing we will then provide, basically by simply enumerating all the machines that are running, going into each one of them, and inspecting the state. [Audience question] Yes, well, I mean, it talks to the systemd inside. So basically the machinectl that you run on the host is a front end not only for the systemd on the host but also for the systemd in the many containers that you might be running. So with that we have something that starts to feel a little bit like what Solaris zones can do. Yeah. Again, there are these three bugs. One you already saw, which is the kernel bug where we don't get the notification that the container terminated; that is definitely going to be fixed by the time Fedora 20 is out. The second one is that the audit subsystem is still incompatible with this entire scheme, so you actually have to turn off the audit subsystem in the kernel. If you don't do that, then you cannot log in to your containers, because audit is doing weird things. There's a patch for that, but it's not merged yet. To turn auditing off, you don't have to recompile anything or so; all you have to do is specify audit=0 on the kernel command line. Again, this is something that will be fixed soonishly, and the patch already exists. And what was the third bug? Hmm? Oh yeah, the /var/run thing in Fedora. I mentioned that already: the yum command will currently not create a working image for you, because it forgets to create a symlink from /var/run to /run, which is a bug that Fedora needs to fix. But with that, yeah, if you have all these three things fixed or worked around, then everything works without configuration, right? So yeah, I'm basically making the promise here that everything will be awesome in the future, but it's not yet that awesome. Another thing that is kind of nice to know: if you run systemd-nspawn with a default command line, it will actually create a symlink to the host's journal directory. I mean, does anybody not know the journal? The journal? Everybody knows the journal? There's somebody who doesn't know the journal. So the journal is basically an indexed logger that systemd includes.
It's basically, I mean, yeah, I could do talks about the journal on its own, but it's a very useful indexed database of everything that happened on the system. It's a little bit like classic Unix syslog, except that it's indexed. What we can do with the journal that you can't really do that nicely with syslog is that we can interleave multiple journals from different hosts with one command. So what we can actually do here is, OK, I hope this actually worked. OK, this was a stupid idea; it didn't work in this case. Sorry for that; it's probably a permission problem, but I'm too lazy to debug that now. Anyway, the idea basically is that, the same way as we try to integrate systemctl on the host so that it can issue operations not only on the host itself but also on all the containers, we do the same for journalctl. There's this journalctl switch -m, which means merge, and it basically merges all the journal files of the containers running on the system into one single stream. So with journalctl -m you can look at the stream of logs the same way as you would look at the stream of logs from your local server, but it will actually include everything that happens on the system, regardless of which container it's in. systemd-nspawn supports that; it will actually set that up by default, except that it didn't work in this case for permission problems, but I'm too lazy to debug that here in front of you now. Something else that might be nice to know: systemd-nspawn, as mentioned, by default makes sure that the network configuration of the host is inherited into the container, simply to make it as easy as possible to get started. In this case that means all the network interfaces available on the host are also available in the container, and /etc/resolv.conf is copied from the host into the container, so things just work. You can also use it in a different mode, where you specify systemd-nspawn --private-network on the command line. If you specify this, then instead of getting access to the real network, you get a private loopback device and little else, so you cannot actually get out of the container. That is ideal for build systems, because build systems usually download everything from the internet, the sources and so on, to set up the images to build things from, but then, while they build, they probably shouldn't phone home to the vendor. So you can turn that off nicely. While we're at it, a couple of other features of systemd-nspawn: the -D and the -b are the ones that you already saw; with -D you specify the directory you want to use. If you do not specify -b, then you get a shell. I can quickly show that, I guess. So let's pick an even different name. With -b we actually booted the entire thing; if I drop the -b and pick an even different name, then I get a shell, right? And in that shell there's nothing running except the shell itself, as ps shows. It's basically as if you had booted a real system with init=/bin/sh on the kernel command line.
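As a quick sketch of the two modes I just described (paths are mine; run as root):

    # boot the full OS in the tree, but with only a loopback device inside
    systemd-nspawn -bD ~/fedora-tree/ --private-network

    # without -b: just an interactive shell in the tree, chroot-style
    systemd-nspawn -D ~/fedora-tree/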
But let's shut down this thing again and have a look at this again. So with -u you can change to a different user ID inside of the container, which is sometimes useful for executing stuff. By the way, you can also add a command at the end of the line: if we do this, I could invoke ls at the end, with an even different name, and then you get the output directly back. Let's see what other stuff there is. The thing that I keep using to pick a different name is the -M switch, where I just pick any name for the thing. Again, the name is what machinectl shows you when listing the machines, and what ps shows you in the machine column. Slice is something that is relevant for, well, a slice is basically a concept that systemd recently introduced in the context of the cgroup rework. I'm not going to go into much detail on that, but it basically allows you to slice off a part of the available resources for a particular customer; you can then assign services and machines to that slice, you can apply resource limits to the slice, and everything in it is counted together. So with -S you can specify the name of that slice. --private-network is the thing I was just talking about. --read-only is an interesting one: it makes the root directory of the container read-only, which is kind of powerful, because it allows you to limit what the container can do, and you could relatively easily run a couple of containers from the same directory. You can actually do that. Then there's the --bind switch: with --bind you can bind-mount a directory from the host into the container, and there's --bind-ro, which does the same thing but as a read-only bind mount. Either you specify a single path, and then that path is taken from the host, while staying available on the host, and made available inside the container in the same place, or you use a colon and pass a second path, and then it will be placed at a different location in the container. So for example, if you want to make your home directories available to the container, which they won't be by default, then you can just pass --bind=/home, and then you have the home directory. Of course, this is not as good as it sounds, because the user IDs might not match, right? The password database in the container is likely not the same one as on the host; not necessarily, you can of course have synced it manually, but if you have not, then they won't match, and then the home directory is not as useful as it sounds. But it will probably still be useful to do this with --bind-ro.
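To make the --bind/--bind-ro syntax concrete (the paths are just examples; run as root):

    # make /home from the host available read-only at the same place inside
    systemd-nspawn -D ~/fedora-tree/ --bind-ro=/home

    # mount a host directory at a different location in the container
    systemd-nspawn -D ~/fedora-tree/ --bind=/srv/data:/var/lib/data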
The --link-journal mode is what I mentioned earlier, and, OK, now I know what I forgot: I never actually specified -j. So let's give it a third try, or a fourth try actually by now, at booting the machine. Oh, by the way, one nice thing here is that the machine name that I specify is actually passed on as the hostname of the container later on. So whatever I specify with -M will actually appear as the hostname, yeah? Because Linux containers support hostname virtualization, we just pre-initialize it: in nspawn we know the name, and right before we hand off to the init process of the container, we set the hostname. And then the systemd inside of the container is actually smart enough to detect whether the hostname that is already set is a valid one, like in this case, or is just the default name that the kernel applies, which is "(none)", right? So yeah, we try to be kind of smart and make sure that the name of a container is something that transcends all the layers of our stack: it's originally supplied by you on the command line, it's on one hand passed down into the container to be used as the hostname, and on the other hand it's passed to machined, so that machinectl can list it. So how much time do I actually have? I think I've got 50 minutes, so that's five more minutes, but I'm mostly done anyway. I actually wanted to show you the merging thing, and there, it actually worked in this case. So now you see a merged output of the journal logs here. This one, delta, is my laptop, and this is the container that I just started, and you see that intermixed here. Then this is again my host, and in this case it's a kernel message, because we mounted a couple of file systems in the container. Yeah, it's kind of cool, actually: this command, journalctl -m, again, is the thing that merges all the registered machine journals into one, and -e just jumps to the end of it, so it shows you everything in one stream.
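The command on screen is simply this (-m, that is --merge, interleaves all available journals, including those of registered machines; -e jumps the pager to the end):

    journalctl -m -e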
Yeah, I think I already mentioned that we want to have nice integration with systemctl as well, so that you can apply all your operations on a couple of containers. Again, this is not specific to nspawn; nspawn is the thing that we use for development, but usually Daniel Berrangé, who hacks on libvirt-lxc, is really good at very quickly adding similar integration to libvirt-lxc. So basically our goal there is: use nspawn on your local machine to debug, test, build and things like that, and then ship on libvirt-lxc, but have the same kind of integration available in ps, in systemctl, in machinectl, and all these kinds of things. And that's all I have for now. So if you guys have any questions, shoot. [Audience: How does this compare to Docker?] I'm sorry. It's a very different tool, right? Docker is something where you can actually create operating system images and hand them out; nspawn is a tool that's just chroot on steroids, right? So it's at a different level of what you're trying to do. nspawn, in a way, is very close to systemd, so it makes use of a lot of systemd functionality. You cannot really take nspawn and run it on something that's not systemd. The goal of the Docker people is to be compatible with everything that ever existed under the sun. Yeah, it's a different thing. With Docker, the focus is pretty much on having something like a version-controlled file system that you can hand out to people. But yeah, I mean, again, I do compare nspawn to libvirt-lxc, because it's very similar; it just has different properties: one is trivially easy to use, and the other one is more deployable. But Docker is a different story, right? Docker eventually will probably, at least on Fedora machines or so, but I don't really follow that closely, make use of libvirt-lxc eventually. But yeah, I hope that kind of answered the question. [Audience question] No. Yes, currently at least. I mean, we'll probably add a couple more things, because, again, it's not necessarily about more than testing, building, profiling, and debugging. So there's not a huge to-do list, but we probably will add a little bit. We'll probably never add masquerading and all kinds of other stuff, but we'll probably add support for veth devices, simply because we want to play around with a couple of things, and because there are currently problems with, so, our model of how OS containers should look from the inside is basically that they never get access to physical hardware, and that most particularly results in the fact that udev, right, the device manager, does not run in containers. systemd is set up in such a way that it automatically skips execution of udevd in containers. But if software then comes along and wants to know which network interfaces are there, and asks that via udev, udev will say there are no devices, go away. And of course that breaks some network management software inside of containers. However, we of course need to make that work, right, like with the veth stuff. So I would expect that soonishly we will have some support for veth, simply so that we can make sure that discovery of network devices works properly in containers. It's easier for us to test. What people should understand is that testing things with nspawn is super easy, because you can actually use GDB, you can strace from the host into the container, and there's nothing better than that if you're actually developing an init system, which is normally really hard to debug. The startup routine of the init system is incredibly hard to debug, because you have nothing else running at that time that could attach to it and supervise it. But you can if you use these containers. So for us, again, the focus of it is really debugging, testing, profiling, and building, and for that we don't need so much of the rest, except when we actually try to make things like that work. [Audience: What environment variables does the init process inside the container receive?] So, what are the environment variables that it receives? It's pretty much empty, I think, but we can actually check that. I mean, I don't want to lie, so let's check it properly. Let's pick an even different name. I think machinectl, because the kernel doesn't notify us about container exits, must be listing about 16 dead machines by now. So this is inside of the container, just a shell. And now if we run export, then, yeah, well, now we don't really know what was set by us and what wasn't. So this is what init gets. It's a bit hard to read. There's a PATH that we set. And this is actually something interesting: it's an unofficial standard that a couple of container people agreed on, for how the container manager can inform the container about the fact that it's running inside of a container manager, and which container manager it is. So it's basically an environment variable that says container=systemd-nspawn, where systemd-nspawn is the identifier. And we set TERM, because otherwise things would not be usable; I think this one comes from the shell, though. And yeah. Does that answer it? So, basically nothing, with the exception of container and PATH.
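For illustration, checking that convention from a shell inside the container looks like this (the variable is set by nspawn for the container's init and inherited from there):

    # from a shell inside the container
    echo $container        # prints: systemd-nspawn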
I guess my time's over. So I would take one last question, but if there is none, then thank you very much. If you have any further question, OK, that's the one last question. [Audience: Does nspawn work without systemd?] It doesn't, no. This all integrates closely with systemd, and it makes use of a lot of modern systemd features. It really integrates nicely with systemd: for example, you can take systemd-nspawn and run it as a service, as a service unit, and then you can use the resource management on the service unit to apply resource limits to the container. All this integration that there is doesn't really make it possible for it to work outside of systemd. I mean, I'm pretty sure people could port it, but I will not support that, and it's not a trivial amount of work. But yeah, OK, then the really, really last question. [Audience: Now that nspawn exists, wouldn't it make more sense to have it integrated into systemd itself? Like, wouldn't I maybe want to run a service, not in a container, but inside a different network?] Oh, you can do that. This is not specific to nspawn now, but for system services you can actually specify, there's a Boolean setting that you have on each individual service, whether it runs in a private network or not. Right, and if it runs in a private network, that's pretty much the exact equivalent of systemd-nspawn's --private-network, which basically means you only get a loopback device and nothing else. It's actually an incredibly useful feature, because, I mean, now I'm completely off-topic here, but if you use socket activation, which is a systemd feature where you can basically spawn a service depending on network traffic coming in, you can combine that with the private networking stuff, which basically means that the only way a container or a service that you start gets access to the network is via that socket-activated socket that you passed in, and nowhere else. Which is an awesome feature, super security-fantastic, but yeah, I don't know, it's a different topic, I guess. So anyway, this was really the last question. If you have any further questions, find me outside, or talk to us on IRC, or write me email, and thank you very much.
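As a minimal sketch of that per-service setting (the service name and binary here are hypothetical; PrivateNetwork= is the Boolean in question):

    # /etc/systemd/system/mydaemon.service
    [Unit]
    Description=Example daemon confined to a private network

    [Service]
    ExecStart=/usr/bin/mydaemon
    # give the service only a private loopback device, no real network
    PrivateNetwork=yes

    [Install]
    WantedBy=multi-user.target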