I'll turn it down a little bit. So, there's this piano music that no one's really ever heard of. My grandmother used to play it for me (that's not her on the slide, she has much older hands), and no one had ever heard of it. When she died, I kept the sheet music, and I tried to see if I could find another copy somewhere, and I couldn't. So I scanned in the sheet music and put a copy on Wikipedia, and six months later people started posting copies of the sheet music, different versions, on eBay. I think what actually happens is people search on Google (excuse me, or DuckDuckGo, or whatever), see if they can find the music, and if so, they put it up for sale. So after six months all these people started putting this music on the internet, and I started buying up all the copies. I'm now the world's largest collector of this weird piece of piano music. In case you want a copy, just ping me.

Yeah, who's doing the sound in the back? I also bring big sounds like that just to wake you up. Is everyone awake? All right, I guess I'll start. People are like: what? Didn't expect this.

Just for everyone watching the video feed: most of this happens on the screen in a terminal, so you probably don't want to see me, you probably want to see the screen, and I might sit down a bit to type. Actually, this desk is pretty cool; I think there's an up and down button for it somewhere. So cool. It's like a spaceship. Whenever you're ready, I'm happy to start.

Cool. So, apologies everyone, I have a bit of a cough, but I'm very excited to be here. My name is James; I go by purpleidea on the internet. I work for Red Hat, and we'll talk about that shortly, because this is DebConf. I really want to thank the organizers for having me. It's my first DebConf, and I really want to reach out to the Debian community, so thanks for having me here.

Who am I? I'm a hacker. I'm a config management architect at Red Hat. I have no idea what that means, but I hack on shit. I'm an engineering guy, not a marketing guy, so I'm not trying to sell you anything; there is no product today. And I write a technical blog. Who's read my technical blog or seen it before? Just raise your hand. If you haven't, raise your hand anyway, so I look really popular. I'm actually a physiologist by training, and I don't know what it is with all these non-computer people hacking on computers, but it's true. So if you want to talk about cardiology or venous return or advanced stuff like that, please let me know. And I'm a big fan of DevOps.

If you've seen my blog or some of my past work, you might be familiar with some of my Puppet hacks. (Yeah, you can keep the audio up. I have random sounds; this is Beaker screaming because everything is on fire.) So I started hacking on Puppet a while ago, and I think I got pretty good at it, and I wanted to build some pretty advanced things, so I started writing some really outrageous code. I started showing this off around 2013. Did you know you can actually do recursion in Puppet? Raise your hand if you knew this. You don't want to do this; this is not good. The code's at the bottom if you really want to find out how. This was just a "can I do recursion in Puppet?" experiment. Turns out you can.
Sometimes you actually want, and I know this sounds wrong, but in fact you mathematically need, to run Puppet again, and preferably sooner. So you can do this crazy thing, which I invented (the code's again at the bottom), where you decide whether you want Puppet to run again, and you exec a Python program which double-forks, watches the parent Puppet process until it ends, waits a number of minutes, and runs Puppet again (there's a rough sketch of this idea just below). Is this nice? No, this is pretty nasty. You can build timers to do similar sorts of things: you might want to, say, start a DRBD cluster syncing, wait an hour, and then change the resync rate for production. So you can do weird stuff like this, kind of nasty. And you can even build finite state machines in Puppet. I swear I actually did this, please check the code, but I really don't recommend it.

So the real question is: is this the right way to build advanced, complicated things? Come on, wake up, or I'll throw fire at you. OK, so this guy has the answer. Can you see the screen OK? He has the nope, and he's just like: nope. This is my nope guy. And yeah, all of this is noped.

All right, so eventually I had to think about this. I thought about all the designs and things, and I sat down and said: I just need to write something new. Unfortunately, I'm really bad at naming, but I'm calling it mgmt. It turns out there's some weird band called MGMT, which I'm not really into (I'm more of a hip hop guy), but if you search for "mgmt config", it's much more findable.

Just a really quick thing I want to say, because I heard the occasional comment; most people are being excellent here, and actually everything's great. But I want to address this Red Hat versus Debian thing really briefly. There's really no Red Hat versus Debian versus Canonical thing. That's mostly internet garbage from trolls and people trying to divide free software. The thing I want you all to remember is that it's really Red Hat and Debian and everyone else together, writing free software against proprietary software. At least that's the way I see it; we're like 9,000 employees, so I'm sure there are different views. I'm personally a huge fan of the Debian project. I tried to find out when I first started using Debian, but it was a really long time ago and I don't remember. I don't always run Debian (my laptop's running Fedora), but nonetheless, it's very important. And I've specifically designed my project to be feature complete, with feature parity, on both Fedora and Debian from day one. So there's no "here's the initial Fedora stuff, and if you want it to work on Debian, send the patches". Everything that goes in works on both, minus maybe the odd bug or something; if you find one, let me know.

And free software is an important thing for us at this conference, and I really believe config management is critical for it, because config management is what makes the software usable, especially on servers (but also desktops), and secure. If you don't have the energy to manage your servers properly, you end up with bugs and people pwning your stuff just because of laziness. If you have good automation tools, I think that helps mitigate this. Make sense? Yeah? You're all so sleepy. Who's shy? Just raise your hand if you're really shy; I won't pick on you. All right, good. Excellent.
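(About that re-run hack: here's a rough sketch of the idea, purely illustrative. The original was a double-forking Python helper; this Go version skips the double-fork and just shows the watch-the-parent, wait, re-run loop. The five-minute delay and the file name are made up.)

```go
// rerun.go: sketch of the "run Puppet again, sooner" hack described above.
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"syscall"
	"time"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: rerun <puppet-pid>")
		os.Exit(1)
	}
	ppid, err := strconv.Atoi(os.Args[1]) // PID of the running puppet agent
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Poll until the watched puppet process exits. Signal 0 is the classic
	// "does this PID still exist?" check; it delivers no actual signal.
	for syscall.Kill(ppid, syscall.Signal(0)) == nil {
		time.Sleep(time.Second)
	}

	time.Sleep(5 * time.Minute) // the "wait a number of minutes" part

	// Kick off the next puppet run.
	out, _ := exec.Command("puppet", "agent", "--test").CombinedOutput()
	os.Stdout.Write(out)
}
```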
So I'm going to get pretty technical, but first a little intro to the tool. If you're not familiar with config management... last one: who's familiar with the existing sort of config management? And who's not? Who's not? OK, so almost zero. The video guys I think may be playing, but yeah.

Basically, in config management you typically have some sort of resource graph, a graph that expresses the dependencies between the resources, and they run, and so on. My tool is a bit different, and I'm going to show you the main three differences. First, it runs the whole resource graph in parallel; this is different from the normal tools. Second, it's event driven; I'll tell you what that means shortly. And third, it runs as a distributed topology. Distributed systems, Paxos and Raft and these sorts of things, are quite possible and quite advantageous today, and I'll show you how that works in a moment.

So, the first thing. This is basically how a resource graph in Puppet or mgmt would look. Can you see this OK on the screen, with the red arrows? The blue blobs each represent a resource: say, a package to be installed, a service to start or stop, a file to set up, and so on. And the black arrows are the dependencies between them. Yes? Come on, be with me so I know. If you're lost, let me know. There'll be a bit of time for questions, but don't be shy if you're really, really lost.

What current tools actually do is look at this graph and use something called a topological sort, which is basically this red arrow. It just says: OK, I'm going to do one, two, three, four, then five (kind of arbitrarily), six, and then seven. But in fact, we could do better. If you look, the left side of the graph and the right side of the graph can run in parallel together, because there are no dependencies between them, right? Right, getting better, warming up. It's a little chilly in South Africa, but we're getting there. And on the left side here, once the first one is finished, you can run these two both in parallel, and then this one waits for them both to finish and then runs. Cool? So this is possible.

Do you want to see a demo? Yes? All right, let's see a demo. OK, where's the demo. So what I'm going to do is show you this example. This is just, say, a package that takes 10 seconds to install (simulated), some service thing that takes 10 seconds to start, and some other command that takes 10 seconds. And this guy over here takes 15 seconds to run. So if we run this, how long should it take if it's running in parallel? 30 seconds, exactly. If it weren't running in parallel, it would take longer.

So we'll go here. Is that big enough for you to see? OK, we're going to time this. I've just compiled a fresh version from git. Oops, turn that down. So this is basically the thing: it runs, it starts up some backend stuff, and it's all contained. And to time it, we're going to run with the converged-timeout option set to 5. What happens is it runs, and once the graph is in a converged state, meaning everything's done, it waits up to five seconds; after five seconds of staying converged, it quits. So the whole run should take about 35 seconds exactly. So let's run this.
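(While that runs, here's a minimal sketch of the idea, in Go since that's what mgmt itself is written in: each resource starts as soon as all of its dependencies have finished, instead of everything running in one topologically sorted sequence. This is illustrative only, not mgmt's engine code; the names and timings just mirror the demo.)

```go
// Run a small resource graph in parallel: a 10s -> 10s -> 10s chain on one
// side, and an independent 15s resource on the other. Total: ~30s, not 45s.
package main

import (
	"fmt"
	"sync"
	"time"
)

type resource struct {
	name  string
	delay time.Duration // simulated CheckApply work
	deps  []*resource   // must all finish before we start
	done  chan struct{} // closed when this resource has finished
}

func run(r *resource, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, d := range r.deps {
		<-d.done // block until each dependency signals completion
	}
	time.Sleep(r.delay) // do the (simulated) work
	fmt.Println(r.name, "done")
	close(r.done) // wake up anyone waiting on us
}

func main() {
	mk := func(name string, d time.Duration, deps ...*resource) *resource {
		return &resource{name, d, deps, make(chan struct{})}
	}
	pkg := mk("pkg", 10*time.Second)
	svc := mk("svc", 10*time.Second, pkg)
	cmd := mk("cmd", 10*time.Second, svc)
	other := mk("other", 15*time.Second) // no deps: runs alongside the chain

	var wg sync.WaitGroup
	for _, r := range []*resource{pkg, svc, cmd, other} {
		wg.Add(1)
		go run(r, &wg) // every resource gets its own goroutine
	}
	wg.Wait() // the graph has converged
}
```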
Boom, so it starts up some stuff, and then right here at the bottom you can see the first one running; it's running its CheckApply. And that 15-second one is over here. So 10 seconds go by. Boom, it finishes right here, and the second thing starts running. Five seconds later, the second parallel one on the right is running, if you can see at the bottom. Five more seconds later, the second thing ends, and that last one, the third thing at the bottom of the graph, starts up. Eight, nine, 10 seconds are up; that finishes. One, two, three, four, five: nothing's happened, so the whole thing ends. And you can see the whole tool ran in about 36 seconds. So there's really very little overhead in the software. I've added a whole bunch of new features, so it used to run in like 35.0 something, but now there's more shit going on. Still very little overhead.

Did anyone completely miss what just happened? Don't be shy, let me know. Do you like this? Is this a good idea? Why didn't we do this before? I don't know. And if you really didn't want to run everything in parallel, there's nothing that says you can't have a limit that says: only run up to so many operations at a time. But yeah, that's the basic thing. You can do some complex graphs like this; if you want to see a demo of this or the other ones at the bottom, we can show them at the end. But I want to move on, okay?

The second aspect is the event-based nature. We have a nice picture for this. Think about how normal systems run: Puppet, say, starts up, runs through the whole graph, checks and applies everything, and 30 minutes later it starts again, right? Goes through the whole thing; 30 minutes later, again. So you're wasting resources over and over. And what happens if something changes on your system, or you want to make a change, during those 30 minutes while it's sleeping? Nothing. You won't know until it wakes up and runs again: oh, now I'm paying attention again.

In mgmt we do something different. We start up, we run through everything, but we also take a watch on each resource we're managing. For example, for files we take an inotify watch; for services we look at the systemd events; and so on. We watch that resource, and the second it changes, boom, we fix that resource. And because we run in parallel, we only need to fix the parts of the graph that need fixing. For packages we use PackageKit events; that's actually one of the reasons we use PackageKit, because it gives us events if someone changes the package state, watching the RPM and dpkg databases. And it's fully compatible with both Debian and Fedora and so on. (There's a sketch of this watch-and-converge loop just below.)

So, do you want to see a demo? All right, let's see a demo; this demo's cool. What I'm going to do is show you a little example. Right now the DSL for this language doesn't exist, so I just have a raw graph to show you. I have three files, /tmp/mgmt/f1, f2, and f3, that I want to create, and each one has contents: "i am f1", "i am f2", "i am f3". Makes sense? I also have one more file here, f4, and it has state absent. So f4 should not be present, but the other three should be, with their defined contents, all right? So we're just going to run the tool.
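(Here's the flavor of that watch loop: a minimal sketch using the fsnotify library, a Go inotify wrapper, and not mgmt's actual file resource. It declares f1's contents and converges the file back the instant anything touches it.)

```go
// Watch /tmp/mgmt and keep f1 converged to its declared contents.
package main

import (
	"log"
	"os"

	"github.com/fsnotify/fsnotify"
)

const (
	path    = "/tmp/mgmt/f1"
	content = "i am f1\n"
)

// checkApply converges the file: if it is absent or has the wrong
// contents, rewrite it; otherwise do nothing.
func checkApply() {
	b, err := os.ReadFile(path)
	if err == nil && string(b) == content {
		return // already converged
	}
	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
		log.Println(err)
		return
	}
	log.Println("converged", path)
}

func main() {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatal(err)
	}
	defer w.Close()
	// Watch the directory, not the file, so we still get an event
	// when someone removes the file outright.
	if err := w.Add("/tmp/mgmt"); err != nil {
		log.Fatal(err)
	}
	checkApply() // converge once at startup
	for {
		select {
		case ev := <-w.Events:
			if ev.Name == path {
				checkApply() // something touched f1; re-check and fix
			}
		case err := <-w.Errors:
			log.Println(err)
		}
	}
}
```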
Okay, so I'm going to run this here on the left, but just so you can see what's happening, I'm going to go here and make this directory, so you can see there's nothing in it. This is where the files are going to appear. So I run this on the left, and boom. We go here, and before I can do anything else, there are three files. Cool? Makes sense? We can cat them and see that the contents are in fact real. But we can even just remove f2, and ls, and boom, it's already back, right? You can remove f2 and it's back. But that's kind of slow to watch, so we can remove f2 and cat f2 in one go, and boom. I mean, have you ever seen the command work this way? So on the right we're just messing with the system, and on the left, I don't know if you noticed, the system is actually noticing and responding. But this is kind of manual, and I know you all love automation and scripting, so I can run it under watch -n0.1, which just runs a command over and over again as fast as possible, and as fast as I run it on the right, it's always noticing and (oops, killed my mic) fixing it on the left.

There's a question, yes. Microphone. The question was: isn't that super scary? If you get it wrong, you can't undo it, because something is listening and just puts it right back. That's fair. Absolutely; so this is a feature, which I think is beneficial for a lot of reasons. This is just one, and some more advanced resources will make use of it too. But if you don't like this instant sort of thing, you can run it in Puppet mode if you want: have it start up, converge, and then run again from cron 30 minutes later. Someone's suggesting a small delay, or a dry-run or test mode before applying. Obviously, those are things you can add, of course; this is just showing you the fastest mode. And this is actually going to be quite important later in the talk.

So, another quick question; everyone suddenly realized this is possible and got all scared. He's saying it's worrying that if you get something wrong, you can't just remove the file to get rid of it; you have to fix it a different way. Instead of fixing it by removing the file, you fix it by changing the YAML file. Yes, yes, obviously. And in fact, when you update your config, the tool notices automatically that you have a new config and applies it right away. It's very event-based, very dynamic. It turns out this is very useful for building elaborate systems, and I'll get into that a bit later. Again, if you don't want this, you're left with the status quo of config management, which was kind of bleak to me: wait 30 minutes, and then you'll have the same situation anyway.

The key thing that's going to make config management useful, as a tool we depend on, is that it has to be very, very safe. If you're writing your config management code in some sort of language that lets you have off-by-one errors and things like that, then you're going to completely blow away your stuff and erase it.
So this tool (again, you're jumping far, far ahead into the presentation), the idea is that it's a very, very safe thing, so that if something's going to go wrong, you know at compile time, okay? If you want, we can talk more later; I've got to move on.

One last little note here. You can also do things like echo "hey DebConf, this is cool" into f2 and cat f2, that sort of thing, and it still works, right? So if you really want to mess with your sysadmins (oh, this mic keeps falling off), there's lots of stuff you can do. And lastly, if you touch f4 and check for f4, same sort of thing, right? It removes that file you added before you get a chance. Cool?

Quick questions. Anyone else? Good? You're itching for a question; go for it. The principle of least surprise would have me make the file immutable in my file system rather than relying on it being rewritten. Yeah. If you wanted to do that, if you wanted to set in the file permissions that it has no write access, you could do that. But that's a choice you have to make when you're configuring that file; your software might want it to be writable or something. So if you wanted to say root has no access to write this file, or something like that, you could. But again, this is a config management tool; we don't force our users to make their files non-writable.

All right, just a quick question. I'm coming here to tell you that this is what I see as next generation config management. But does this feel like a different technology to anyone? Just scream it out. Anyone? Some sort of systemd component? No. I actually see this (if you think about it, and not in the whole scope) as monitoring. Assume you had some pretty advanced resources. This doesn't encompass all of monitoring, but if you had pretty advanced resources, you could consider that all of the state of a particular thing you're managing is built into that same resource. So we'd never have to have this wall between doing the config management, then getting it monitored, then putting it into production. We could put it all in one holistic thing. Just something to think about.

The next part is that third difference, the distributed topology I was talking about. This is just a simple client-server topology: you have a bunch of clients and one server. What software uses this in the config management space, for example? Louder? You got it, you're right, but louder; don't be shy. She said Puppet, right? So you have Puppet, Chef, these sorts of things: a server and a bunch of clients. What's the problem with this topology? Server's gone, everything's gone. Bad. What's another problem? It doesn't scale very well.

Let's look at a different topology. This one looks quite similar, but the arrows point the other way. This is what I call an orchestrator topology; when I say orchestrator, I mean a centralized orchestrator. What technologies use this? MCollective. Oh boy, now you're getting into it. MCollective, sure; someone else said Ansible. But again, what's the problem with this solution? Events? You're right, we don't have events, that's true. But as some other people said, it still doesn't scale. Same as the first problem, right? It's still a single point of failure.
It's still a useful topology for certain things, but in general there are still a lot of problems with it. So we could think about doing something like this, where every peer connects to every other peer. And what's the obvious problem here? It doesn't scale, right? If N becomes 1,000 machines, or even a fairly small number of machines, it's just crazy.

So what we actually do is something like this. You have as many machines as you can afford, I hope, and we temporarily elect certain ones to be the primary machines in the cluster, and on top of this we build a distributed key-value store. We use etcd and the Raft protocol to do this, and I'm going to show you how it works. This lets everyone talk through one of these masters, and if one of them goes away, we can re-elect someone.

So why do we want this? We use this distributed key-value store to let all the members of the cluster pass information back and forth, but in an indirect way. To illustrate, I'll show you a quick example. I'm going to make three hosts, A, B, and C, and each of these hosts is going to create one file on itself as part of its resource graph, and it's going to store another file not on itself: it won't create that file, it will put it into the distributed key-value store. This is approximately what Puppet has as exported resources, except those don't work very well in Puppet. So each one has a file on itself and puts one up; and the other thing it does is look in that store, match a certain pattern, and pull everything that matches that pattern down onto itself. It vaguely looks like this: you each have a file, you put one up, and you pull down everything. So how many files will everyone have after this runs? Six? Not six. You'll get three from the exported ones, plus that initial one from yourself, so four total.

Now let's go through this one host at a time and see how it works. You want to see a demo? All right, let's see a demo. Let me just kill this; I'm going to sit. I don't do a lot of hacking standing up; I know there are some people that are into that. Oops. Okay. So I'm just going to make, in this case, four directories, and I'm watching them so you can see what's happening live. These each represent one host, okay?

Now on the left here, I'm going to start up one of these engines. We don't really need to time it; it's just going to run continuously. We run the file example, and we need the hostname; you can use this hostname flag to simulate multiple machines on the same host. So we run this, and very quickly, boom, you have two files on the first machine. That's because you put one file on yourself and one file up into the database, and right after, you pull everything in the database down onto yourself. Therefore, two files. Makes sense? Yeah? Cool, good.

The second one we're going to do is a little bit bigger. Similar thing. Just to show you how this works (forget the IP addresses, which we only have to specify because everything's on the same machine), all we do is point the second machine at any machine already in the cluster.
So in this case we're pointing it at the first machine, and that's how they cluster together automatically. This one puts one file on itself, puts one file up into the database, and then pulls down everything that's there. So how many files is it going to have? Exactly, three. But because this is all event-based and awesome and magic, the first machine notices there's a new file in the database, and it pulls that down onto itself too. So now the first machine and the second one both have three files, okay? We'll run it and see how fast it happens. Boom, right away it's all done. Cool?

Do you want to do a third machine? Let's do a third machine; I've got all day. Or not all day, but till lunch. Again, same thing: we just point this one at any machine in the cluster. The same thing happens: one file on itself, one pushed up, everything pulled down. How many files is it going to have? Exactly. And the other two? Also four. You guys are pros; can you help me write this software, please? You're all good at this. All right, so we run this, boom, four, and the other ones update right away. Cool?

Just to show you, we'll run the etcdctl command-line tool to query the status of the cluster. You can see there are three members in the cluster: h1, h2, and h3. They're all primary members, what etcd calls masters in the cluster, and they're all running the cluster and have done the work to exchange everything.

So we can start up a fourth machine. Now, I've told the cluster to have, ideally, three servers elected as primaries. So when we add a fourth one (this should be our fourth one), you'll see that quickly you have five files on each one. And if we look at the cluster, you'll see it's still those same three machines that are the masters. But now we can go and kill one. Let's say it catches fire, because you had one of those room heaters in your server room. So we kill this one, it shuts down, and right away this one here notices. I'll just run the command again: you can see that the cluster noticed something was wrong and, using safe consensus algorithms, decided: h4, you're now going to be the new server in this cluster. It took over automatically. Now, you don't just get forcefully chosen to be a server in this cluster; if you don't want to be a server, you don't have to. What actually happens is there's a negotiation protocol where you first volunteer, you're available for volunteering, and so on; and if you want to bow out, you can.

But this is basically how the elastic clustering works. Yes, we have a question, go ahead. Microphone. What happens if one host says delete this file and the other host says create this file? Right, so the key thing is that each host has its own little engine running on itself. A machine can only make choices about itself; you cannot forcefully make another machine do something. What you can do, however, is expose a certain resource, saying: I would like this file to be deleted. Other hosts will look at that data, and they have their own code that says what sorts of things they'd like to pull down. And the machine that pulls it down, and sees "please do this", will then do the action that it consented to by pulling it down.
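(To make that exchange concrete, here's a minimal sketch using the etcd v3 Go client, with the import path as of etcd v3.5: export your own file into a shared prefix, pull down everything under the prefix, and keep watching for new arrivals. The key names and paths are made up; mgmt's real implementation is different.)

```go
// Exported-files exchange over etcd: put one key up, mirror them all down.
package main

import (
	"context"
	"log"
	"os"
	"path/filepath"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	host := os.Args[1] // e.g. "h2"
	cli, err := clientv3.New(clientv3.Config{
		// Any member of the cluster will do; requests get routed.
		Endpoints: []string{"http://127.0.0.1:2379"},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()
	prefix := "/exported/files/"
	dir := filepath.Join("/tmp/mgmt", host)

	// Export our file into the shared keyspace.
	if _, err := cli.Put(ctx, prefix+host, "i am "+host+"\n"); err != nil {
		log.Fatal(err)
	}

	// Pull down everything currently exported by anyone...
	resp, err := cli.Get(ctx, prefix, clientv3.WithPrefix())
	if err != nil {
		log.Fatal(err)
	}
	for _, kv := range resp.Kvs {
		name := filepath.Base(string(kv.Key))
		os.WriteFile(filepath.Join(dir, name), kv.Value, 0644)
	}

	// ...and keep watching the prefix, so files from hosts that join
	// later appear on us right away, event-style.
	for wresp := range cli.Watch(ctx, prefix, clientv3.WithPrefix()) {
		for _, ev := range wresp.Events {
			name := filepath.Base(string(ev.Kv.Key))
			log.Println("new exported file:", name)
			os.WriteFile(filepath.Join(dir, name), ev.Kv.Value, 0644)
		}
	}
}
```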
Does that make sense? I'll re-explain; I got one nod, okay? Each machine only manages the stuff on itself. So if I want to make that machine over there delete a file, I can put up a pattern saying that file should be gone. That machine chooses to pull down a certain pattern of things, and if, when it pulls that down, there's a conflict between what it wants to do and what it found in those rules, then it's a compile error. These are things that wouldn't make sense, right? You can't say create the file and delete the file. So all of those are compile errors, and it just won't run. But it's a great question. It's the same in Puppet, actually, so it's not too novel: if you ensure a file is present and ensure the same file is absent, it's just a compile error. Yeah, any other quick questions? Yeah, go ahead.

Can you actually reliably detect that kind of conflict when you compile a rule set, if it depends on data that isn't there yet at compile time? Yeah, good question. The truth is that the DSL isn't written yet. But the short story is that you cannot detect everything, right? In general, if you have weird data which in some forms causes problems, it will be a compile error, but that's usually indicative of a programming problem. And when this happens, it won't do anything to the machine. So it's a safe error condition, and you can then check your code and fix the problem. There's no guarantee in any config management language that you'll always get what you want; the closest you can get is that when something is wrong, nothing breaks. It just says: hey, there's a problem, please investigate. Correct; so the question was, can you have file names depend on data? Yes, but the point is it's impossible to guarantee that your code will always work with any data you throw at it. That depends on whether you write your code properly, and we're building the language to make it very difficult to write something that leads to incorrect or incomprehensible code. I mean, this is just physics, and there's nothing new we can do to prevent that entirely. If you know of something, let me know. Yeah, another question.

What's the transport layer like? Between the machines, how do they communicate? Great question. This is using etcd. The etcd code is actually merged into this project, which is also written in Golang. We're using etcd v3 now; it just got released, and we'd been using it in beta for a while. That uses gRPC, which runs over HTTP/2, so it turns out it's quite efficient, yeah. You need an additional port open on all the machines, and you need an additional PKI. Yes; the PKI stuff I'm not going to talk about today, because it's really out of scope for what I want to cover. You definitely need a port open. If you really don't want machines communicating with each other, there's actually a push mode we're going to do, kind of similar to Ansible, where you just SSH in, and so on. But fundamentally, if you want a clustered system, you cannot have machines operating in bubbles. So you have to make that choice: do you want things to work across machines, or do you want a bunch of separate, independently managed machines?

The main question is: why didn't you, or have you, looked at reusing the SSH that already exists?
Transporting all of this data through SSH is absolutely going to happen, and it will probably happen through a centralized machine, if you want to do it that way. So if you want to tunnel all the etcd traffic through one machine, that's something that's coming, but it's not available in git right now.

Last questions, anyone? Yes: did you do some scaling tests? Not really. I've done some private, very unscientific scaling tests, and I'll show you some performance stuff on other parts later, but for the etcd side, it's pretty well documented upstream. I think they're saying 1,000 or 10,000 hosts now; it's quite large. And if you really have clusters that are super large and you hit a scaling problem at some point, please let me know. But this isn't even production-ready yet, so if you're interested in large scale and you have the hardware to run this on, ping me. Yeah, another question.

You say that you need etcd version three, blah, blah, blah. If I use etcd for something else, can I use an etcd daemon that I already have in my infrastructure? I think I understand your question, but if I don't answer what you're expecting, let me know. You can do this two ways: you can use an existing etcd cluster if you want, or you can use the etcd cluster which we will build and manage for you. So if you really, really want your own etcd cluster, no problem; you just point everything at that existing etcd cluster and things will work as normal. You won't get the auto-scaling of etcd members or anything like that, but you will have a working cluster. And if you're using an existing cluster, there's a namespace prefix you can set, so that mgmt won't touch anything that doesn't have that same prefix.

Last question... well, we can do more questions at the end, but there are a few more things I want to show you, if you want to see them. Up to you: do you want to see more, or are you fed up? More? More, okay, cool.

So yeah, that was the example I showed you: we kill one machine, and another one takes over right away. Again, all this code is still work in progress, so there are some little issues and things to improve, but if you want to help, please don't be shy.

Here's another quick little feature. This doesn't demo as well, but I think you'll get the idea, and I'll show you if you want. It turns out that when we write manifests and code, we sometimes have to put in these dependencies between things: we want the package to get installed first, then we want to set up the config file, and then we want to start the service. But it turns out we can figure out the dependencies for a lot of these things automatically. If you have the package data, which you do, you know which services are involved, so you can say: if this package provides that service, always make sure the package is installed before you start it. Make sense? So we can actually do this (there's a toy sketch of the idea just below), and I'll show you a quick demo. Let me just kill all this stuff quickly. Oops, kill that. Okay, close all these windows. So, really quickly, I have (oops, my fingers) the autoedges3 example. This is just a simple graph: a DRBD package, a config file, the config file's directory, and the service, all in here. And we're just going to run it; I don't need to time it.
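(A toy sketch of how automatic edges can be computed. The ownedBy lookup stands in for querying the real package database, like dpkg -S or rpm -qf; the names are made up, and mgmt's actual API is different.)

```go
// Automatic edges: wherever a file or service in the graph is known to be
// shipped by a package that is also in the graph, add the dependency.
package main

import "fmt"

type res struct{ kind, name string }

type edge struct{ from, to res }

// ownedBy pretends to ask the package manager which package ships a given
// file or service. Hard-coded here for illustration.
func ownedBy(r res) (string, bool) {
	owners := map[res]string{
		{"service", "drbd"}:        "drbd-utils",
		{"file", "/etc/drbd.conf"}: "drbd-utils",
	}
	pkg, ok := owners[r]
	return pkg, ok
}

// autoEdges scans the graph and adds package -> resource dependencies.
func autoEdges(graph []res) []edge {
	pkgs := map[string]res{}
	for _, r := range graph {
		if r.kind == "package" {
			pkgs[r.name] = r
		}
	}
	var edges []edge
	for _, r := range graph {
		if pkg, ok := ownedBy(r); ok {
			if p, found := pkgs[pkg]; found {
				edges = append(edges, edge{from: p, to: r}) // install first
			}
		}
	}
	return edges
}

func main() {
	graph := []res{
		{"package", "drbd-utils"},
		{"file", "/etc/drbd.conf"},
		{"service", "drbd"},
	}
	for _, e := range autoEdges(graph) {
		fmt.Printf("auto edge: %s[%s] -> %s[%s]\n",
			e.from.kind, e.from.name, e.to.kind, e.to.name)
	}
}
```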
By the way, all the stuff I'm showing is in git master, so you can check it out and run the same examples yourself. So we're just going to run this. My password is "password", if you want to know. This will end on its own, but just to show you: when it builds the graph, the engine is actually quite clever, and it adds the dependencies that it knows make sense. In this case, this package-to-service dependency and this package-to-file dependency, these edges, are added automatically. Someone's asking: what if I don't want these auto edges? Yes, you can disable it if you don't want it; it's an optional feature. Yes, quick question, because I'm almost out of time.

Is it a bug that it didn't add an edge between the config file and starting the service? No, it's not a bug, because we cannot actually detect that. In the future, systemd will probably be able to know that certain config files are related. So if there's a way to detect it logically, tell me and we'll add the patch, okay? Ask me later. You can check which package owns the config file too; I understand your question, but I believe there's no way to detect this particular relationship right now. If you know of a way, talk to me after and we'll add it to the code, or you can add it to the code. Anything we can automatically detect, we can add an edge for automatically; there's an API to do this. So if I've missed something and you know of a way, please let me know.

I want to show you another thing; this is actually more interesting. This is just a silly graph of three packages (we've got powertop, sl, and cowsay, my favorite packages; well, these two are), plus some files and a service. When you run Puppet or some existing tool on this graph, what happens with the package installation? Anyone know? Has anyone done this? If you watch your configs running, it turns out it runs apt-get basically one, two, three different times. It goes through startup, overhead, checking the cache, and so on, three separate times. Same thing with yum and so on. This is hugely wasteful.

So we have this feature called automatic grouping. What it does is analyze the graph, look at the dependencies, and in this case say: ah, I can redraw this graph with these three bubbles overlapped on top of each other. That way you can complete all of the package installation in a single step, without waiting over and over again. (There's a toy sketch of this after the demo.) You want to see a demo? Let's do a demo.

Oops; so I'm just going to sit here. Whoops, where's my terminal? Someone got scared. Don't worry, I won't keep you from dinner. He has to go to the bathroom; I have to go too. Okay, so we're going to remove powertop, sl, and cowsay. Password. All right, good. Oops. Did I do that wrong? Ah, I have a little typo. Remove cowsay, sl, powertop. Ah, okay, these packages are not installed; is that what I typed wrong? Yeah, they're not installed; it's just unhappy. All right, good. So what we're going to do is run this software, this great software which I wrote for you. The packages example... ah, sorry, the grouped one. Okay, so we're going to run this. Let's hope the internet works. It starts up, and boom: right away, in a single transaction, it's installing these three packages.
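(That single transaction is the whole point. Here's a toy sketch of the grouping idea; mgmt's real version works on the graph structure and talks to PackageKit, so treat this as illustration only.)

```go
// Automatic grouping, reduced to its essence: collect compatible package
// resources into one set, so the package manager runs a single transaction
// instead of paying its startup and cache-checking overhead once per package.
package main

import "fmt"

type res struct{ kind, name string }

func groupPackages(graph []res) (pkgs []string, rest []res) {
	for _, r := range graph {
		if r.kind == "package" {
			pkgs = append(pkgs, r.name) // merge into one transaction
		} else {
			rest = append(rest, r) // everything else is untouched
		}
	}
	return
}

func main() {
	graph := []res{
		{"package", "powertop"},
		{"package", "sl"},
		{"package", "cowsay"},
		{"file", "/tmp/mgmt/f1"},
	}
	pkgs, rest := groupPackages(graph)
	// One install call for all three, e.g. "pkcon install powertop sl cowsay",
	// rather than three separate package manager runs.
	fmt.Println("single transaction:", pkgs)
	fmt.Println("everything else:", rest)
}
```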
So the internet is being quite efficient; quite fast, modulo the internet. And a few seconds later, I think this should be done. Hey, cowsay works, right? And do you want to see a cowsay trick? You can have cowsay say "hey DebConf" and do crazy stuff like that. But no more fun time; we're going to remove it again: pkcon remove cowsay, I'm sorry. We run this, and on the right we've removed it; on the left, if you notice, it's already saying nope, and it's reinstalling it. So cowsay is back, right? Everything is good. Same thing: event-based, noticing that something's wrong and putting it back.

You can actually do this auto-grouping for other resources too. Packages are just the most obvious example, where there are obvious performance benefits, but there's a whole bunch of resources we haven't even written yet that will be able to use this API.

This next bit is just to poke fun at other software. Shorter is better; shorter means less time. If you look from the left to the right, it's basically installing one package versus three packages, and the more packages we install, the bigger the difference between these bars gets. The big red bars are Puppet, and the really small bars are this tool, and the package manager just running raw. If you had five packages, the gap gets even bigger, and you waste more and more time if you happen to be doing packages. There are similar benefits for other sorts of things. Bigger is worse; always good to remember that.

Here's a cool little thing. The first time I gave this talk, I pointed out that someone could write a compiler to take existing Puppet code and run it on this engine, because if you had to rewrite all your Puppet code or Chef code or whatever, this would suck. And some brave soul, who is now a good friend of mine, said: this is awesome, I want to write this, because you need to know Puppet internals, and I don't know Puppet internals. And he actually wrote it. And it's beautiful. You can try this out; again, everything here is all free software. It's not finished, it's not perfect, but it's pretty close, and you can run your existing Puppet code directly on this engine, so you get all the benefits of parallelism and so on. Again, not as many resources as we'd like exist in mgmt yet, but it's a good start, and all the plumbing is there to add new resources. So if you want to get involved, try this out. If you were to take your existing Puppet code, just run it in production for the first time, and expect it to work exactly the same way, you're a little bit crazy; but it does exist, and you should check it out.

I know there are some questions; I'm just going to finish up, and then you can ask. So, what's coming in the future? Again, there's no DSL right now, and I need some help working on it. I have some designs, so if you're really into languages, safety, and maybe declarative functional reactive programming, please give me a shout. That's work in progress. We need to write more resources, and we can write powerful resources. We can write a resource for a virtual machine and have a declaratively managed virtual machine on your system, right? Think about that. We can have a timer resource; actually, that's about to get merged. You can have a network resource that actually makes sense.
And because these things are all distributed systems, you could change the network without breaking your Puppet run; if anyone's ever done that, you know what I mean. Lots of things: a push mode, and so on. And this is really a community tool, so this is about you. It's not a product; it's just a project that I'm running. So how can you help? You need to do work. You can use this, test it, patch it, share it, document it, star it, blog it, tweet it, discuss it, hack on it. You're hackers; you know how to do the thing. Hack this, right? This is code for you.

This is my one marketing slide, because Red Hat gave me some money to come here, which is really nice of them. So buy their shit if you'd like to. And again, this is an upstream project, a community thing; it's not a product, it's just a project, so please get involved.

Let's recap. Have you seen this guy? He recaps at the end of his slides. Wow. Yeah. Here are some friendly links for you. There's The Technical Blog of James, which I know you all read and love; if you want to put me on Planet Debian, I would love that. There's the project on GitHub: it's purpleidea/mgmt, so you can find it. You can search the blog; there are now four articles about this. And there are also links at the bottom of the GitHub page to at least one other recorded talk, the Puppet work, and so on. And on the internet (Twitter, IRC, GitHub, Gmail, redhat.com, and so on) I'm purpleidea, so you can ping me on Twitter and be like: this is awesome, I love this.

Just a fun little slide for people who like fire and magic. If you want to harass the Debian committee (oh yes, take his pills; his alarm went off), please email in a review. If you liked this talk or this session, email me or email the Debian people and let them know, so I can get feedback. And if you want to hang out on IRC, we're about 50 people now in #mgmtconfig. If you have questions, I'm happy to take questions. And thank you very much for listening.

Questions? Is this on? Yeah. I have one more: how would you depend on machine state, like Puppet facts, essentially? So the question is: how would you depend on machine state, like facts? Facts will come with the language. The language isn't ready, so there's no concept of facts right now, and I don't want to get into the design. But I do think it makes sense, and if you really want to hack on this, ping me; we'll talk about it, and you can have a say in how it gets built.

Any other questions? Don't be shy. Yeah, gentleman over here in the center. How do you handle failure? Could you say that again, louder? How do you handle failure? That's a really good question. To be completely honest, the advanced error handling code has not been written yet; I plan to write that probably in the next week or so. The way you handle failure right now is that it just keeps retrying. But the better code, which I will write, will have metaparameters to say: allow up to this many failures, no more than this many per second, and stuff like that. So failures happen, and you'll be able to control what happens when they do, and how many you allow before you declare a permanent failure. So yeah, great question.

Any other questions? Yes, gentleman in the back. Can it update itself? Great question.
There's only one version right now, which is git master. So, I mean, this is pre-1.0, so there's no real API stability or anything like that. There isn't really much that should be a major problem in updating itself: you just get a new package, and things keep going. What would be a problem is if the DSL were to change, and so on; but again, pre-1.0, I wouldn't really worry about this. Or, as the question continues, what if there's a problem with the update process, where some of the daemons keep working and some of the others stop working, or they conflict in some way? That's a good question. It comes down to, restating it a different way: what do you do when you have a cluster running different versions of the software at the same time? And even more specifically: what do you do when you want to roll out a new configuration across your cluster, and how do you handle that? That code also doesn't exist yet. There will probably be something called an execution plan, or so on. I'm not sure what the best way will be in practice, but at least initially, while we're hacking on this, there will probably be more than one way to do it. One possibility is that you push a new config into your cluster, and it waits for a synchronization point, where everyone has converged and stopped running the current config, before you switch over. Or it could be that for certain resources you don't care, and you just want them to run right away. So whether this is a meta-param or some sort of global state flag is really still to be decided. But if you want to help with this, please ping me or come to the channel, and we'll have a good talk about it.

Got time for one more question or two? Don't be shy. Who's shy? Who are those shy people? We have a question from the gentleman; one question here, quick. You're shy? Yeah. Oh, you're shy, okay. Question? You're shy also. Okay, so we have people advertising that they're shy, but they don't have questions. So I guess that's it... oh, you have a question. Sure, I wasn't sure, sorry.

So, many times you mentioned Puppet, but how about Ansible? It's still evolving, it still has a lot of capabilities; in fact, I think they now introduced a declarative thing for virtual machines, this virt install. So actually, I don't know about the latest developments in Ansible, but I agree with you: it's a great tool and it's very powerful, and that's what you should use today if you're happy with that tool. In fact, Red Hat bought Ansible, so you should definitely use it. But my tool is not something that's ready for production. This is a future technology, trying to solve some very, very difficult problems which might turn out to need solutions. So would I run this in production in the next year? Probably not. But we're trying to build it and see, and we think it's kind of interesting, and maybe one day it will be something you'll want to migrate to. For now, use the stack you're comfortable with: Puppet, Ansible, bash scripts, it's up to you.

So yeah, I think that's our time. I'll be around for three or four more days of the conference; I leave on the ninth. So if you have questions, if you want to get your patch merged, if you want to do a little hacking session to learn how to write a brand new resource, just ping me, don't be shy. And I have some random stickers from my travels in a bag, if you want some free stickers.
Not mgmt stickers, unfortunately, but yeah. Thank you very much.