All right, it's 11 o'clock, so let's go ahead and get started. Hello, I hope you had a good free day and enjoyed Montreal. My name is Josh Powers. I work at Canonical on the Ubuntu server team, and as part of the Ubuntu server team, we maintain cloud-init. So what I'm here to talk to you about today is, of course, cloud-init.
Before we begin, I'd like a show of hands. If you've just heard of cloud-init, or maybe you've never even heard of it before, and you've never used it or don't really know how to use it, would you throw up a hand? Okay, all right, good. If you know something about cloud-init, like you've written a cloud-config and you know what user data is, raise your hand. Okay. And then if you've contributed to cloud-init? All right, one person, two people, there we go. So my presentation will hopefully be helpful for everybody in the room. In the first half, I'm going to talk about what cloud-init is, how it works, and how you use it. In the second part, I'll talk more about recent developments in cloud-init and things that are coming in the future.
So cloud-init: cloud instance initialization software. You can see where the name comes from. The point of cloud-init is to attempt to automate mastering an image without rebooting. Think about all the times you've set up a system: you launch it, you apt-get update, apt-get upgrade, you install a bunch of packages, maybe you start mounting things and have to reboot to get them to mount properly, or you set up networking, get it all messed up, and have to reboot again. The point of cloud-init is to try to do all of that in one shot. Customizing an image, or as our own marketing material would say, giving it a personality, and doing it all in one shot. So what does this look like inside of a cloud?
Before that, I just wanted to point out that even Debian's own cloud images use cloud-init. It's used all over the place, and we'll see more examples of this in a second.
So what does this look like in terms of the cloud? What is the cloud? It's somebody else's computer, but what's actually going on when you launch an instance? You, the happy user, generally choose your image, or your AMI if you're on AWS, and you have the option to provide user data, if you've ever seen that before. Those two things are your inputs into creating an instance. Every instance in the major clouds runs cloud-init, and it eventually takes all four of these things and produces your instance for you. Images we know lots about: that's your operating system, whether it's Stretch or Jessie, with a set of predefined packages, hopefully small so it's fast and easy to move around. User data is what you give it; it's your input into customizing that image. Metadata and vendor data come from the cloud host, and we'll talk about those a little more in a second.
The beauty of cloud-init is that it's used across various cloud hosts and across all sorts of cloud images. So no matter where you're launching images, whether it's AWS, Azure, or any other cloud, or even something like LXD, which I'll demonstrate in a bit, you have the same experience with cloud-init, and you can use the same cloud-config across those hosts or even across OSes. This is a list of a number of the places where cloud-init is used; not all of them, obviously. Like I said, all the major clouds use it. LXD uses it, MAAS uses it. It's heavily used across all the cloud platforms, so it's kind of a de facto standard for getting a cloud instance up and running.
So user data, let's talk about that. It comes from you, and I think I have, here we go, an example of one type of user data. As I said, it comes from you. It's your way of customizing an image. User data can come in many different formats. Probably the simplest is a shell script: something that starts with a shebang, and you customize your instance however you want. You can pass it a whole bash script that says run these commands, mount these things, and it will do it all for you.
Another popular way of doing this is by passing in cloud-config. Cloud-init has these different modules, and this is just YAML. It's standard YAML syntax saying, I want to set my time zone to Toronto, since we're on the east coast right now, and I want the hostname to be debconf. Basically, it goes through these keys and values and, based on the keys, does the operation for you. It knows how to do that. On a Debian host, it knows what command to run to set the time zone to America/Toronto, or how to set the hostname, and it does it for you rather than you having to figure out all the commands yourself.
Going through this example: I mentioned time zone and hostname. We have the ability to import SSH keys. In this case, we're importing my SSH key directly from GitHub. We can do the same thing with Launchpad, or you can provide your public key right in this YAML and it will import it for you. Here I'm setting up a custom apt repository pointing at a PPA, I'm installing a couple of packages, and at the bottom I'm specifying that I want to do an update and an upgrade of the system as well. So I can have a fully running system, like I said, without doing any reboot. This is a very simple example.
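
To make the cloud-config described above concrete, here is a minimal sketch in Python that assembles the same kind of document. The key names (`timezone`, `hostname`, `ssh_import_id`, `apt`, `packages`, `package_update`, `package_upgrade`) are my reading of the cloud-init documentation, not a copy of the slide, and the GitHub account and PPA are hypothetical placeholders. It serializes the settings as JSON on the assumption that cloud-init's YAML parser accepts it, since JSON is a subset of YAML; hand-written YAML would do just as well.

```python
import json


def render_cloud_config(settings: dict) -> str:
    """Return user data text: the '#cloud-config' header, then the settings."""
    # The first line must be '#cloud-config' so the text is treated as
    # cloud-config rather than, say, a shell script.
    return "#cloud-config\n" + json.dumps(settings, indent=2) + "\n"


settings = {
    "timezone": "America/Toronto",
    "hostname": "debconf",
    # Hypothetical GitHub account; 'lp:<name>' would import from Launchpad.
    "ssh_import_id": ["gh:example-user"],
    # Hypothetical PPA, standing in for the custom apt repository.
    "apt": {"sources": {"example-ppa": {"source": "ppa:example-team/example"}}},
    "packages": ["htop", "tree"],
    "package_update": True,
    "package_upgrade": True,
}

user_data = render_cloud_config(settings)
print(user_data)
```

The printed text is what you would hand over as user data, whether through a cloud's launch dialog or one of the local methods shown later in the talk.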
You can even go into, as I alluded to earlier, setting up your disks: mounting them, formatting them, getting them set up in the specific manner you want. So if you're on Azure and you're adding multiple disks, and you want one running ext4 and one running ext3, and you want them partitioned a certain way, cloud-config can handle that sort of detail. The same goes for networking: it can set up bridges, VLANs, et cetera.
I quickly skipped over metadata and vendor data, but this is an example of metadata taken from AWS a while back. Effectively, metadata is data that comes from the cloud host itself. The cloud provider is giving the instance information about what it is. So you see here there's an AMI ID, the ID of the image itself, and a hostname for it. These are the general things you would see from an AWS instance if you're familiar with using it: the size, the type, the address, what zone we're putting it in, and things like that. That's metadata, again, from the cloud provider.
Then, looping back around, there's vendor data. Vendor data is exactly the same thing as user data; it's just coming from the vendor itself. So then you start to wonder about precedence. Who wins? User data always wins. So why would you want to use vendor data? Well, if a vendor wants to make sure that everyone's NTP is set up, rather than leaving the users to deal with it, the vendor can do that. Or the vendor can say, hey, we know our data center is in the Eastern time zone, so always set the time zone to Eastern. Or maybe the users already have it set in their preferences; you can grab that preference and stick it in there for them, so they don't have to worry about it in the first place. MAAS, Metal as a Service, uses this, as I was alluding to with NTP, and uses it to set up the time zone, as an example of how vendor data can be used.
So we're going to try a demo. What I have here is some user data. It's an abbreviation of what I just showed you on the previous slide. Actually, I think it's almost exactly the same, except that I'm setting the password in this case: setting the password for the default user and making sure it doesn't expire.
Maybe you don't want to try cloud-init on an Amazon instance because you're out of credits and you don't want to pay any money. You can actually do this with KVM. What I'm going to demonstrate here uses the Debian Stretch cloud image. There is a command, where am I going, called cloud-localds. Basically, you generate a local data source for cloud-init to look at, and the data source is what gives it the user data and the metadata. You can see the command right there. I showed you the user data, and for the metadata I just have a simple instance ID, giving it a fake instance ID value. If I generate this disk, which will be used as the data source, that's all it prints out. It created a seed image file system. And then to boot it, oops, I effectively just run qemu-system with the image attached to it and the seed image also attached to it. So we can get that booting.
All right, Stretch is booting up. We'll start to see some output from cloud-init scream by here. And now you see it's already starting to do an apt update. Luckily this is a lot faster than the hotel wifi, so it's not going to take as long. But again, it has read that user data, and the user data said to do an update. Now it's going to start doing an upgrade, because there were some packages in this image that needed to be upgraded. It will grab those and then finish the rest of the user data. While this is, oh, I'm going to let this run. So I have one console.
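
The two steps of the KVM demo, generating a seed image and booting with it attached, can be sketched as command lines built in Python. This is a sketch under assumptions, not the exact commands from the slide: the file names, the instance ID, the memory size, and the choice of `qemu-system-x86_64` with `-nographic` are all mine. The code only builds and prints the argument lists; it does not run anything.

```python
import shlex

# Hypothetical metadata file content: a fake instance ID, as in the demo.
META_DATA = "instance-id: demo-instance-01\n"


def seed_command(seed_path, user_data_path, meta_data_path):
    """Arguments for cloud-localds: output image, user data, then metadata."""
    return ["cloud-localds", seed_path, user_data_path, meta_data_path]


def boot_command(image_path, seed_path, memory_mb=1024):
    """Arguments for QEMU with the cloud image and the seed image attached."""
    return [
        "qemu-system-x86_64",
        "-m", str(memory_mb),
        "-nographic",  # serial console in the terminal (assumed preference)
        "-drive", f"file={image_path},format=qcow2",
        "-drive", f"file={seed_path},format=raw",
    ]


commands = [
    seed_command("seed.img", "user-data.yaml", "meta-data.yaml"),
    boot_command("debian-stretch.qcow2", "seed.img"),
]
for argv in commands:
    print(shlex.join(argv))
```

To actually run them, each list could be passed to `subprocess.run(argv, check=True)` after writing `META_DATA` and the user data to the two input files.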
If you remember from the user data, I set the password, and it accepted the password. You can see the hostname on the command line has been updated. From date, we see that we're in the Eastern time zone. Let it keep spitting stuff out. While it's doing that: /var/run/cloud-init is where these files are kept. And now you can see it installing htop and tree. It just generated SSH keys, and you can see the last line is cloud-init finished. It took 83 seconds to do an update and an upgrade, install some packages, and be complete.
What I wanted to show you, sorry, is that in /var/run/cloud-init there are two important files. One is status.json. Whoops. It's just a JSON file, and it lists the current status of cloud-init. So if you have a cloud-init run that takes many minutes, where you're mounting lots of images, formatting them, and installing lots of packages, this is a way of checking where cloud-init is right now. There are four different stages, and I'll talk about what each of them means in a second. For init-local and the others, you can see when they start and finish, and at the bottom it says what stage you're on at the current time. In a later demo, I'll have this printing out repeatedly so you can watch it go through the different stages.
The second important file is result.json. It gets written at the very end, let me do this, once cloud-init is completely done, and any errors you ran across are recorded in this file. In this case, rather than being an empty array, errors actually contains something: ssh-import-id failed, and that's because the Debian image doesn't have ssh-import-id by default, so that command failed. So this is one place to come look for things. Another important area is /var/log. There are two cloud-init files there. The first is cloud-init-output.log.
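
The status and result files described above can be summarized with a few lines of Python. This is a sketch that assumes a particular JSON layout: a top-level `v1` object holding one entry per stage (each with `start`, `finished`, and `errors`) plus a `stage` field, and a result file with `v1.errors`. The sample data and the error text are made up for illustration; on a real system you would load the two files from /var/run/cloud-init instead.

```python
import json
from typing import Optional

# The four stages, in boot order.
STAGES = ("init-local", "init", "modules-config", "modules-final")


def summarize(status: dict, result: Optional[dict] = None) -> dict:
    """Report the current stage, finished stages, and any recorded errors."""
    v1 = status.get("v1", {})
    finished = [s for s in STAGES if (v1.get(s) or {}).get("finished") is not None]
    errors = []
    for stage in STAGES:
        errors.extend((v1.get(stage) or {}).get("errors", []))
    if result is not None:
        # The result file only appears once cloud-init is completely done.
        for err in result.get("v1", {}).get("errors", []):
            if err not in errors:
                errors.append(err)
    return {
        "stage": v1.get("stage"),
        "finished": finished,
        "errors": errors,
        "done": result is not None,
    }


# Made-up sample in the assumed layout, taken mid-boot.
sample_status = json.loads("""
{"v1": {"stage": "modules-config",
        "init-local": {"start": 1.0, "finished": 2.0, "errors": []},
        "init": {"start": 2.0, "finished": 5.0, "errors": []},
        "modules-config": {"start": 5.0, "finished": null, "errors": []},
        "modules-final": {"start": null, "finished": null, "errors": []}}}
""")
sample_result = {"v1": {"datasource": None, "errors": ["ssh-import-id failed"]}}

print(summarize(sample_status))
print(summarize(sample_status, sample_result))
```

Calling `summarize` in a loop every few seconds gives the same kind of view as watching the status file by hand.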
It just prints the generic output that cloud-init produces. You'll see all the SSH keys that were generated, and any errors, as well as all the stuff that was getting printed in the terminal: all the apt update and apt upgrade output. And here we see it running the ssh-import-id module, and it failed. But it doesn't give a whole lot of detail as far as debugging information goes, just what ran and when. The other file, cloud-init.log, is the actual debugging log. In here is where you'll find each module printing when it starts, when it ends, and its results, and if there are any stack traces, this is where they will be. So that's more for your awareness that those two files exist. We now have a Debian instance. It's been customized with my user data, and I didn't have to pay for anybody's cloud to try this out. Oops.
The second demo I wanted to show is with LXD. Did you see the LXD presentation by Stefan yesterday? LXD is a really cool way to quickly get system containers up and running, and it works with cloud-init. Again, if you want things to go quickly and be able to iterate on a cloud-config or some user data without going into the cloud, you can do that with LXD. I have a little setup script here. I'm going to use an Ubuntu image real quick, because it has cloud-init already installed and ready to go. What it basically just did was take the user data I had there and inject it into the configuration for this container. There's an LXD config option, user.user-data, and again, you can dump a shell script or a cloud-config in there. If cloud-init is installed, it will recognize that, oh, I'm running on LXD, I'm using the LXD data source, let me pull the user data from there and run with that. And just to show you the command, it's really simple.
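
Injecting user data into a container's configuration can be sketched as follows. It builds the arguments for `lxc config set` with the `user.user-data` key; the container name is hypothetical, and the format check covers only the two kinds of user data mentioned in this talk (a shell script or a cloud-config), which is an assumption of the sketch and not a limit of cloud-init.

```python
def is_supported_user_data(text: str) -> bool:
    """True if the text starts with a shebang or the '#cloud-config' header."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    return first_line.startswith("#!") or first_line.strip() == "#cloud-config"


def set_user_data_command(container: str, user_data: str) -> list:
    """Arguments for storing user data under the container's user.user-data key."""
    if not is_supported_user_data(user_data):
        raise ValueError("expected a shell script or a #cloud-config document")
    return ["lxc", "config", "set", container, "user.user-data", user_data]


# 'demo-container' is a hypothetical container name.
argv = set_user_data_command("demo-container", "#cloud-config\nhostname: debconf\n")
print(argv[:5])
```

The list could then be passed to `subprocess.run(argv, check=True)` before starting the container, so that cloud-init finds the user data on first boot.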
It's lxc config set, the name of your container, user.user-data, and then you just pass in that user data. If you start it up, this starts the container, and, if you remember that JSON status file, okay, now we're in the modules-config stage. We've already passed init-local. Every few seconds, this should update as the status changes, so we can watch how it goes through the different stages.
So what are these different stages? Init-local is the first one. As soon as / comes up read-write, it blocks the boot. What we're doing there is setting up any networking. If you pass in any networking configuration, it will try to set that up, and like I said, that could be VLANs, bridges, or just a standard eth0 with DHCP, or whatever the interface is called. Once networking is up, the second phase kicks in. There we check whether there are any network data sources we need to get to, which is particularly important if you're running on the actual clouds, and we set up any block devices or file systems. So very early in boot, we at least have networking going. We haven't really configured any other part of the system, but again, we're trying to get the system set up without having to reboot over and over and over again: one boot that configures networking, storage, and all the other configuration items you're interested in.
Modules-config and modules-final are the last two stages, and you saw that modules-config is actually where a lot of the time was spent. That's because there was an update going on, and that's where apt was going out to the network. It's where most of the modules are actually run. And then the last stage, modules-final, is kind of like an rc.local. It's the end of the line for cloud-init.
Any final things that need to happen occur there. And in this case, you see it also printed out the result JSON. There are no errors, so we have just an empty array. So our LXD container is set up, everything's working, and cloud-init did its job.
So again, these are the commands I used in the first example: create a seed data source and boot it using KVM. It's a cool way of testing things out. And this was the example I used with LXD containers. And I only briefly went over this: it's the boot sequence for cloud-init, starting with init-local for networking, which blocks the boot; init, doing storage; modules-config, where most of the modules run; and then modules-final, for any final things that need to occur. I point this out because the words in brackets are what the log will show. So if you're going through a log and you see modules-config and wonder what that is, you know you're at that third service.
All right, another demo. One of the things people have concerns about, obviously, is boot time. Nobody wants to sit there and wait. With advancements in hardware and very, very fast storage now, the last thing you want is to be sitting there waiting for your system to boot. Well, if you pass a very large cloud-config to cloud-init, it's interesting to see where it's spending its time. So one of the advancements over the last year, which one of the cloud-init developers, Ryan Harper, worked on, is cloud-init analyze. The goal was to show something very similar to systemd-analyze or systemd-analyze blame, to get an idea of where cloud-init is spending its time. Which modules is it spending its time in during boot? This is an example of its output. If you've ever seen systemd-analyze blame, it should look very similar to that.
Here, we're spending almost 30 seconds in apt configuration. Well, okay, that can make sense. One of the nice things is that we can come through here and see, for example, that locale generation was taking a long time. After a while it's like, maybe we shouldn't be generating all these locales. Maybe we only need to generate one unless the user says, please generate this one. So we're trying to shorten those times.
I have another demo. Let's see, I think I have one up. I do. I have a container set up already. It's been booted a couple of times, and I have the cloud-init analyze branch here. This is not where I wanted to be. Cloud-init analyze blame. All right, here's an example using similar user data to last time: 50 seconds to run apt, and package update and upgrade took 23 seconds. You can see everything else took less than a second, but it does add up after a while. So we're getting this data and are able to see it and record it for each boot record. This is the first boot, but then we come down here and see the second boot record. This time cloud-init took far less time because everything has already been run. We already have all the packages, and everything's already been updated. Excuse me, I guess I ran it three times.
All right, another command. This is blame, and then there's, I think, show. Where blame tells you which modules take the longest for each boot record, show gives you more of a timeline view. Again, I mentioned the four stages. Init-local right away found, oh, I'm getting my data from the NoCloud data source, so I don't have to check any other data sources. Move on to the next stage. No networking configuration required there, so just move on. Init-network, right?
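
The difference between the blame view and the show view can be illustrated with a small Python sketch. It is not how cloud-init analyze is implemented; it only shows the idea of sorting the same timed events two ways. The event names and timings below are made up.

```python
from typing import NamedTuple


class Event(NamedTuple):
    name: str
    start: float  # seconds since boot began
    end: float


def blame(events):
    """Longest-running events first, as (duration, name) pairs."""
    return sorted(((e.end - e.start, e.name) for e in events), reverse=True)


def show(events):
    """The same events in the order they started: a timeline view."""
    return sorted(events, key=lambda e: e.start)


# Made-up events for a single boot record.
events = [
    Event("modules-final/config-package-update-upgrade", 17.0, 40.0),
    Event("init-local/search-NoCloud", 0.0, 0.5),
    Event("modules-config/config-apt-configure", 2.0, 17.0),
]

for duration, name in blame(events):
    print(f"{duration:8.2f}s  {name}")
for event in show(events):
    print(f"{event.start:8.2f}s  start {event.name}")
```

Blame answers "what was slow", while show answers "what happened when", which is why the timeline view is the one that exposes ordering problems such as a clock change partway through boot.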
It's applying some of the initial data from the data source, so user data and vendor data, and it quickly goes through this. Again, this is a timeline as you go through the different modules. Here are all the config modules that ran, and here are all the final modules that ran. And then, okay, it set the time, and this is a bug right now that we're working on: because it set the time, it skewed everything, so we need to fix that. Boot record two is a similar thing. The purpose of this is to gather data on the different clouds and different data sources, and to see the areas where cloud-init can make improvements to speed up your boot time even further, so that we're not slowing you down. And you can see that by the third boot, it takes less than a second for cloud-init to run, because everything else has already been run. So this is blame, again, showing top to bottom. This is show.
The next improvement we've made over the last year is cloud-id. We have a number of data sources. You've heard me use this term a couple of times. A data source is cloud-init's method of determining where it's being run. I mentioned all those clouds out there that are using cloud-init. Well, each of those clouds runs slightly differently. With Azure you might get an ephemeral disk, and with AWS you might not. The cloud providers have worked with the cloud-init project to create a data source for each of them. So when you boot an instance that's running cloud-init, it has to figure out, where am I running? Should I be using the EC2 data source, should I be looking for NoCloud, or am I running on LXD? It needs to figure that out, and that can take time. Especially if you're not running on a cloud at all, say you're running on KVM, it might time out waiting for some of these things. So cloud-id gives us a way to positively identify what cloud we're running on.
And the result of that is faster boot time. We're not going out trying to sniff network resources that aren't actually there or shouldn't be there. So it's just a better overall user experience: quickly getting to boot, quickly finding out where I am, and moving on from there.
The other big area we've been working on over the last year is the integration test framework. Again, we're supported across all sorts of clouds. We run on Debian, Ubuntu, Red Hat, SUSE. Because cloud-init has been adopted by so many different data sources and operating systems, the test matrix is potentially huge. So first off, with merge requests, we're now doing smoke tests, and we do nightly test runs every day using a daily image and master. We're able to find stuff more readily, and earlier and earlier.
With the integration test framework, we've given ourselves the ability to go through these scenarios. These are three very common ways we have to deal with cloud-init. First is smoke tests and nightly test runs, which I alluded to. Running the daily image with the latest version of cloud-init is important for smoke testing and gatekeeping, but also for development. If you have a developer sitting there hacking on cloud-init who wants to be able to run something, the integration test framework gives us an opportunity for that. And it gives us the same opportunity for stable release updates and debugging.
What we've been using as a back end is LXD. It's fast, and it gives us broad coverage of the cloud-config modules right away, so it was a really good way to start. We're working on KVM right now as a way of testing network and storage configurations, to cover the remaining cloud-config modules. So what we do, based on the back end, is start by downloading an image.
With LXD, that's just downloading whatever that image is; with KVM, we download a cloud image. We then customize it. At the very least, we find out what the cloud-init version in the image is. But we can also inject a new version of cloud-init, or build a version of cloud-init from, say, the developer's tree and inject that. And then we take a snapshot. In LXD's case, it's literally a snapshot, so things are a little faster. With KVM, it's basically just copying the image and saying we're going to use this copy.
Then we go into this boot-and-collect loop, where we boot an image with cloud-init. We wait for the result file to come through and say everything's done. We check to make sure there are no errors, and we start collecting things. We say, oh, we expected the hostname to change, we expected this SSH key to be there. So we collect the output of all sorts of commands, usually shell commands, for the things we expected to change and the things we expected to be there. We do that for each of the tests: boot a configuration, collect the results, and go around the boot-and-collect loop again. Once all that is done and we've got all these results, we go through and verify them. These are just Python unit tests. We grab the output and make sure it matches what we expect. So that's how the integration test framework operates, and it has let us feel a lot better about where things are with cloud-init.
In terms of merge requests and releases: previously, there was no CI officially done on merge requests. It was all done by hand. So our goal has been to automate, better and faster.
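
The collect-then-verify pattern described above can be sketched in a few lines of Python. This is not the project's actual test framework: `fake_run` is a hypothetical stand-in for running a shell command inside the booted instance, and the commands and expected values are taken from the demo user data only for illustration.

```python
import unittest


def collect(run, commands):
    """Run each named command in the instance and keep its output."""
    return {name: run(command) for name, command in commands.items()}


# What to gather after boot: a label and the shell command that produces it.
COMMANDS = {"hostname": "hostname", "timezone": "cat /etc/timezone"}


def fake_run(command):
    """Hypothetical stand-in for executing a command in a booted instance."""
    canned = {"hostname": "debconf\n", "cat /etc/timezone": "America/Toronto\n"}
    return canned[command]


collected = collect(fake_run, COMMANDS)


class TestCollected(unittest.TestCase):
    """Verification happens later, against the collected output only."""

    def test_hostname_changed(self):
        self.assertEqual(collected["hostname"].strip(), "debconf")

    def test_timezone_set(self):
        self.assertEqual(collected["timezone"].strip(), "America/Toronto")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCollected)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Separating collection from verification means the slow part, booting an instance, happens once per configuration, while the checks run quickly against saved output and can be rerun without booting again.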
So anytime we submit a merge request to the cloud-init project, it runs all the unit tests via tox, which includes some linting. It will actually go out and build the package to make sure you haven't broken the build, and then run some of the integration tests I just mentioned to make sure nothing major has happened. This has improved our quality. The old way, oh shoot, I broke things last night and didn't realize it, was reactionary, and we don't have to deal with that anymore. We catch things earlier, which is better.
As a team, we've also tried to get a lot better at responding to merge requests. The graph is the active reviews, and the yellow line is the 28-day moving average. It's going in the right direction. A lot of things were just sitting there waiting for people to review, and admittedly there were times when we dropped the ball. We're trying to get faster reviews and faster feedback to people over time. The same goes for releases. The 0.7.8 release last year was the first release in a couple of years, I think. We had 0.7.9 come quickly after, and we're working on 0.7.10 right now. So we're really trying to do better at participating with the community, getting them feedback faster, and getting things in more quickly.
If you're interested in cloud-init and you want to get involved, or you have more questions, we're on Freenode in #cloud-init. The source is on Launchpad. There is documentation on Read the Docs, which is all based on what's in the source code. And then there's the marketing website for cloud-init, where you can learn a little and get links to all these things as well. If you have any questions, I'll take them now, and I'll be here the rest of the day. Thank you. Any questions? Go on. Yeah.
At the end of last year we had the Debian Cloud Sprint. Yes. And we had some problems with cloud-init.
I think for Debian we want to use cloud-init just for the first boot, to generate the SSH keys, set the hostname, and so on. And what I see now is that it's getting bigger and bigger. It's not just the initial part anymore, since cloud-config is now something like a configuration management system you're building. So what do you think? Is there still a separation between the initial first-boot things that need to be done and the configuration management part?
So you could run cloud-init without using any of the configuration stuff. You could launch an instance without any user data, which I think most people do, and it will generate the SSH keys for you and be done. So you can use it in that manner. I'm trying to highlight more of the other things it can do for you as well. You don't have to write shell scripts or bash scripts to do all this stuff, or you can, and pass them to cloud-init and have it launch them for you and help automate your instance. Going back to that first boot: customize everything as much as you want. But if you don't want to use all that, you're fine not to use it. So does that answer your question? Okay. Any other questions? Cool. Thank you very much. Appreciate it.