In this talk, we will explain the concept of diskimage-builder and see some demos. The talk will be presented by Paul Belanger, who is a full-time contributor to OpenStack infrastructure working at Red Hat. Thanks.

So, introduction. Hopefully you're here because you want to learn about image builds, and particularly the way that we do it in OpenStack Infra and the tooling that we use. Before I start, I want to say thanks to Clint; he really primed the room for me here. He gave a really good talk about Zuul and CI in general. He didn't necessarily dive into the Nodepool side of things, which is what this talk really relates to. So while we did not coordinate beforehand, I could not ask for a better person to go on before me.

So who am I? I'm a senior software developer at Red Hat, and I work under the OpenStack infrastructure engineering team. What does that really mean? Not too much, because I don't actually do many internal-facing things. I have the luxury of working 100% upstream on the OpenStack project, specifically on the project infrastructure team, where I am one of the, I think we're up to about 12, infra-root people, the people that have the keys to all the systems. However, anybody in this room can actually manage all of our systems: all of our systems are open and public, managed via Gerrit and so on and so forth. I'm just on the part of the team that sometimes has to go and reboot or restart Gerrit because it's running out of memory or something.

So what are we going to be talking about? We're going to talk a little bit about OpenStack Infra in general, just to set up basically why we're choosing this image builder and Ansible. And we're going to talk a little bit about Nodepool, because Nodepool is the scheduler, the piece of software that we use to front-end diskimage-builder.

So what is OpenStack Infrastructure? Like I said, we're basically the team that manages all the systems behind the OpenStack project. That's everything from the development, the testing, and the collaboration tools. For example, I'm sure people know of JJB, Jenkins Job Builder. If you're using that, we maintain it as part of our code base day to day, because we use it extensively. Basically, the mandate is that everything is open. Obviously not passwords or anything like that, those are stored in a private hiera database, but all of our infrastructure is open, with basically the purpose of having other people, other teams, other companies hopefully embrace that way of thinking and even start using our tooling. Like Clint was saying before, Zuul and Nodepool are perfect use cases which grew organically out of the need to drive OpenStack as a project. And the super cool thing is that downstream contributors and downstream companies are embracing those technologies.

So, more specifically, how it relates: this really is OpenStack infrastructure. You can kind of see we have our Gerrit on the left, and that's the interface for the developers. These are the fleet of Git servers that we run, because Gerrit isn't powerful enough to do all of the load balancing of Git requests that we need. On the whole right-hand side is the testing infrastructure. So Zuul, we're not really going to talk about that today. We have the concept of a merger and a launcher: these launch the jobs, these merge the code for the jobs. These are all of our public clouds. The part we're going to talk about today is really this little box in here, which is Nodepool. Nodepool is the piece of software that launches the VMs on the clouds.
And because we deal with so many clouds, we deal with, I think we're up to 12 clouds now, 12 regions, we can't trust the images in the clouds, because they're not uniform across all of our clouds. We started out that way, but then we started having to hack up our deployment scripts to say: if Rackspace, don't install systemd, because that image already had systemd, or something like that.

So what is Nodepool? Nodepool is a system for launching single-use test nodes on demand, from image builds with cached data. Basically, Nodepool started as a way for us to attach a dynamic slave into Jenkins using the Jenkins API. Once Jenkins accepted that slave, we would then connect into that slave remotely via SSH to run our tests. And then, same thing: when it was done and the results came back, Nodepool would tear it down and start the process again.

So basically there are three components of Nodepool today. There's the scheduler. I'm not really going to go into that; I just wanted to outline that it's the loop that's actually going out and doing all the calculations of where demand lives. We have the concept of a launcher. This technically is still being designed now as part of our Zuul v3 effort, but this will be massively scaled out. Currently we really only support OpenStack clouds, but moving forward, the first-class citizens we're aiming for are Google, Amazon, Kubernetes, bare metal, anything. All of that stuff falls under the new mission statement, I guess, of Zuul and Nodepool. And finally there is nodepool-builder, and that's really what we're going to talk about today. The reason I put nodepool-builder in there is that, the way we use it, it's basically a very thin front end to diskimage-builder, but it gives us the scalability of distributing builds. So Nodepool gives us the ability to schedule our distributed builds across our infrastructure rather than having a single machine create all of our images, because there's not enough time in a day to build all of our images on one machine.

So why? Why do we do this? Basically, it's ownership of the image build. This is important for us because images get deleted on clouds and there's no backup, and if we don't have that image, we can't test with it and we can't reproduce it. It removes the dependency on other build systems: at the scale we deal with, we can't afford an outage from an upstream image being deleted or their build system being down. It gets into immutable infrastructure. That's really what diskimage-builder was created for, the concept of an immutable build. We use it to a degree, but other projects in OpenStack, which I'll reference quickly, really embrace diskimage-builder in that aspect. Obviously security, yay, nay, maybe. But basically it's a potential tool for single builds of bare metal images, qcow2 images for clouds, and potentially container images. And like I said, it's reproducibility.

So, very quickly, some Nodepool image build stats. We've got 12 regions. We've got six images that we build today: three Ubuntu, two CentOS, Fedora 25, and a Debian Jessie. These are the formats that we build: qcow2; VHDs, which are needed for Rackspace; and then raw images, because sometimes clouds just need a raw image for their Glance back end. And then basically two dedicated builders, nodepool-builders. This nodepool-builder concept is new for us. It's part of the Zuul v3 effort, and it really allows us to scale this up: we could have six image builders or 100 image builders or whatever it is. It truly is distributed building now.
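To give a sense of those output formats: a single disk-image-create invocation can emit several of them in one pass. This is a rough sketch, not our exact production invocation, with placeholder element choices (and VHD output additionally needs a vhd-capable qemu-img/vhd-util on the builder):

    # Build an Ubuntu Trusty image and emit qcow2, vhd, and raw artifacts in one pass.
    DIB_RELEASE=trusty disk-image-create -t qcow2,vhd,raw \
        -o ubuntu-trusty ubuntu-minimal vm simple-init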
Okay, so getting into the heart of it, now that I've spent nine minutes on that: what is diskimage-builder? Diskimage-builder is basically a collection of elements, and these elements, at the heart of it, are basically written in bash. Love it or hate it, that's how we have it today. The concept of diskimage-builder, though, is that it's really a phased approach to building your image. So you have the concept of these phases here: root, extra-data, pre-install, install, post-install, block-device, finalize, and cleanup. Because we're dealing with chroots, depending on the phase, it's something that happens outside of the image or something that happens inside the image. The majority of these phases happen inside the image; the ones that happen on the outside include the root, the extra-data, and the cleanup. And I'll explain a little bit more about the way that I've chosen to implement Ansible in this context.

So, some examples of projects using diskimage-builder that may not be using Ansible. TripleO: the TripleO project in OpenStack is OpenStack on OpenStack, that's the "triple O". It's meant to deploy OpenStack using OpenStack. It was initially started by HP, Red Hat, and so on and so forth, and is primarily driven by Red Hat now; this is the installer of choice for Red Hat. It is, again, all immutable infrastructure, image-based deployments, and it leverages diskimage-builder. Bifrost is basically an Ansible playbook that does bare metal provisioning using the Ironic project in OpenStack, but the images that are used to boot the node the first time are generated by diskimage-builder. They use Ansible to do the builds, but they don't have Ansible inside the image being built. And then finally, us: Nodepool. In OpenStack Infra we have our own elements that define how an image looks for testing. More specifically, in our elements we actually run Puppet inside the image. This is for legacy reasons, because we are still a Puppet shop. By moving to something like Ansible, hopefully we can remove some of those dirty dependencies that are needed for your configuration management on the inside of an image, and it starts living on the outside.

So basically, getting into the heart of it: what's needed to actually do this? I wrote a very simple patch, and I call it the simple playbook element. It can't get any more simple than this. What it does is launch ansible-playbook with an inventory file pointing at the chroot that has already been created by diskimage-builder. The reason for this is that it allows us to use a single command to build the image and then overlay our configuration on top of it. Another way to do it would be to create the image, save it, uncompress it into a directory, and then have Ansible do its thing. That's a totally reasonable way to do it, but with the amount of data and the churn that we're dealing with, it makes a little bit more sense to just do it all in the first pass.

So what does your playbook look like? Very simple. This is a very simple playbook that I wrote to build a Docker image for diskimage-builder. So I'm going to use diskimage-builder to create me a container so I can run diskimage-builder inside of it. Turtles all the way down, right?
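For a sense of what that element amounts to, here is a minimal sketch of such a hook. The element layout, the DIB_ANSIBLE_PLAYBOOK variable, and the exact phase used are illustrative assumptions on my part, not necessarily the upstream patch:

    #!/bin/bash
    # Hypothetical extra-data.d hook for a "simple-playbook" element.
    # extra-data.d hooks run outside the chroot, so ansible-playbook runs on
    # the build host against the image tree diskimage-builder has already
    # assembled under $TMP_MOUNT_PATH.
    set -eu

    # One-host inventory: the chroot path acts as the "hostname", and the
    # chroot connection plugin is selected per host.
    INVENTORY=$(mktemp)
    echo "${TMP_MOUNT_PATH} ansible_connection=chroot" > "${INVENTORY}"

    # DIB_ANSIBLE_PLAYBOOK is an illustrative variable the caller points at
    # an existing site playbook; the same playbook could equally target real
    # hosts over SSH.
    ansible-playbook -i "${INVENTORY}" "${DIB_ANSIBLE_PLAYBOOK}"

    rm -f "${INVENTORY}"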
But the really cool thing about this is that if you're already an Ansible shop... or, I might have skipped over it: does anybody not know what Ansible is? Hopefully everybody's used it or seen it. Okay, great. The benefit of this kind of approach, in my view, is that you're not pinning the management of an image to the tooling that's building the image, which means I can take this same playbook and, in theory, use the SSH connection with it to manage my actual production running Nodepool service. Today we can't necessarily do that in OpenStack Infra, because we have a different path that our Puppet takes, where we say: if it's an image build, run this Puppet manifest, but don't start any of the services in the image, because you can't really do that, and Puppet gets a little bit crazy. So the whole idea is: you have an existing playbook that you've written already for your production. Why not use it to create your immutable infrastructure at the same time?

So yeah, the connection route, like I was talking about, is ansible_connection=chroot. Now, very funny, there isn't really much documentation on this, so it might be kind of a surprise to anybody that you can actually do this in Ansible, but yes, you can. Monty Taylor was gracious enough to tell me about this neat little hack. I guess if you want to call it a hack, but it's not a hack; it actually works fantastically. Instead of SSH, yeah, it just points at the existing mount that's been created by diskimage-builder and does its thing, and then Ansible just stops, and then diskimage-builder does the cleanup and so on and so forth.

Yes? Oh, I... yes, I could, but I don't know how to do it. Yeah, unfortunately, I should have blown that up just a little bit more. I'm sorry. When I post these online, I'll make them bigger. I know that doesn't help you right now.

Okay, so basically, how do you run this command? Oh, I've got five minutes, man. I thought I wouldn't have enough to talk about. So basically, this is the syntax that we use, what we're setting up here. These DIB variables define the type of image that we want. So the top one is for a tarball of Ubuntu Trusty. The key piece of information is this: the type. The top one represents a tarball we're creating. The second one represents a qcow2 image. The third one also represents a qcow2 image, but it's a different flavor. But it's the same playbook. The beautiful thing is it's the same playbook. So if your playbook in Ansible is written smartly enough, it's a single point of entry, right?

Nodepool-builder. Yeah, so this is what it would look like in Nodepool. To run this manually, those were the commands that you'd use; the cool thing about nodepool-builder is we can now express it in YAML format, a nice clean format, which OpenStack Infra loves. Everything we do is YAML-based, because it's perfect for code reviews, right? So these are basically environment variables inside of this image build that represent, you know, package mirrors and so on. These elements are the elements that we want to install. So we want ubuntu-minimal, which is a very basic Ubuntu image; vm, which does some stuff with the kernels; simple-init, which is basically a replacement that we run in OpenStack Infra for cloud-init; then some repos that we cache, and so on and so forth; and a fancy element that adds some infra packages that we need that may not be in diskimage-builder.
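As a rough reconstruction of what those three manual invocations look like, here is the same hypothetical simple-playbook element driven with different DIB variables and image types; the release values and element names are illustrative:

    # Same playbook, three different outputs.
    DIB_RELEASE=trusty disk-image-create -t tar   -o ubuntu-trusty ubuntu-minimal simple-playbook
    DIB_RELEASE=xenial disk-image-create -t qcow2 -o ubuntu-xenial ubuntu-minimal vm simple-playbook
    DIB_RELEASE=25     disk-image-create -t qcow2 -o fedora-25     fedora-minimal vm simple-playbook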
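And a trimmed-down sketch of that kind of diskimage definition in a Nodepool YAML config; the mirror URL is a placeholder and the exact schema varies across Nodepool releases:

    diskimages:
      - name: ubuntu-trusty
        release: trusty
        elements:
          - ubuntu-minimal
          - vm
          - simple-init
          - infra-package-needs
        env-vars:
          DIB_DISTRIBUTION_MIRROR: http://mirror.example.com/ubuntu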
So, from Nodepool's point of view, what does this look like? We have a fancy command called dib-image-list. This is from yesterday; these are all the images that we've created. So you can see we have... by default, we keep two images in the cloud, which gives us a primary and a fallback. If we upload a bad image, we can immediately roll back and not have to do a rebuild. We've got three going on here because we only delete an image once everything has been uploaded, and you can see that debian-jessie is currently building. So like I said, the cool thing is this is totally distributed across multiple machines for image builds. It's backed by ZooKeeper; yay or nay if you like ZooKeeper.

But these are the actual images, what they look like on the far end. These are the images that we're uploading to the clouds. You can see I couldn't fit it all in, because I would have had to scroll like five pages; we have a massive number of images. So just here we've got, I don't even know what that is, 20 images across all these clouds. These are Rackspace clouds and regions; OVH; OSIC, and you'll see multiple OSICs because it's the same cloud but different hardware profiles; Bluebox; and then Infra Cloud, we have our own cloud and infrastructure now. The OpenStack infrastructure team is running a cloud on donated hardware.

What else? So basically, Docker. You could do this for Docker. The really cool thing is docker import: you're just piping the tarball into docker import, and then you're running your privileged container thing. One kind of issue with this: I haven't cracked the metadata issue, because in a Dockerfile you express your entry points and firewall rules and so on. I'm not really a container person, but apparently there's a method to generate the JSON blob that represents this. I don't know how to do it yet. So if anybody knows how to do it, point me to the person. Thank you. I'll come talk to you.
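The import step he's describing would look roughly like this, with made-up image and tag names:

    # Stream a DIB-built tarball straight into Docker as a new image...
    cat ubuntu-trusty.tar | docker import - example/dib-ubuntu:latest
    # ...then run a (privileged) container from it.
    docker run --privileged -it example/dib-ubuntu:latest /bin/bash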
And finally, I just wanted to ask: isn't there a little bit of overlap with Ansible Container? Has anybody not heard of Ansible Container? Ansible Container is an effort by the Ansible team to create kind of this concept. I've not really dove into it, and I don't want to talk too much about it, but the main difference, from what I understand, is that Ansible Container is leveraging the Docker build infrastructure on top. So they're not doing the raw image build; they're expecting an image to already be created and then overlaying their playbooks. Again, the approach that I'm discussing is end-to-end: we own everything and build it and so on and so forth. And I have... oh, right on time. Out of time. Questions?

[Audience question, partially inaudible, about where everything is defined for both Ansible and diskimage-builder, and about Fedora.]

So, Fedora, okay. With Fedora, it's diskimage-builder, and we have an element that uses yumdownloader, and we express what the minimum set of packages is. When you went back and saw the minimal element here for Ubuntu: diskimage-builder expresses what that minimal image is. Now, it's a static representation, and we're in the process of trying to merge some of this stuff upstream. We had kind of a heated discussion of, well, diskimage-builder wants to express "this is the default" because this is what the default is today, but in a container you might not need yum, or you might not need any of this.

So that's one of the, I don't want to say pain points, but that's one of the struggles: if you're going to go this route, you may have to create your own representation of a base Fedora that says "I want yum and something else", and that's it. Right. Yes. So that's part of the things that we as an upstream project didn't really... we didn't support Fedora until very recently, and the only reason we got Fedora is because Ian in Australia basically said, we need something for Fedora, and I'm just going to throw some code at this, and it magically works. We try to... Ubuntu is a little bit different; we use debootstrap for a minimal build, right? Again, we express what we need; on the minimal image we need Python 2 or Python 3 or something like that. But you're right, it's not an exact duplication of upstream build methods, and I wouldn't say you want to do that. This is for you creating the images that you need for specific purposes, and our purpose is testing, right? That's how you get kind of a nice separation line: this is safe stuff, this is not safe stuff.

So what I was going to do while we were all talking is actually get this process going, after I made sure my actual builds worked. This is... there we go. So basically, this is what happens in the background, the boring stuff. You can see, if you go to nb01.openstack.org, these are the actual log outputs, so after this you can go and see how we do this. Sadly, this one isn't using Ansible, but the one that I had running on my laptop is using Ansible. So basically, when we get back to it... like, it's running Ansible now; it's kicked into Ansible and now it's doing Ansible things via this chroot connection, which, like I said, is the whole thing I was trying to express. One minute, last question. ... Stop talking. Okay, hopefully you found it informative, and yeah, thank you.