Right, okay, so I'm told it's time to start. diskimage-builder is one of the things we've built as part of working on TripleO, which is deploying OpenStack on OpenStack, and I think it's generally applicable for other people who want to deploy things with Heat, and perhaps with other systems. But since you could make a talk about any one of those, I'm just going to talk about diskimage-builder in the context of Heat-deployed applications.

I'm a Distinguished Technologist at HP Converged Cloud, I'm the TripleO PTL, and I was recently voted onto the OpenStack Technical Committee. All of this to say I really have no idea what I'm talking about, and I hope you can help fix up any misconceptions as we go through this talk.

This is the standard model for talking about OpenStack: you've got your applications on top of a lovely compute cloud. I have to use this slide in all of my talks; I've used it in so many now that taking it out would be a betrayal of the continuity.

When we started working on TripleO, the basic concept was to say: hey, what if all of the components of your OpenStack deployment were just applications? Then everything you do to make deploying applications in an OpenStack cloud better will make deploying OpenStack itself easier, better, and more straightforward.

So, to figure out our needs if these things are apps: what do we need from a deployment infrastructure for this sort of application? It needs to be a repeatable process; that goes without saying, we don't want to deploy OpenStack by hand. You need to be able to deploy without the internet: if you're bringing up a data center, you don't have the internet until you've brought up your firewalls, which are part of what you're deploying. And we need to deploy to bare metal, which is not something Heat itself knows how to do.

And this is a scale thing: if you test one thing and then deploy another, you're setting yourself up for a huge faceplant at some point, because what you tested worked and what you deployed didn't. That becomes a real problem if new features are coming into your repository, whether it's git or a package repository, faster than the time it takes to test and deploy. It means you either have to start locking, not letting things in while you test and deploy, or by the time you finish the test run and are ready to deploy, what you'd deploy from that repository is now different.

Obviously it needs to scale up and down, and I don't mean deploy a cloud and then make it bigger and smaller. I mean that we need to be able to deploy really big clouds, but you also need to be able to deploy a little dev/test environment with a couple of machines in the corner.

We also had this interesting problem that things you might often consume in a cloud as APIs, like databases, suddenly become applications that you have to deploy, and they have to act like cloudy things: they have to be able to be thrown onto hardware with no manual preparation. And when we redeploy an image onto a machine, we need to preserve the important persistent data: the database, Swift data stores, cluster state, or Ceph backend volumes.

The final thing was that if TripleO was to be a success, it had to be something that existing deployment communities could at minimum interoperate with; ideally it would be something everyone can consolidate around, to have one set of tooling. So we started asking: how do we evolve to what we need?
Heat was an obvious no-brainer: it's the OpenStack orchestration service. Nova bare metal was coming along, so we said we'll jump on that and work on it; it gives us instances on physical machines. Then we said: okay, we've got a disconnect here. Nova does disk images, while everything else in the universe that deploys in terms of repeatable automation wants to work with packages.

So how do you get the repeatable, can-roll-back-if-you-need-to characteristics out of a Chef or Puppet environment? The answer multiple organizations have come up with is a single custom package repository per commit that you push through your Chef or Puppet infrastructure. You end up with that because holding a single lock on a common package repository is a really bad idea: you get a big backlog and you can't work on multiple problems at once, so you want to fork it. And that's a problem: it's a bunch of work, it's complex, and it's hard for people to quickly look at it and see what's going on. So we said we don't really want to buy into that; it's more than we want when bringing up a cloud. So: golden images, diskimage-builder. That's what I'm really here to talk about.

Oh, wow, ah, yes, sorry, I misread my own notes for a second. So: the installation phase, building a disk image, takes place with internet access. We can pull packages down from package repositories, we can pull stuff out of git, we can pull stuff out of PyPI. Once it's built into an image, the deploy never needs internet access, so unless you have a licensing server or something for some proprietary thing, you'll have all the software already present on the disk image. Final configuration then takes place when you deploy.
Things that are always going to be the same in your environment you might bake into the configuration on the image itself, things like file paths for where the database is going to live, which are pretty constant. Whereas something like how much of your RAM goes to the database cache you might auto-configure by looking at how much RAM is on the hardware you've been deployed to. And for folk who then want to use Chef or Puppet for their main system administration, just bake that into the image as well; it can do configuration management and no longer needs to do software installation.

And cfn-init can happily deploy things. By which I mean you run it against the Nova EC2 metadata and it can do arbitrary initialization code as part of early boot. So you boot up, the node comes up, it hands off to cfn-init, and cfn-init says: hey, from the metadata, this is where your puppet master is; ask it for a certificate and get hooked up. After that, Puppet can take over and do the administration of your machine.

But we had this problem in TripleO that we didn't want to exclude half of the deployers in the world, the ones using the other choice of Chef or Puppet, in what we put in upstream OpenStack. So we did a minimal configuration management system. It doesn't know how to do installation. All it knows how to do is collect data from a metadata source like EC2 or Heat, or potentially a puppet master or Chef server (os-collect-config); trigger some scripts to refresh the configuration on the local machine (os-refresh-config); and os-apply-config knows how to write templates, and I've probably got extra slides for that, for added redundancy. The point of the separation, though, is that you can take out os-refresh-config and os-apply-config, put Puppet or Chef in there, have that triggered from os-collect-config, and everything should still just work.

So: diskimage-builder builds an image by going through a set of hooks. We create a root image, we do some pre-installation, we do installation, post-install, finalization, and so on. Each image build is parameterized by including elements. You can have an element for a particular piece of software you want to install, or an element that brings in some plugins for something; elements compose, they should never replace each other. Each element includes one or more hook files in each of the sets of hooks I referred to before. The Ubuntu element, for example, has a script, 10-cache-ubuntu-tarball, in root.d, which is how it drags the data out of an Ubuntu reference cloud image and puts it into a cache for diskimage-builder to build things from. Elements can depend on other elements, so you can create an element that just says: I want to bring in these packages and these other elements, and that's my definition, I'm always going to use it; I don't have to pass the same parameters every time.
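To make that layout concrete, here is a minimal sketch of a custom element. The element name (my-app), the dependency (my-base), the package name, and the script contents are made up for illustration; the directory and file naming follows the hook phases described above.

```bash
# Hypothetical element "my-app":
# elements/my-app/
#   element-deps          # one element name per line that we depend on
#   install.d/50-my-app   # runs during the install phase, inside the chroot

mkdir -p elements/my-app/install.d
echo "my-base" > elements/my-app/element-deps   # hypothetical local element

cat > elements/my-app/install.d/50-my-app <<'EOF'
#!/bin/bash
set -eux
# install-packages is the distro-neutral contract described in a moment:
# it maps to apt on Ubuntu and yum on Fedora.
install-packages my-app-server
# Bake in configuration that is constant across every environment.
mkdir -p /etc/my-app
echo "data_dir = /var/lib/my-app" > /etc/my-app/my-app.conf
EOF
chmod +x elements/my-app/install.d/50-my-app
```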
One of the things configuration management systems provide, which we lose by having our own thin layer, is abstraction between different operating systems. Sometimes the difference is different tooling doing the same basic job, like apt versus yum; other times it's things like different defaults for where files live, and so on. We don't have a really good answer for this, other than: if you're installing a large variety of different software, you might want to go down the route of bringing in Puppet or Chef as part of your image, which is why that's an explicitly supported approach. But for TripleO itself, what we've done is say: hey, this is a common problem, let's just write a minimal contract, and then each operating system we support can supply that contract. So we have a script called install-packages. If you run install-packages on a Fedora image it will use yum; if you run it on an Ubuntu image it will use apt; and we have a translation file that says what the packages are named in the different distros. It just works.

Finally, for performance, most of the things you install with diskimage-builder get cached in ~/.cache/image-create: the base image from your vendor, PyPI packages, yum packages. This means repeated image builds are pretty fast. We do recommend you have a squid cache or something similar as well, but it's not strictly needed.

With that sort of framework in mind: we also create a temporary filesystem to build the image in, so the build is entirely in RAM. Of course, if you're building on a machine with low RAM, like 4 GB or something, that's maybe a problem; it can be turned off, but it makes builds much faster. We copy the contents of the base image, the Ubuntu or Fedora filesystem, into the temporary filesystem. We disable service startup, because we're not interested in running anything, we just want to get the right files onto disk. We override resolv.conf and proxy settings for the duration of the build, and this is a moving list: as we find things we need to override, we'll override more. We install the software as needed, and it's installed in the chroot, so software that doesn't play nice with chroots, which occasionally exists, does need special handling. But I think we've had to do that once so far, so it's not a high-volume problem; it's not something you should expect to run into.

Then we make a sparse raw image with a filesystem big enough for the contents of the tmpfs, and we move the contents into the raw image. If you're building a VM image, which needs a bootloader, we configure a bootloader. We restore service startup, restore the proxy settings, and pack it down into a qcow2. The compression into a qcow2 file is actually the longest step in most of the straightforward image builds I've done; it's slow. And then you're done: we unwind, clean everything up, make sure everything's unmounted, and that's it. That gets you your image.

A couple of things to note. This is not Nova; we can trust these images. If someone wants to root you by giving you a bad base image, they can just ship a bad binary; they don't need to play monkey games with the filesystem they ship you. We create our own filesystem as we go, for a couple of reasons. One, there were some really strange images out there with 10 GB filesystems that we just couldn't unpack and deploy to machines in a reasonable time frame. But it also gives people who want to run different filesystems in their image the ability to do so: if you want to run LVM, or XFS, or something else, it's not a change to the overall code flow, just a change to that one part of the process.
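As a rough illustration of the install-packages contract described above, here is a minimal sketch of what a distro element might supply. The map-file path and format are assumptions for illustration; the real scripts in the distro elements handle more cases, so treat this as the shape of the contract, not the actual implementation.

```bash
#!/bin/bash
# Sketch: install-packages as an Ubuntu-family element might supply it.
# Other elements call "install-packages <names>" without caring which
# distro the image is; each distro element provides this same command.
set -e
# Assumed-for-illustration map file: "generic-name distro-name" per line.
MAP=/usr/share/my-distro-element/package-name-map
args=()
for pkg in "$@"; do
    mapped=""
    if [ -r "$MAP" ]; then
        # Translate the generic package name if a mapping exists.
        mapped=$(awk -v p="$pkg" '$1 == p { print $2 }' "$MAP")
    fi
    args+=("${mapped:-$pkg}")
done
DEBIAN_FRONTEND=noninteractive apt-get install -y "${args[@]}"
```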
Was it Steven Dake who did the Heat wrapper for diskimage-builder? Yeah, so Steven Dake did a really cool hack: he wrapped all of diskimage-builder, with custom elements, so you throw your own elements in as parameters in a Heat template. You can spin up a diskimage-builder VM, build an image, and it uploads it to Glance at the end, and it's about 18 lines of Heat template. So it's actually really easy to use this as a service: you don't need a dedicated machine, Heat will just spin up a VM for you in your cloud, do the build, and upload the result into Glance in that cloud.

So this is all prelude so far. I've probably been boring people, as we can tell by folk leaving, and I'm sorry about that. More important: how do you actually use this? You need to export an elements path. You make a directory where you're going to put your elements, your customizations for your images, and ELEMENTS_PATH tells diskimage-builder how to find them. It's a path so that you can reuse elements from multiple organizations and then just add your own in a new directory; you don't need to run a fork of other people's repositories. And if you do need to replace an element with a fixed copy, for example, then because it's a path, the first occurrence of any given element name is what's used; just put your elements first in the path.

Make a directory, and give it a README, because everyone needs documentation; really, you'll forget what it does if you don't give it a README. Add any elements you depend on to element-deps, and add any hook directories that you need. All of the scripts I mentioned before are just shell scripts, or Python scripts, whatever you like, and they should do what you would normally do: install some packages, tweak some settings, delete garbage left behind by the install process. Things like license keys can be a bit interesting: if you need your license keys to come in dynamically, you'll need to put that in your deploy-time logic, so in configuration management scripts. If you can apply the license when you build the image, then you should do that as part of your element hooks. I didn't want this to become a manual, so there's a lot more documentation in the diskimage-builder repository, and you can reach us on the dev list and so on.

To create an image: one line. The general format is disk-image-create, -a for the architecture (i386, amd64, armel, or whatever), -o and a filename, say my-image, which will get .qcow2 appended, so at the end of the process that file will exist; that's what you upload to Glance. Then you provide one or more image elements to build with. For example, this is how we build a Nova compute node for TripleO; this is the devtest story, so it isn't production. We take Ubuntu, we want i386, we want the file to go into $TRIPLEO_ROOT and be called overcloud-compute, and we want nova-compute, nova-kvm, neutron, os-collect-config, and dhcp-all-interfaces. And no, that last item on the slide is a typo; I'm sorry about that, I'll make sure it's fixed in the copy of the slides I put online. It shouldn't be there.
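As a concrete sketch of that invocation: the exact element list for the overcloud compute image has changed over time, and the elements-repository path shown is illustrative, so take the names here as an example rather than the canonical command.

```bash
# Where diskimage-builder looks for elements; first match wins, so
# local overrides go first. The second path is an assumed checkout.
export ELEMENTS_PATH=$HOME/my-elements:$TRIPLEO_ROOT/tripleo-image-elements/elements

# Build an i386 Ubuntu-based compute image; the output filename gets
# .qcow2 appended, yielding $TRIPLEO_ROOT/overcloud-compute.qcow2.
disk-image-create -a i386 \
    -o $TRIPLEO_ROOT/overcloud-compute \
    ubuntu nova-compute nova-kvm neutron os-collect-config dhcp-all-interfaces
```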
One of the things we've found when people come along and start contributing is that they don't know which elements to use. They read the docs and say: hey, I want to do this thing. But there's a whole bunch of stuff that can make your life easier. So: ubuntu, fedora, or rhel, and there's a SUSE element coming; that gives you the operating system you're going to build. We don't do Windows builds today. There's some discussion going on about how we could make that happen; there's some interest in it, and I'm certainly very open to coming up with a good solution for that, although using Wine for this does kind of scare me a little.

The vm element is interesting. For historical reasons, and perhaps we should change this, our default is to build a single-partition image, an image of one filesystem rather than of a full disk. That's what Xen wants, because it wants a detached kernel and ramdisk image, and it's what Nova bare metal wants, but it's not what most clouds want, which is an image with an MBR in the boot block, handing off through the BIOS layer rather than directly booting your kernel. So most people will want the vm element included.

The source-repositories element is nice. You can just write "git clone something" in your rules, but if you use source-repositories, it will make a cache of the repository under ~/.cache/image-create, and rather than pulling the whole contents every time, it just pulls down the incremental data and then copies that into your image. Much, much faster.

If you're building Fedora and you're like me and don't really understand SELinux properly, you'll want to put it into permissive mode, i.e. disable SELinux. The pip-cache element is nice. It's not safe for concurrent builds because of bugs in pip, but if you turn it on, it creates a pip cache; every image you build has that cache mapped into it at the beginning of the build, and the cache gets updated by the build, so you don't download much from PyPI or from the OpenStack PyPI mirror. Very, very quick. If you're building concurrent images, though, you need thread safety, and then you should use the pypi element, which takes an actual local PyPI mirror and uses that instead; there's an element that will create one for you, or you can use the OpenStack Infra pypi-mirror project to build such a mirror.

dhcp-all-interfaces is something we hooked up. It interrogates the /proc interface to find what interfaces you've got and which ones have link, then adds stanzas to /etc/network/interfaces and brings them up. Most reference images only have eth0 in their configuration; this lets you run with, say, four Neutron networks attached and have them all come up, rather than you having to manually fiddle the image to do that, and without using file injection. You can use file injection, of course, if you want to, but I'm in the camp that considers file injection from a cloud to be an abstraction violation and not to be done.

pypi-mirror is an element that uses the pypi-mirror project from OpenStack Infra, but it sets up a cron job, so this is really useful if you're deploying infrastructure to maintain your infrastructure: it gives you an automatically maintained mirror. It's a little meta, but I think it's still worthwhile knowing; at worst, you can look at its code to see how to run your own mirror by hand.

os-collect-config, os-refresh-config, and os-apply-config I touched on before. They aren't needed, but we think they're better than sliced bread, and there's been some discussion with the Heat folks about getting os-collect-config to replace the cfn tools, or at least cfn-init, in the recommended path for working with Heat. os-apply-config is really just Mustache templates in a filesystem tree; it writes them out to the files they're going to end up in. Really, really super easy for getting the exact config you want.
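For a flavour of how that works, here is a minimal sketch. The template directory location, the metadata key (database.host), and the target file are made up for illustration, and the CLI flags are from memory, so check the os-apply-config documentation for the real interface.

```bash
# Templates mirror the target filesystem: this template is rendered
# to /etc/my-app/my-app.conf on the instance.
mkdir -p templates/etc/my-app
cat > templates/etc/my-app/my-app.conf <<'EOF'
# Mustache placeholders are filled in from the collected metadata.
db_host = {{database.host}}
EOF

# Heat (delivered via os-collect-config) supplies metadata like this:
cat > metadata.json <<'EOF'
{"database": {"host": "10.0.0.5"}}
EOF

# Render the templates against the metadata.
os-apply-config --metadata metadata.json --templates templates
```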
Finally, and this is really useful for getting onto an image-based upgrade process: use-ephemeral. This lets you map stateful files into /mnt/state. If you're using a Cinder block device and you're not using boot-from-volume, then you put all your state into the Cinder block device, which is available from different hypervisors after your instance crashes or the hypervisor goes offline. If you're on Nova bare metal, it uses the ephemeral partition to get the same basic feature set. A caveat around this: if you're using boot-from-volume, you still need a separate volume for state, or when you deploy a new golden image to your volume, you'll wipe out your persistent state. So you still need to keep the separation.

Now, CI. When I spoke earlier I said: what you deploy has to be what you tested. This is really the point where I think golden images start to shine; the rest of it is just swings and roundabouts, different ways to get the software onto the machine, right? A golden image encapsulates a full set of software. So if you test a golden image in your CI system, you can take that same image and deploy it, and it's an atomic unit. You get away from this whole problem of having multiple package repositories pinned at different points in time, or any of the other shenanigans you can play to try to make that work, like locking.

So: you upload a change to Gerrit. Jenkins builds an image for you. You upload that image to Glance, and then you deploy a test cluster with Heat. You upgrade the cluster to your new image, and you do that to make sure that any data-migration logic you've got works. Then you run a functional test against the cluster: you make sure your tests that the cluster is up and alive are working properly. For different clusters that might take different forms; it might be that you use just your monitoring system plus a load tester, or you might have specific tests, something like Tempest. If the tests pass, great: deploy, or promote. You've done every single check you're going to do, you now know that image is ready to deploy, and you've got nothing else to do to get it ready for deployment. If they don't pass, go back and upload a new change to fix whatever failed. I've probably timed the talk wrong, is my feeling. No, maybe not.

So this gives us a repeatable, automated, end-to-end process for deploying your application, and for deploying it without race conditions. You can deploy without internet access, so if you're in a secured environment, great; not everyone is, but as I said, we developed this for TripleO, where we did have that constraint. You can deploy to bare metal: if you're deploying Hadoop, for example, Savanna uses disk images built with diskimage-builder, so you can deploy workloads onto bare metal. And build your images during your CI: don't build them as something after you've run your tests, build them to run your tests on. Not your unit tests; your functional tests. All your scaling logic for scaling up and down is kept in Heat. And here is one of the key points: every single node in the cluster is identical. Say you deploy Swift: you've got Swift proxies and Swift object stores, and all your proxy nodes are now running exactly the same software, the same kernel, the same Swift version, every last little detail, without you having to do any additional work. It's simply an emergent property of this approach.
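As a sketch of that test-then-promote pipeline in CLI terms: the glance and heat invocations below are the era's clients from memory, and the template, element, and test-script names are made up, so treat this as the shape of the flow rather than a drop-in job.

```bash
#!/bin/bash
set -e
# Jenkins job body, roughly: build, upload, deploy, upgrade, test, promote.
disk-image-create -o my-app ubuntu vm my-app

IMAGE_ID=$(glance image-create --name "my-app-$BUILD_NUMBER" \
    --disk-format qcow2 --container-format bare \
    --file my-app.qcow2 | awk '/ id /{print $4}')

# Deploy a fresh test cluster from the candidate image...
heat stack-create test-$BUILD_NUMBER -f cluster.yaml -P image_id=$IMAGE_ID

# ...and also exercise the upgrade path from the last known-good image.
heat stack-create upgrade-$BUILD_NUMBER -f cluster.yaml -P image_id=$LAST_GOOD_IMAGE
heat stack-update upgrade-$BUILD_NUMBER -f cluster.yaml -P image_id=$IMAGE_ID
# (polling for CREATE_COMPLETE / UPDATE_COMPLETE omitted for brevity)

# Functional tests gate promotion: only a fully tested image ships.
./run-functional-tests.sh && echo "$IMAGE_ID" > last-good-image
```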
Persistent data is a bit of a work in progress. We want to make it nicer than I've described so far, but I don't have an exact answer on how we'll make it nicer yet. And for the integration with Chef and Puppet that I mentioned: Red Hat have a proof of concept with Puppet, so it's doable. We probably need to do some work to make it nicer and easier to do without changing any of the Puppet rules at all.

So, we've got ten minutes for questions, or we can just go out to the parties. A demo? Steve, did you prepare one earlier? Right, so Steve Baker prepared one, and I went to run it against our TripleO test cloud and found a configuration problem in our test cloud about an hour before I had to fly. So no, I'm sorry: I can certainly show you building an image, but I can't show you it actually deploying. Perhaps grab me later in the week; I'll hopefully have that configuration issue fixed by then. Then again, it's going to be a busy week.

Something that comes up... oh, I'm sorry, go ahead. "Hello, could you speak a little bit to the decision to treat the incoming base image as something that is not your concern? How is that working out? Is the assumption that it always comes from the vendor? How do issues of trust play into that?"

Okay. So, first, we don't want to be in the business of figuring out how to run operating system installers and how to inject things into them. That's a problem the operating system vendors already have solved themselves, and because we want to be multi-vendor, we don't want to have to figure it out again and again and again. Secondly, I don't think it's any more trustworthy to take the binary packages a vendor has created, plus an installer script, and run the installer script, than it is to take an image the vendor has created. If we can't trust the vendor to run their own installer script properly, that's a real problem. What we can be sure of, by taking an image they've built and that they think is ready to run in a cloud, is that we're getting the configuration they recommend for running in a cloud; we're just going to tweak that to put our software on it. So I think this is actually better than running the vendor installer, because we don't need to change our configuration when the vendor does. Say the vendor decides to tweak, I don't know, the VM writeback settings in the image by default: we'll inherit that automatically unless we actively go out of our way to disable it. Whereas if we just ran their installer, which is designed to install to bare metal, we'd have to go and copy whatever change they made to keep up to date.
So: less maintenance for us. But there's another aspect, which is editing an image you've built. Say you want really fast iterations: you don't want to wait three minutes for an image build, you want it in 15 seconds. That's doable with diskimage-builder: take the image you want to iterate on, tweak things so that you don't keep the base in a tarball but keep it unpacked on disk, copy that in, and turn off compression when you build the output. There's absolutely no way you're going to run a full operating system installer in that sort of time frame. Another point is that Ubuntu's latest installer actually does a disk image copy, a down-to-the-byte copy style of thing, so the vendors are moving away from having users run package-management-based installers as well. I think we've led the way on a change that I expect to propagate out as people go for more performance. I think that covers your question.

Oh, you asked how it's working out: it's working out fantastically. Builds are fast and reliable. The problem we have so far is that Fedora seems to have some flaky mirrors, which we find because we build a lot of images, and the DNS round-robin, or DNS geolocation, whatever it is, every now and then picks a bad mirror and we have a failed build. I don't think that's something we should fix; that's something to say upstream: hey, if you want people to use your mirrors, they need to be reliable.

Right, so you're having performance problems building images with diskimage-builder? Okay. We're looking at about three minutes for a no-op build at the moment, with compression, and I think it's about 90 seconds to unpack the base image, do the work, and copy it back. I'd like to make that a bit faster, but it's tolerable. If that's not what you're seeing, there may be some other factors, such as your package endpoints and repo locations, or the syslinux invocation for the master boot record manipulation. We emit a lot of profiling data in the hook run timings, so you should be able to look at those timings, see where the time is going, and then either file a bug or dig into it.

Three, two, one. Thank you.