The next session is by Andreas, who will talk about disk image builder. Andreas, the stage is yours.
Okay, thank you. Thank you very much. Thank you for coming. This talk is about disk image builder and creating operating system images with this tool. Before defining what an operating system image is, let's have a short look at the history. Whoops, sorry, that was too much. Once upon a time, operating system images looked like this. Anybody, any idea what it is? Yes, exactly. From the Commodore 64, about 30 years old. So this type of data is what I want to talk about. This one was about 8 kilobytes. These days, modern operating systems are much bigger and stored on disks. But in principle, it's the same kind of data.
Two sentences about me. I'm currently involved in mostly three projects. The latest project is my CNC router, with which I built these little wooden things there. It turned out that this is mostly a software project. I never thought it would be. Most of the time you sit in front of the computer and try to get the CAD and the CAM things correct. Then I have rmtoo. It's an open source requirements management project, which I'm working on. And for about a year I have been a core developer for OpenStack Disk Image Builder. And this is why I'm here today.
My agenda. I hope I will finish in about 30 minutes. This is really an introduction talk to Disk Image Builder. I picked out only two things which I want to explain a little bit deeper. But it's mostly an introduction of what you can do with it. So the green lines are really introduction, and then these things are a little bit more detailed. Oops, sorry.
So an operating system image is an installed operating system in one specific format, stored as an image, which means as a file somewhere on a disk. Other names are golden images and template OS images. Typically they are used in environments like clouds or virtualization platforms.
This is one way to build operating system images: put a USB stick or something like a DVD into your computer, boot it, and sooner or later you will get a picture like this. Then you have to go through the whole setup of the installation. You have to specify which packages to install and so on. I think everybody knows how to do it, and most of you have done it already. Yes. So this is the typical way if you get a new computer.
Then there is another way: using a tool which builds your image. Typically each operating system comes with a tool which just puts an operating system onto the disk, mostly as a tree structure. Here is an example with debootstrap. You just enter debootstrap stretch . and two or three minutes later you have a complete Debian system in your directory. So these are, let me say, the historical things.
The problem is that for each operating system you need a special tool and a special configuration. If you need a couple of them, you end up with something like this: for each operating system, distribution and architecture in your environment you have one configuration file and one tool, and then you run it and create something like a disk image.
Disk image builder tries to solve this problem. It takes only one configuration and builds images for a couple of architectures and a couple of target systems from this one configuration on one host system. Here are some examples so that you can get a feeling for it. You can, for example, install a minimal Debian system with debian-minimal, or use fedora-minimal, centos-minimal and so on. And you can not only create virtual machine images, but also Docker images.
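
To make the "one configuration, several targets" idea concrete, here is a minimal Python sketch that only assembles command lines for the disk-image-create tool; it does not run anything. The helper name build_command and the output naming scheme are assumptions made for illustration. The options -a (architecture), -t (image type) and -o (output name) and the element names are used as I understand them from the disk image builder documentation, so check them against your installed version.

```python
from typing import List

# Distribution elements mentioned in the talk.
DISTROS = ["debian-minimal", "fedora-minimal", "centos-minimal"]


def build_command(distro_element: str, arch: str = "amd64",
                  image_type: str = "qcow2") -> List[str]:
    """Return the argv list for one disk-image-create run (hypothetical helper)."""
    output_name = f"{distro_element}-{arch}"  # illustrative naming scheme
    return [
        "disk-image-create",
        "-a", arch,         # target architecture
        "-t", image_type,   # output image format
        "-o", output_name,  # output file name without extension
        "vm",               # element: make a bootable VM image
        distro_element,     # element: base distribution
    ]


if __name__ == "__main__":
    # Same element list, different distributions: only one argument changes.
    for distro in DISTROS:
        print(" ".join(build_command(distro)))
```

The point of the sketch is that switching the target distribution changes a single element name while the rest of the invocation stays the same.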
Under the hood, the system tools of the specific operating system are used. So RPM or APT or dpkg or something like this. This is done inside. You as a user of disk image builder don't even see this.
The support matrix: it's really astonishing how much you can build. You can build mostly all the mainline distributions and you can even build for different architectures. And this can really be done crosswise. For example, you can be on AMD64 and build an image for PowerPC. Or you can be on Ubuntu and generate an openSUSE image. So it's really universal. And yes, you can see the bubble over there. This is what happens under the hood: it uses QEMU to run the target operating system's scripts and binaries, for example the post-install scripts, on the host system, which gives you a really complete installation.
Here are some environments which I've really tested: VMware, OpenStack, KVM, Amazon, Docker. There is also the possibility to use it on bare metal. You can create these images and then put them with some specific method onto bare metal, again based on the one configuration you provide.
Disk image builder itself is a modular system. It has so-called elements, and each function or function block is put into one element. For example, debian-minimal is one element. vm is one element. The puppet master is one element. There are about a hundred which come directly with disk image builder, and there are a lot of these elements on the internet as well. For example, if you want to build an image for a Raspberry Pi, you can grab an element from the internet, specify Raspberry Pi, and you have an image for the Raspberry Pi.
Some example elements, really just a small set. Yes, you can see them. I don't want to go into detail here. There are a lot of things you can put into your VM directly during creation.
Some words on what an element is. An element is mostly configuration. Let me start this way: it resides in one directory and it's a collection of configuration files and directories. And in these directories live some scripts. I have a very small example here. An element can depend on other elements. You have a small text file where you list the other elements which are then also included. Then the package installs: these are the packages which are installed during the creation of the operating system image. As you see, you can just put the name here, and there are different ways of specifying names. There's also a way to handle different distributions, because distributions typically name their packages differently.
Then there are, let me say, about six or seven stages during the build process. For example, things are prepared. Then packages are installed. Then the boot loader is installed. Then the cleanup is run and so on. So there are seven different stages. Each of these stages has a separate directory, and the scripts in this directory are executed at the appropriate stage. For example, in the environment directory you can set some variables which are then picked up by other elements, and in the stage directories themselves there are scripts like these. This is really where all the work happens.
Okay, then I want to talk a little bit about the block device layer, because this is an important point. There are a lot of things you have to put into the operating system image directly. Others, for example adding a package, can be done later with the configuration management system.
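
Coming back to the element layout described above, the following Python sketch creates a skeleton element in a temporary directory. The file names element-deps and package-installs.yaml and the stage directory names follow the disk image builder conventions as I remember them; the list of stage directories is a subset chosen for illustration, and the element name, the package and the script content are made up.

```python
import tempfile
from pathlib import Path
from typing import Iterable

# A subset of stage directories (assumed names, see the lead-in).
STAGE_DIRS = [
    "root.d", "extra-data.d", "pre-install.d", "install.d",
    "post-install.d", "finalise.d", "cleanup.d",
]


def scaffold_element(parent: Path, name: str,
                     deps: Iterable[str], packages: Iterable[str]) -> Path:
    """Create an element directory with dependency list, packages and stages."""
    element = parent / name
    element.mkdir(parents=True)
    # One dependency (another element) per line.
    (element / "element-deps").write_text("".join(f"{d}\n" for d in deps))
    # One YAML key per package; an empty value means "install with defaults".
    (element / "package-installs.yaml").write_text(
        "".join(f"{p}:\n" for p in packages))
    for stage in STAGE_DIRS:
        (element / stage).mkdir()
    # Scripts inside a stage directory are run at that stage.
    script = element / "install.d" / "50-hello"
    script.write_text("#!/bin/bash\nset -eu\necho 'hello from the element'\n")
    script.chmod(0o755)
    return element


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        elem = scaffold_element(Path(tmp), "my-element",
                                ["package-installs"], ["vim"])
        for path in sorted(elem.rglob("*")):
            print(path.relative_to(elem))
```

Running it prints the resulting tree, which is a quick way to see that an element is nothing more than files and directories with agreed names.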
So that need not go into the golden image. But you won't want, for example, to play around with your partition table, your LVM and so on later on. This is something you have to do first, directly during the build of the image.
A year or so ago, the complete block device layer meant: there is exactly one image with one partition, and that's all. There was a lot of work done during this last year to implement different ways. What we did is introduce levels. For example, there is one level in the block device layer inside disk image builder which provides the space. The typical way is to have a loop device which provides local disk space; other things are possible but not yet implemented. Then there is the level where mostly the complete work is done. This is mostly about combining the things you get from the first level. For example, if you have two loop devices, you can put a partition table on one and LVM on the other, whatever you want. And what is important: you can put this level into itself. For example, you can have a partition, and on top of the partition an LVM, and based on this LVM a cryptsetup and so on and so on. I have to say that cryptsetup is not currently implemented. So you get something like a stack.
The other levels are mostly what typically needs to be done: you need to create a file system on this, it needs to be mounted, and it needs an fstab entry, because we have to pass this over to the real system.
Here is one example so that you can get an idea of how it is configured. The complete configuration is done by means of a YAML file. It's one YAML file where you describe the complete hierarchy of the block device layout.
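
The stacking idea (loop device, partition, LVM on top, then file system and mount) can be modelled in a few lines of Python. This is a hypothetical model, not the YAML schema of disk image builder: the node names, the kind strings and the helper stack_for are all invented for illustration. Each node simply names the node it is built on.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: node name -> (kind, name of the node it is built on).
NODES: Dict[str, Tuple[str, Optional[str]]] = {
    "image0": ("local_loop", None),          # level that provides the space
    "root_part": ("partition", "image0"),    # partition on the loop device
    "vg0": ("lvm_volume_group", "root_part"),
    "lv_root": ("lvm_logical_volume", "vg0"),
    "fs_root": ("mkfs", "lv_root"),          # file system on the volume
    "mount_root": ("mount", "fs_root"),      # mount point for the build
}


def stack_for(name: str) -> List[str]:
    """Return the chain of nodes from the bottom device up to `name`."""
    chain: List[str] = []
    current: Optional[str] = name
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle detected at {current!r}")
        chain.append(current)
        current = NODES[current][1]  # KeyError if a base is undefined
    return list(reversed(chain))


if __name__ == "__main__":
    for node in stack_for("mount_root"):
        print(f"{NODES[node][0]:>20}  {node}")
```

Because every node only refers to its base, the same structure covers a plain single-partition image and a deep stack; the build has to create the devices bottom-up and tear them down in the reverse order.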
And here is one example; I think it's mostly self-explaining. You have a local loop, which is the loop device handling. Inside this, a partitioning is done. The partitioning is done with a master boot record, and there is, for example, one partition which is primary and so on. This is a very, very simple example to give you an impression. Typically, if you have something like logical volumes inside the partition table and you have five or six logical volumes, this gets really lengthy. So everything is possible here. This is just to give you an idea of what happens. Yes.
Then there came the time to write this MBR, this master boot record. These are 72 bytes which have to be at the correct position. I did a dd there so that you can see it; most of them are zeros. Of course the typical way is to look on the internet, grab something which is available and use it. We discussed this on the development side and we really didn't find anything which we could use, because all these tools are either not really meant for scripting or they are doing a lot of things they shouldn't.
For example, we had a look at parted, the GNU parted. One clash was the license, because it is really GPL, not LGPL, and the whole disk image builder is under Apache, so we might have a problem there if we had to use the parted libraries. Calling parted as a program would be possible, but parted itself does a lot of things which, in my opinion, it shouldn't. It not only writes these bytes, it also calls the kernel to update the partition tables, it calls udev, and it tries to optimize things. And this is really the point where we sorted it out: it tries to optimize things.
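
As an illustration of how little is needed to write a master boot record without any side effects, here is a minimal Python sketch. It is not the disk image builder implementation; the function names are made up, and it only fills the classic MBR layout: four 16-byte partition entries starting at offset 446 and the boot signature 0x55 0xAA at offset 510 of a 512-byte sector.

```python
import struct
from typing import List

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
ENTRY_SIZE = 16


def partition_entry(bootable: bool, part_type: int,
                    lba_start: int, num_sectors: int) -> bytes:
    """Pack one 16-byte MBR partition entry (little endian)."""
    status = 0x80 if bootable else 0x00
    chs = b"\xfe\xff\xff"  # conventional CHS placeholder when LBA is used
    return struct.pack("<B3sB3sII", status, chs, part_type, chs,
                       lba_start, num_sectors)


def build_mbr(entries: List[bytes]) -> bytes:
    """Return a 512-byte sector with partition table and boot signature."""
    if len(entries) > 4:
        raise ValueError("an MBR holds at most four primary entries")
    sector = bytearray(SECTOR_SIZE)  # everything else stays zero
    for index, entry in enumerate(entries):
        offset = PARTITION_TABLE_OFFSET + ENTRY_SIZE * index
        sector[offset:offset + ENTRY_SIZE] = entry
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    # One bootable Linux partition (type 0x83) starting at sector 2048,
    # i.e. aligned to 1 MiB with 512-byte sectors.
    mbr = build_mbr([partition_entry(True, 0x83, 2048, 204800)])
    print(mbr[PARTITION_TABLE_OFFSET:].hex())
```

The start sector is passed in explicitly, so the alignment is decided by the caller for the target system and nothing is derived from the host the build runs on.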
So parted looks, for example, into the kernel for the buffer sizes of the disk which I want to create, and tries to optimize, tries to align the partitions. But at that moment we are on the host system and targeting a completely different system. And we have measurements that using parted gave at some point only about half the performance; you lose about 30 to 40 percent performance if you don't align correctly. This is why we implemented this thing ourselves in Python, in 150 lines of code. So now it's there, now it's tested. If you want to write a master boot record, you can pick it up there and use it.
Then maybe some development insights. Disk image builder is an OpenStack project, and from the size it's really small: it's about 12,000 lines of code, about 7,000 lines of Bash, and Python mostly. The new block device layer which was created during the last year is about 3,000 lines of code. What I think is a little bit problematic is that there are really many, many adaptations. Let me put it this way: every third line is something like "if you have this system, we know that there is this problem and we have to work around it". This is a little bit problematic, but for a tool which aims to be universal you have to do this or you just cannot support these systems.
Also the design should be improved. I think we did already do a lot there, for example adding the block device layer, because before it was spread around, so each element had its own idea about it, and now it's really centralized. But there are other things which could be optimized.
What is sometimes really annoying to me is the slow development cycle. You have your great idea, put it into Gerrit, and nobody cares.
This is really something strange, and it takes weeks or months until somebody else picks it up and has a look at it. I'm not really sure about the other OpenStack projects, but for disk image builder this is really a problem. We have, I think, one patch outstanding for EFI boot which is almost two years old now. Yes, it's really hard. If you have some spare time, you're invited to have a look at this; everybody can review it and add their own thoughts there.
The last thing, which is a little bit related to this, is that all the different contributors really focus on their own development. They want, for example, feature X and they just try to put it in, even if it's not in the right place, even if it doesn't really fit, and sometimes you get something like a mess. So we just have to say no, not this way. Oops, that was one too far.
One recap. As I said in the beginning, disk image builder means one configuration for all targets, but this is just for an ideal world. As I said, there are a lot of problems. There is one matrix build: it's an element which in turn builds Docker images, which in turn use disk image builder to build all the other things, so what you get is a build matrix. You don't have to read all of it; it's just to give you an impression. On the top there are the hosts, so it starts with CentOS 7, Debian buster, jessie, stretch and so on, Fedora, openSUSE, and this is not really finished, so it goes on. And here is the target, so which ones you try to build. I do this every few weeks to get an impression of how stable it is and where some work is needed.
As you see, there is a lot of green, so there are a lot of things you can really build, but you have to be aware that some just do not work. It's not shown here, but to pick one example, on openSUSE there are no scripts to build up-to-date Debian systems, so you just cannot do this because something is missing.
But the problem with the missing script is that usually you just need to create a symlink, because debootstrap is using the same file for basically everything; you should add just one symlink. It will not run as root anyway.
I just repeat what you said: for two, four, five and six there is only the need to create a softlink to some other already existing version. Yes, you're right. I have no idea why this is done in this way. The problem is that you as a user just pick things and want to execute the program. Disk image builder can be installed on mostly every distribution, and you don't want to become root and remove or add some symlinks.
Okay, some advantages. One is speed: if you have set up a good environment, for example with an HTTP cache or a package cache for RPM or Debian packages, the complete build of an operating system image is typically done in two to three minutes, as long as you don't build a big thing and concentrate on the minimal things. You have one configuration for all targets. And as I said, it supports many, many distributions, architectures and host and target systems.
The next one: as we have seen, only a limited set of functions is tested during the CI itself. The CI behind Gerrit in OpenStack is Zuul, and I think mostly all things are run on the up-to-date Ubuntu version and the up-to-date Fedora version, and that's mostly all. The rest is not really tested during the CI process itself. A further disadvantage is that currently
you can build Docker images with this, but it's not really optimized for it. The images you get can be used as Docker images, but in size they are not really comparable to a minimal system. There are a lot of packages installed that could be removed, so there is some work to do.
This I really want to emphasize: what to put into an image and what not. This is what I learned over the last years. The thing is: be as general and minimal as possible. Don't install applications which you need on just 10% of your machines. Of course you have to install, for example, hardware-dependent things, so the hardware drivers which are needed to boot the system. But be as minimal as possible.
The second point, which I learned the hard way, let me say: don't do hardening this early. Hardening is a continuous process, so hardening should be done later on. Typically you have a general image and you don't know what will be installed in it later on. For example, you have an image and you add a Postgres database which you want to harden. If only the image is hardened, the Postgres isn't hardened. You also have to see what you want to do with the system and when you have to do this. The best practice I see is that you have a Puppet or Chef or whatever configuration management system with the hardening provided by security; there are also some projects on the internet. And they should run, I don't know, each day, because if you update a package, for example, the hardening may be overwritten.
The last point is what I already said: do the disk layout directly during the operating system image build. Apart from that, the general rule of thumb is to do things as late as possible.
Okay, that's it from my side. I think I'm in time. We have about 10 minutes for questions and answers. So if you have questions? Yes? Sorry, I didn't get this.
Okay, the question is if there is a plan to have CFS as the file system. Currently there is no such plan. It's not CFS? Sorry. CFS? Also no idea, sorry, I don't know this project.
So the question was what the difference is between this project and other projects like Packer. I cannot say much about it, because I don't know Packer, so it's hard to say.
Yes, please. We are currently, let me say, two and a half. There is one guy called Ian; he's sitting in Australia, so it's a little bit hard to synchronize. I think there is one additional woman who is mostly testing things. So we are only two or three, which is not that much.
Yes, please. The question is what the target is: is disk image builder only for OpenStack, or can you use it on Fedora or Debian, something like this? Disk image builder is released with OpenStack, but you can install it on mostly any distribution. For example, Ubuntu and Debian have their own packages; you can just apt-get install it. I'm not really sure about Fedora, but it's also available on PyPI, so you can just use pip install diskimage-builder and you have it and can use it. It takes about two minutes.
There is an outstanding patch. Yes, exactly, this was my idea. I created this patch to have exactly this inside disk image builder: an element you can just build and then have a test or development environment for disk image builder.
Yes, please. Sorry, I didn't get this. No. The question was: is there some ready-to-use element for data mining, is that correct? For Puppet, correct?
There is a puppet-master element for this; it installs a complete Puppet master.
Yes, please. The question is: what about GPT as the partition table? There is an ongoing patch. I hope that maybe next week or in two weeks it will be there, and it will be in the next release.
I'm not really sure if there is some time for more questions. Maybe you at the top. Okay, if I understood it correctly, the question is whether disk image builder has different ways of storing images, like a raw image. Internally it works on raw: it has a loop device with a raw image, and after the process it automatically converts it to whatever you give on the command line: qcow2, VMDK, tar, whatever you want. Exactly, everything is done locally.
Yes, at the top. The question was why disk image builder does not pick up an existing project for describing the partitioning, the block devices. Openly speaking, I didn't find one. If you can point me to something, I'm very happy. When I first got in touch with disk image builder, I looked through the internet exactly for something which gave me the whole bunch of things which can be done with disk image builder now, but I didn't find anything. So this was one reason to put it there.
Okay, we had this question already. It's about GPT partitioning. There is currently a patch outstanding, there is a lot of work done on this, and I hope it's merged maybe next week or so.
Any questions? The question is why I recommend against partitioning?
No, only against doing it outside the disk image builder. Yes, because it's something very basic for the image and you won't want to change these things later on. For example, I would have a problem with setting up an LVM with Puppet. Maybe it's possible, but in my opinion this is not the correct time to do such things.
I'm not really sure if I get you. Yes, you can do this also. You can do a partitionless thing, so you can just use a raw image. This is also possible, and then you can just put it into an image.
Yes, one more question, or no? Okay, time is up. Thank you.