container talks. We'd call their stuff fat containers, and you know, they're a really useful tool, but we tend to try to think of containers as this slim little wrapper around an application, so we have containers that have only one thing running in them, or one thing and its children. To introduce myself: hi, my name's Sven. I work for Docker Inc. I'm a support engineer there. I also work on the documentation, so I really appreciate bugs and issues and emails saying "that doesn't make sense, can you explain it better?", because then we can. And I also work on Boot2Docker, which is slowly becoming Docker Machine. Now, most people seem to get the feeling, because of our tutorials I think, and the way we talk about it, that when you build a container you're going to end up with all of Ubuntu, all of the dependencies, and you're going to end up with this many-hundred-meg container image. And if you're going to build from source it's even worse, because you end up with GCC or whatever it is that you're building with. So I thought I'd do a quick tutorial, to a small tiny room of people, about how you can not do that. One of the things when you're trying to manage your containers is that you've got to choose a base image, and we see people following the tutorials, so they'll have some things based on Ubuntu, some things based on Debian, some things based on CentOS, and you end up running out of disk space. The thing you need to think about when you're actually going to start using containers is that you should try to have a common environment. It's no different to virtual machines in that respect: if you have all the OSes, you're going to have to learn how to manage all of the OSes, so don't do that. Here's a list of some of the container images that we provide and their respective sizes. I personally like scratch a lot.
It's a little bit large, we're working on it, but as you can see we haven't changed it in a while, so if you find a security issue, please report it. I picked Debian because I don't like RPMs. I'm willing to admit that that's because I know Debian a little bit better, and it has less to do with systemd than I used to think it might. I use Debian wherever I can because, coming back to that table, it's quite a bit smaller, although the really cool thing you can see there is that the latest version of Ubuntu, too, is a very similar size; it used to be about 300 meg. So, having picked my base image, I'm going to pretend that the Docker Hub guys have done a good job, though personally I think that if you're going to deploy, you should actually make your own base image, because then you know what's in it and therefore have some idea of the security risks you have, whereas if you just trust us and don't bother learning about it, you're kind of, well, hoping. Trust us nonetheless, but the security guys quite rightfully say don't use docker pull, because in the end you're just running random binaries you got from the internet. As an example of some of the horrible things you can do: all I did was pull out some of the files I decided I didn't need, and suddenly I have an image here that's almost half the size of the base image that we distribute. The one we distribute is slightly more useful in that your locales still work, but if you're like me, you may not need them. The way I did this is I ran docker build on this Dockerfile, which starts from debian:7.7, which I think is wheezy, and just deletes some files, and that's it. Then I used docker export to create the tar file that is that image, and re-imported it as a single layer, because that Dockerfile will actually create three layers right now: the FROM one, which is your original image; the layer in the union file system that says these files have been removed; and then another layer for the metadata.
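A minimal sketch of that export-and-reimport flatten, assuming a hypothetical file list to delete (the talk's actual Dockerfile isn't shown, so the names here are illustrative):

```shell
# Build the stripped-down image; this Dockerfile is a guess at the
# shape of the one in the talk, not the speaker's actual file.
cat > Dockerfile <<'EOF'
FROM debian:7.7
RUN rm -rf /usr/share/doc /usr/share/man /usr/share/locale
EOF
docker build -t debian-stripped .

# The build above yields three layers (base, whiteouts, metadata).
# Squash them to one: export a container's filesystem as a tar
# stream and re-import it as a single-layer image.
docker run --name tmp debian-stripped true
docker export tmp | docker import - debian-micro
docker rm tmp
```

Note that `docker export` operates on a container rather than an image, hence the throwaway `docker run … true` first. The imported image also loses its metadata (CMD, EXPOSE and so on), which is why the talk later feeds a Dockerfile back in via `docker build` to restore it.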
I suspect there's work going on to make that into two, but nonetheless it's a base image, so you'd like to actually squash it down to one, and there you see my version. The really neat trick is to actually use that base image instead of using Ubuntu, CentOS, CRUX and so on, because with AUFS at the moment, and I suspect with OverlayFS too, but I don't know yet, I haven't used it in anger enough, when binaries come from the same layer in all of your images, then when you're running those containers they'll actually share as much memory as the kernel will let them. A slight disadvantage with devicemapper and btrfs is that they don't share memory, so your libc will be in memory again and again and again, which is a bit of a shame. And of course the other nice thing about using the one image is that you'll know that if you need to update your base image for security reasons, you just have to rebuild all your images; there's no "do I need to rebuild this, that and the other?", the answer is yes. Then the bigger trick is that that was just the ops answer. The development, testing and debugging answer is to make similar images that contain all of your tools; I seem to be doing a talk about that tomorrow. Instead of your developers getting a list of tools that they need to build your product, with each product having a different version of GCC or a different version of Java, what you do is you tell them: for this version of your product, use these containers, and for the next version, use other containers, which your ops guys or your dev guys curate. So your new developer turns up and goes "what do I do?", and you say "well, you're working on this, so docker pull today's environment", and there it is, and they're instantly on a machine that they're able to do work on. In tomorrow's thing, the example I'm going to show is NetBeans running in a container; obviously if you actually use Java you'd have lots of other things, I just have hello world. And the thing I read on a blog post the other day
was: if you do this with your development environment and your ops environment and your debug environment, escrow becomes nothing. Now, if you've worked on an escrow project after you've delivered to customers, you'll know it takes a few weeks of reverse engineering all these mad little assumptions, where somebody says "well, I started with a Windows box, and then I loaded this, and I've got devenv but I can't remember what version", and so on. You've got to go through the whole process of figuring out all of the unwritten requirements, whereas a Docker container either works or it doesn't, and if you're deploying from a Docker container, it's got all of the tools you need. Even more so if you maintain your own base image, because, well, it's a lot more stable now, but a year ago when we started doing base images you'd find your Debian or your Ubuntu image had suddenly changed dramatically, whereas if you do your own, obviously it doesn't, and you only put things in there for a reason. Okay, once we have our base images, the next thing is you're going to end up with a development environment Dockerfile that looks a little like this. This one's a canned example that I'm doing later, but there are lots and lots of things happening, and each one of those instructions will give you a new layer. While this sounds lovely, and it is, these layers will contain things that you don't care about. If, for the sake of argument, somewhere after I've done the make install up there, I do a make clean, because obviously I don't want the source anymore, the source is still in the layers that get unioned up into the final image, and obviously we don't want to push many gigabytes' worth of stuff only to tell it to delete them in the next step. So the next thing to consider is squashing down to fewer layers, and there are a few ways to do that. Actually, big question: who here has used Docker? That looks like more than half, good.
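The layer trap just described can be sketched as a Dockerfile; the package names and paths are my illustration, not the talk's canned example:

```dockerfile
FROM debian:7.7
# Each instruction below creates a new layer.
RUN apt-get update && apt-get install -y build-essential
ADD myapp-src.tar.gz /build
RUN cd /build && ./configure && make && make install
# This only adds a whiteout layer: the source deleted here is
# still carried (and pushed) inside the earlier layers.
RUN make -C /build clean
```

Folding the fetch, build and cleanup into a single RUN instruction, or squashing the layers afterwards, is what keeps those dead files out of the image you push.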
Okay, so I'm showing you commands that will make sense to those of you who've played with Docker; for those of you who haven't, I'm sorry, I pitched this at people who had asked me lots of questions. How am I doing for time? Cool, so I'll speed up and then we can have questions. So, there are a number of ways to squash. The one that I used earlier was basically to export and import again; people have written tools that essentially do the same thing, and they just simplify the process. And then there are some really cool tricks that you can do, where, for example, docker run builder in the second example spits out the tarball that then gets sent to docker build; it's kind of called a transform, named after the person who was the first one to talk about it. So that's one way that you can get your initial build: you have a Dockerfile in your system that does something like configure and build stuff, and then spits out a tar file that gets built into your final binary, which could look much like the earlier one. Similarly, there's the other trick I saw on the internet just recently, where you just have a docker run that will download all of the prerequisites before you then build the image; it's not quite as powerful as the second example, but it's still very useful. And the last one is where you're basically telling Docker "I'll give you the keys to the kingdom and you can do whatever you like". The thing to note with that last example is that you're handing your container full and utter control of your computer: you're giving it root access, which is really handy, but hopefully is really scary to most of you. I do it all the time, I love it. Okay, so the real thing is static binaries, and as you can see in the example there, people seem to like Go because they think "I can make myself a static binary", except of course there are some libraries that will still be needed here and there. So instead we work in assembler: this container here is a huge 900 bytes,
and spitting out that text is all it does. We use it in our demos to make sure that your Docker actually works, but it's an example of what I think Docker containers will work towards, because all you really want to have is a specification of not only how to get the thing running, but what few things it does. So this one, in my opinion, should eventually have a spec that goes with it that says: I don't need access to any devices, I don't need any privileges, I need nothing, I don't need a network, all I need is to throw you some characters, that's it. That's something that at the moment seems to me to be missing from everything: RPM, Debian, you name it. We don't know from your deb file what devices it's going to access, or which files it's going to access; you've got to kind of read the documentation. To that end, I thought I'd do an example of building nginx and making a microcontainer out of it, and that's the horror command that I ran to make it go. I guess I shall explain it. Okay, so on the first line there, what I'm doing is, because we don't have docker build -f yet, so we can't specify another file as a Dockerfile (that's next version), I'm catting the Dockerfile, and I'm not using any of the files in my context, in my current directory, apart from that. Then my Dockerfile goes off and does all the work: it installs the prerequisites I need to build nginx, then it gets all the source code, configures it and builds it, and gives me the things that I need to test it; I'll go into the lower lines later. So I now have a build container that's huge, I think it's 1.4 gig, which I can then use to spit out a tar file that contains just the output result. The second line is basically just grabbing that tar file, and then on the third line I'm importing that as a single-layer image and running it. And of course it crashes, because for some reason it's expecting a Unix system, so we use inotify to tell us what
it's up to. I was originally going to use ldd, but then I kind of figured this is actually better, it'll tell me more. And happily, it tells me I think I need like six files and some libraries; unfortunately you can't see the rest, but it needs /etc/passwd, /etc/group and some NSS config files, because obviously I didn't configure this static binary to be as simple as I would have liked. Those files I then make into a new tarball; the first tarball is actually overwritten by this one. So here's the second tarball, where I'm including all the extra dependencies, and there are so many of them. So that there is effectively the definition of my new container. The line above I'll use later, but I'm adding it into this tarball so that I can just cat the tarball straight to docker build to build a new image that has the metadata of how to run it, what ports to expose, that sort of stuff. And there we go. Having done it: the GCC image that I based the original build off is huge, nginx adds another 200 meg, and then the little beastie that we just made, that contains just enough to run nginx, is a huge 21 meg. I'm assuming that somebody who actually spends a bit of time on it can bring it down a little bit further; I basically pulled the instructions for building nginx off the net, so now I'm running random code. Okay, and if we go back over here again: with that container, I was running this docker run command, so I'm actually telling it the application, nginx, to run, with the parameters to keep it in the foreground, and I really would rather not do that. I'd much rather just go docker run -p 80 nginx. So what I've done is I've added that Dockerfile to my tarball and then built it. The first one that I showed before is a docker import; the second one is just catting the tarball and doing a docker build with that Dockerfile. The difference in size I think I show later, but the difference is in how we run those two; in that case, the two should be the same size. What I do after
that is, I go "well, that's hard to debug", so instead of basing it on scratch, I base it on busybox, and now I can actually docker run the nginx image with sh and have a look inside my file system. But I kind of think those file sizes are starting to get closer towards what I'd really like to have. And as I say there, one of the ways that people seem to be thinking about small containers is to have a microservice in each container, and then to orchestrate those to talk to each other, and then put a front end in front of it, and for that, obviously, a small container is a lot faster to upload to AWS or wherever you're deploying (don't ask me about the others, please). Then, for extra credit, the thing I didn't do is make it two layers: if we had a base image that was busybox with those libs added in on top of it, then we'd be able to deploy that, and if the app changes, we'd only have to send the 10 meg or so that is the nginx layer, and we're done. Okay, and the other thing I should really show is: those are the Dockerfiles, so I've got the busybox one up there, and then down below I'm doing this as well with a new base, which was supposed to be Debian, but I didn't do it. So that was about all I wanted to show, and now I have a chance to give you answers to questions.

Awesome. We have about seven or eight minutes for questions, so hold your hand up and we'll get the microphone around to you. This gentleman down here first.

Sorry, I'm new to a lot of the things you're talking about. I was trying to understand why you were using that inotifywait thing; I mean, were you waiting for a change of what?

Well, inotifywait really tells you what files were accessed, and when, and so what I needed was to fix the dependencies, because it was obviously missing some libc libraries.

Okay, so waiting for it to notify you that you've got the correct libraries installed before you start?

So the nginx image I made had nothing but nginx in it, right, and so
it didn't have any of /etc, right, and I didn't want to go and look through the source code to figure out what files it needed, and they don't tell you, because these are assumptions: you know you've got a Unix system. So instead, inotify, which is kernel functionality, just tells you when files are accessed.

Yeah, so as soon as nginx starts up, it goes and accesses... to fool nginx into thinking...?

No, no, no. What I'm doing is running this in a container where it works, yeah, which is the build-static one, right, and watching what it does to the file system. That tells me what files I need, that I can then put in a slim container.

I see it. So in other words, what you're doing is using inotify to tell you which things nginx was looking for, and then you can provide those things only.

That's right. My initial thought before I started work on this was to just use ldd, but there's obviously more than that.

Thank you.

Yeah, my question is around the same thing with inotify. Is there not some level of risk, and I'm not a C developer, but is there not some level of risk that you're not necessarily seeing some of the files it needs until later on?

Definitely, yeah, definitely yes. Again, my assumption here is that I can do this because it worked for what I was trying to show. Past that, there's an awful lot of things you need to know, and hopefully if you're deploying it, you know them. But that's what surprised me when I started thinking about this: none of the package management solutions we have right now document that, because the assumption is that if you're installing a deb, it's a deb-like system, and there's a certain minimum, about 85 meg worth of files, that are in a base Debian that works. And that seems a lot less secure than knowing "I'm specifically going to touch these six files and that's it".

Yeah, I had a collection of containers which shared some common base images, and I was trying to
produce a single tarball, just a single tarball which had the base image and then just the extra images on top, and I ended up writing a script which built a merged repositories file and then built a new tarball with all those images in it. Did one of those shrinkage commands that you showed there do something similar, or was it even valid?

Not the ones... no. I don't know what docker-squash really does; most of them are going to end up basically just extracting, as in doing a docker export, though I think docker-squash might take each of the layers and squash the ones you ask it to, so you've got to hope that it does the right job, right? But the tooling around that isn't done, because it's not something that we're supporting ourselves directly; third-party people are trying it out and seeing what happens. So no, I can't answer your question, much.

So you were using nginx, which is statically compiled. What would you do differently for dynamic languages, with, for example, Ruby's dependency management?

I would ask somebody who's an expert in building those things to do it. I've got a little more chance of doing it in Perl, but even then I kind of went "that's going to be complex", and it probably depends on the complexity of what you're trying to package up. If it's something nasty like the open-source projects I used to work on, Foswiki and TWiki, then I'd probably actually just set up inotify and bring in everything, because there's obviously a lot of system tools that are installed but that we don't actually need. The attractive thing about using inotify is that if you can exercise all the code in your app, then you're going to get a canonical list of the files it needs and the permissions that are going to be needed. So, for example, nginx I think only wants to read the password file, it doesn't want to write to it, but then we could create a container that only allows it to read and certainly not to
write, and it would exit on failure. Therefore we've learned a lot about whatever it is that we're packaging, so I kind of think most of this is going to be a learning thing, and eventually we'll know the answer to that.

We have time for... oh, sorry, I didn't mean to interrupt you. We have time for two more questions.

Does Docker provide any tools like SystemTap or DTrace for looking inside containers?

I think I'll punt on that and just say that Docker is a thin little application-launcher beastie that sets up some kernel parameters to wrap around what an application can do and see, so if you want to do any of that kind of thing, what you're looking for is the normal Linux debug and kernel tracing tools. Whether they can enter your namespace is a little bit painful sometimes, but it's not something that is intrinsic to Docker; it's a Unix system.

I did something similar to this with MySQL. I just copied the binary in and basically kept on running it, and when it complained about a library, copied the library in, and repeated.

Exactly.

I was using the busybox container as a base, and what I found is that busybox has a number of different tags, and the one with the Ubuntu tag had most of the libraries in it. So I suppose my question is, would that be the correct way to do it? Is that the intended use for the different tags of busybox?

I think the main answer I was given for why we have these different busybox images is because people have religion, and so some of them will prefer the Buildroot one, and some of them will go "Buildroot? Ugh", and so we give them the Ubuntu one. But no, I didn't know that there were that many more libraries in there; I've learned something new.

We'll let the very last question be this gentleman right here.

Do you think that a significant portion of this process could be made a lot easier if there was an explicit syntax
for layer creation, rather than an implicit one on every single command? Almost like a checkpoint, declaring "this is where the layer starts, this is where the layer stops".

I don't think there's anybody who feels that having some way to control when the layers get made is a bad thing. The reason we don't have it is because we've tried a few implementations, and none of them satisfied everything without causing more trouble. I did read that somebody had tried something else, and it was so far looking very successful, so you know, there is hope. The problem is that in the Docker project, because we've become so successful so much faster than we expected, people want things before we've experimented enough to know whether we're going to shoot ourselves in the foot. To me, bind-mount volumes are a lovely example of something that everybody wanted that is going to cause trouble for the rest of Docker's existence, and we've learned from that, so we're now much slower about just saying "yeah, we'll do that", writing some code, and then going "oh, shit". So yeah, definitely, being able to control all of those things is important; it's just a case of getting it right.

Cool, thank you, Sven. That's all we have time for today, but thank you so much.