Okay, hi everybody. Good afternoon. My name's Ewan Slater, and I want to talk to you today. It's going to be a short talk, 20 minutes, about why you want to shrink your containers and some tools for doing so. A couple of disclaimers before I start. One is I work for Oracle, but what I'm talking about today isn't actually my day job, so I'm here in a personal capacity. All views expressed are my own. I've been a minor contributor to Smith and Crashcart, which are the tools that I'm talking about, mainly as a user and tester. And I'm not a dietician. But I do know that if you eat too much of this, then you will get fat and your scales will tell you horrible things. And the consequences for people of getting fat are not good. I mean, being fat is horrible. You know, you can get diabetes, heart disease, stroke and, you know, a thousand other horrible ways to die. So fat's bad for people. It's not great for software either. We tend to see layers on top of layers, it never gets smaller, and the developers wear bigger and baggier t-shirts as time goes on. And the software starts to become too big to fail, which very rapidly means that nobody dares to change it. So it's too difficult to change. And there's a CTO that I was speaking to in Finland about 18 months ago, and he said he'd worked out that it would take about two years to deploy Hello World as a feature into production on one of their critical systems. So the software's overburdened, it's difficult to deploy, and being difficult to deploy makes it difficult to scale. So fat software's not great. So how do we start to make our software, or our overall stack, thinner? Well, we start by trying to lose some layers. So when I started out, the only way that you ever did any deployments was physical. You had to worry about hardware. You had to worry about the operating system. And you had multiple services and applications running on a box.
And the box was often shared, which meant that various people argued about what level the OS should be at and what libraries should be on it. So then we managed to lose the hardware layer by moving to virtualization. You may have multiple services and applications in your VM, and you've got the operating system, but at least it's your operating system, and you decide what release it's at. However, carrying the OS around just to run an application on is heavy. So that's excess weight again. So that got shed when we started to see people move to containers, and containers typically have a single service or application in them. So everything's great. We're looking really thin. But TOFI. Anyone ever heard of TOFI? Thin on the outside, fat on the inside. So there's two chaps here. They're the same age, same gender. This guy's got about two litres of internal body fat. This poor sod's got nearly six. They both look the same. They both look thin, but this one's fat on the inside. So why is my container TOFI? Why does my container look thin on the outside but it's fat on the inside? And the answer is that temptation can be very hard to resist. I think this is a really good quote. It's basically pointing out that Docker makes it too damn easy for you to build containers in a bad way. It makes it too easy. You just go and grab an image, put it together, and it's really easy to create a big container full of things that you didn't necessarily want, which can include security holes and things like that. The Docker manual itself says one of the difficult things about building images is controlling the size, because each command lays down a new layer, and then you're responsible for kind of cleaning it all up. And what's wrong with fat containers? Well, for a start, they're physically big, which can make them a pain to distribute if you're sending them over networks and stuff. You can easily get up to nearly a gig in size.
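To illustrate that layering point, here is a hedged sketch (not from the talk; base image and package names are arbitrary) of how each Dockerfile instruction commits a new layer, so a later cleanup step hides files without actually shrinking the image:

```dockerfile
FROM ubuntu:22.04

# Each RUN instruction commits a new layer:
RUN apt-get update                        # layer: apt metadata added
RUN apt-get install -y build-essential    # layer: toolchain added
RUN rm -rf /var/lib/apt/lists/*           # layer: files hidden, not removed.
                                          # They still exist in the layer above,
                                          # so the image is no smaller.

# The usual fix: install and clean up in one RUN, so the unwanted
# files are never committed to any layer:
# RUN apt-get update && apt-get install -y build-essential \
#     && rm -rf /var/lib/apt/lists/*
```

This is exactly the "you're responsible for cleaning it all up" problem the Docker documentation warns about.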
You can use Alpine, but that brings its own issues. And there's just a general issue around bloat. Again, these things just get bigger. The other thing is that for containers, as well as for people, being fat can make you feel insecure, or it can actually make you genuinely insecure. So if you put the whole of the Linux user space into your image, that's a pretty big attack surface that you're giving somebody. It also means that if somebody manages to compromise your app and get into the container, there may be lots of other stuff kicking around in there that they can then pick up and do damage with. Another area is vulnerability management. Containers are immutable. We can't patch a running container. What we have to do is tear them down and stand them up again. Now, if you've got a whole fleet of containers out there and a vulnerability comes up in a library that's shared across all of them, you've got a number of questions. One of which is, does this container here really need that library? Is that code actually being executed? Do I need to tear all these containers down and stand them up again? And then you kind of go, well, yeah, to be safe, I really need to do it to all of them. So what you want to do is to only be patching the things you need. Now, this was something that I actually came across in real life. This is how I got into this area, because I was working with a customer and their security department were asking precisely these questions. How can we be sure that the code that is in the containers is the right code? How can we be sure that we are tearing down and replacing containers as necessary, but not too often? So this brought me to the concept of microcontainers. A microcontainer contains only a single executable, just the thing that you need to run, and the dependencies of that executable. It runs with a read-only root file system, and the files are all owned and readable by a single user.
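Those run-time properties map onto standard `docker run` flags. A minimal sketch, assuming a hypothetical image name and uid:

```shell
# Enforcing microcontainer-style properties with standard docker run flags
# (image name and uid are hypothetical):
#   --read-only     read-only root file system
#   --user          everything runs as one non-root user
#   --tmpfs /tmp    writable scratch space only where explicitly needed
docker run --read-only --user 1000:1000 --tmpfs /tmp example/microapp:latest
```

Building the image so it only contains the executable and its dependencies is the part the rest of the talk is about; these flags just enforce the read-only, single-user behaviour at run time.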
The result is we eliminate the layers, so we reduce the complexity of the image. It's fast and easy to distribute because it's small. We've got a smaller attack surface because we've only got the essentials that we need in that image, and we have certainty over our vulnerabilities. If there is an issue with a library that is contained in that image, then we know that we have to tear it down and stand it up again, because we know that that code is there and it is required. So I made this case back to the customer's security department, and they were like, okay, yeah, we're good with that. So having established that we want microcontainers, how do we create them? What are the ways of doing it? The oldest way is to use something called the builder pattern. Has anybody ever done this? Yeah, don't. With the builder pattern you have two Dockerfiles. You could have more if you really wanted to. You've got a development Dockerfile which creates a fat image, and within that fat image you actually do the build of your application. And then you have a production Dockerfile which is going to be your runtime environment. And you have a build script that runs the first Dockerfile, builds the software that you want, and then extracts that out and copies it into the production image. Now, there is no reason in principle why this shouldn't work. But experience has proven that it's actually quite difficult to maintain and it's error prone. So the nice people at Docker, as of Docker 17.05, came up with the idea of multi-stage builds. With a multi-stage build you have a single Dockerfile, and within that Dockerfile you can create successive images, and you can copy from one image to another, and that gives you a single final image that you can then go and run. So here is an example of a multi-stage build. This one creates a static website running in a container. So lines 1 through 7 are basically our development Dockerfile.
We take Ruby as our development image. We install a few other things including Jekyll, which is our static website builder. And then we run that, so we've now got our static website. And then these three lines are our production image, which gives us a nice skinny nginx image. We just take the site directory and copy that in, copy in our nginx config, and we're done. We've got a nice skinny image with just what we need for our website. That's fine if you've got something like a static binary or a static website, something that's self-contained where it's easy to say what you're going to move. The issue is if you've got dependencies that aren't easy to wrap into the image, things that you could potentially forget, because you could be sitting on the plane next to the person who's forgotten their teddy, or the business user who finds their functionality isn't working. There's another feature of Docker, an experimental feature called squash, which, as the name suggests, basically squashes everything into a single layer at the end. So you're effectively doing that. I've tried squash and I've found it doesn't really do very much from my point of view. And it's not helping you with the vulnerability management side of things. You're just squashing everything down, and if there was something that you didn't want in the suitcase, that's still coming with you. So it would be really nice if there was a tool that gave you an image in a single layer, did some level of automatic dependency resolution for you, put everything under a single user so you get idempotent builds, and gave you more secure images. Now the good news is that there is: there's a tool called Smith. This was a tool that was created at Oracle and is used internally at Oracle to build microcontainers. And I found out about it when I was going around asking the questions that had been prompted by the security people.
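Since the slide itself isn't reproduced here, this is a hedged reconstruction of the sort of multi-stage Dockerfile being described. Versions, paths and file names are assumptions, not read off the slide; `_site` is Jekyll's default output directory:

```dockerfile
# Stage 1: the fat "development" image. We only use it to build the site.
FROM ruby:2.7 AS builder
WORKDIR /site
COPY . .
RUN gem install jekyll bundler
RUN jekyll build                 # static site lands in /site/_site

# Stage 2: the skinny production image. Only the built site and the
# nginx config are copied across; the Ruby toolchain is left behind.
FROM nginx:alpine
COPY --from=builder /site/_site /usr/share/nginx/html
COPY nginx.conf /etc/nginx/nginx.conf
```

The `COPY --from=builder` line is the whole trick: the final image is just the last stage, so nothing from the build stage survives unless you copy it explicitly.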
It was created at Oracle and we open sourced it, so it's under the Apache license or UPL, so you're free to look at it, use it, whatever. And it works in a couple of ways. The way it's mostly used inside Oracle is to build microcontainers from YUM repos and RPM files. The way that I've used it most is to shrink existing containers. So you put a standard container in one end, you bash it with Smith, and you get a microcontainer out at the other. If you're shrinking a container, you download your container in OCI (Open Container Initiative) format, or point to its URL. And then in both cases you define a Smith YAML file and run Smith. That gives you an OCI format image, which you then need to load into a Docker repo before Docker can run it, because Docker doesn't run OCI format images yet. So this is Hello World by Smith. We're taking cat from the coreutils package and we're going to cat out whatever is in /read/data. cat is actually quite big. Anyone ever looked at the source code of cat? It's quite complicated. Bloody hell. This next one is one of mine; this is shrinking an existing container. Here's the image that I'm taking things from. There's the path to my application. I happen to know that I'm going to need Ruby, so I put some of that in, and there's my command. And then what happens is that Smith unpacks the layers of the image, copies out the files you've specified in the Smith YAML, loads the library search paths and recursively copies in the dependencies, and then puts it all together in a single layer and packages it as a new OCI image. So a standard httpd image, if you pull it down from Docker Hub, comes in at about 180 meg. Bash it with Smith and you can get it down to under 5 meg, and you've still got a working web server. My Dogsbody container was nearly a gig, and I can get that down to under 85.
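For readers who haven't seen a Smith config, the Hello World example described above would look roughly like this. This is a sketch from memory of the Smith documentation, and the exact key names may differ from the current release:

```yaml
# Hedged sketch of a smith.yaml for the "Hello World" example in the
# talk: cat, taken from the coreutils package, printing /read/data.
# Key names are recalled from the Smith docs and may be out of date.
package: coreutils
paths:
  - /usr/bin/cat
cmd:
  - /usr/bin/cat
  - /read/data
```

For the shrink-an-existing-container case, the config would instead name the source image, the application path, and the command, and Smith pulls in the library dependencies automatically.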
So my experience, for best results, unless you've got a static binary or something self-contained: build your fat image and hammer it with Smith. It's easier and quicker than a multi-stage build, I have found. For Dogsbody, I was trying to do it with a multi-stage build and my image was up to 150 meg and I still hadn't got all of the dependencies in. So automatic dependency resolution is really nice. Whichever method you use, I hope that you can see the value of using microcontainers as opposed to big containers. Now, unlike me, you all write perfect software, so this never happens, right? What do you do if it crashes? Well, you know, it's a microcontainer. Yeah, please don't do these things. I mean, you start to get people going, oh, I'll run sshd in my container as well and then I'll log in, or I'll embed my debug tools in the image. Don't. I mean, it's the equivalent of going to the gym and getting yourself all fit and then going down the pub and having about six pints to reward yourself. You might as well have just gone to the pub. A lot less effort. So there's a number of challenges with debugging a microcontainer. You can't just mount a directory into the container on the fly. You can't log into it. If you just restart it, you may well not recreate the same error conditions you had before. And even if you could mount a directory into the container with your tools in, the tool locations may not be right and you'll get potential conflicts with the paths and the libraries. So there is a requirement to have the ability to investigate the crash site, not just restart and hope to see the same thing again; to focus on solving the problem rather than container hacking; and to have your debug tools available only when you need them. So we've got a complementary tool called Crashcart, again open source. And what that does is it allows you to debug containers or microcontainers: you attach to the running container process.
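Crashcart's own invocation isn't shown in the talk, but the underlying "attach to the running process" idea can be seen with stock Linux tools. A sketch, not Crashcart itself; the container name is hypothetical and this needs root on the container host:

```shell
# A container is just a process on the host. Find its host PID, then
# enter (for example) its network namespace using the host's own `ss`
# binary, a tool the microcontainer itself doesn't carry:
PID=$(docker inspect --format '{{.State.Pid}}' mycontainer)
sudo nsenter --target "$PID" --net -- ss -tlnp
```

Crashcart goes further than this: it temporarily sideloads an image of debug binaries into the container's mount namespace, so the tools are present only while you're actually debugging.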
Remember, a container is a process, not a VM. You attach to that process, and then you're able to sideload an image with Linux binaries for your debug tools into that container, and you can use those sideloaded tools to debug the running container. So for example, if you wanted to see what files were open in the container at the point it crashed by running lsof, you can put that on the image, load it in and see what was going on. So those are the options that I know of. Please feel free to get involved. You can fork Smith or Crashcart. I've created a couple of labs, which are up on my GitHub, that you can use to go through those. We've got a Slack channel; comments, feedback and pull requests are welcome. The takeaways: make smaller things, only put in the things that you need, use the smallest container you can. And the benefits you should see from that are greater simplicity, better security and improved agility. And I think I've got a minute or two for questions. If not, grab me later, I'm around all day, or get me on Twitter. Yeah, I mean, there's different ways of doing it. So the question is, why would I use Smith and why would I not just start building my stuff up from scratch? You can build your stuff up from scratch. The people in Oracle who looked at this were operationalizing containers, and that's how they came up with it. They are creating microcontainers from YUM repos and building the packages that way. My personal use cases have been ones where I am creating applications and then I want to get a microcontainer out. And I have found this the easiest way to do it, principally because it's made sure that I haven't left any of my dependencies behind. So for development, I just tend to go with the standard Ruby image to start with, which produces a pretty fat container out at the end. And then the actual image that the microcontainer is based on is Oracle Linux Slim. Sorry, can you come up and get set up, and then we can continue with questions.
Sorry. Just to add to that: if you start from scratch, whether you're starting with Ruby or with Go or anything else, you still need dependencies, and if you start from scratch you need to compile all of that yourself. It's easier to use an image that already has all of that, and once you're happy with your results and your application is running, you just leave the rest out. If you start from scratch, you will have to pedal a lot more. Yeah. And like I said, when I tried doing it with a multi-stage build and building it onto Alpine, so not quite scratch but going down that route, it was precisely that issue I had: my container was getting bigger and bigger, and still, when I tried to run the application, it was saying I haven't got this or I haven't got that. Whereas if I've done it with Smith, I know I've gone all the way to the bottom of the dependency tree. Yeah. I mean, I'm a lazy bugger. So I would just say use the simplest thing that you can. And if you're creating a static image, yeah, everything's wrapped up in one exe, or like the Jekyll case, your website's in one directory. Why do something in a more difficult way?