Hi folks, are we ready? I would like to introduce Tomasz Tomecek, who is part of the modularity team, and his talk, "Containerization: best practices or worst practices".

So hey, I hope you had a great lunch. Thank you for the introduction, Peter. I'm Tomasz, and as Peter said, I'm on the modularity team and my focus is containerization, hence the talk. Usually everyone gives talks about best practices: best performance, elegant solutions. And I realized, okay, let's try it the other way around. Let's try worst practices.

So what will we be doing today? We'll be doing worst practices for most of the talk, and then I will tell you why they are the worst and what the best way to do it actually is. That's the agenda. In the worst practices we'll cover several topics, the ones that are most commonly misinterpreted, six or seven of them. But first: please don't try this in production. They really are worst practices. It's okay to try them at home, but don't try them in production. Seriously.

What we'll actually be doing is creating an nginx image, and we'll be doing it the worst way ever. So let's do it.

Before we start, we need to figure out the platform we'll be using for our image. In the context of Docker images, this is called the base image. How do we pick it? These base images are usually just tiny Linux distributions. There are all sorts of variants, and if you open Docker Hub you'll see tons of images and have no idea how to pick the right one. So what are the criteria?
For me the first thing is that I usually pick the thing I know. For example, I use Fedora, so I use Fedora images. If you know something else, pick the thing you know. Then there's support: what happens if there is a CVE in the image? Will the distribution update it? Will it be updated tomorrow, or in a year, or is the image already five years old and full of CVEs? Those are the criteria for picking the base image.

Once we've picked a base image, we need to install the software itself, in our case nginx. Again, we can take it from the distribution or we can compile it ourselves; it's pretty much up to us.

Okay, so let's start writing the Dockerfile. This is what we pick. The worst thing you can pick is just a random image you found today, the one everyone's talking about because it's new and cool and secure. It was released yesterday and it's still in beta, but you can use it in production. For me, that's the worst thing to do. And when you pick an obscure platform like this, it's usually hard to install anything inside. So you just Google "how do I install nginx", you find a script, and okay, it solves my problem. I just curl it and pipe it into bash, it runs, and everyone is happy. But as you can see, the URL is pretty long, right? So let's use bitly and shorten it. This looks way nicer, right?

So yeah, we have the software. Great. Let's run it.

Before we run it, we need to figure out configuration. These days most services are configured with config files.
So that's the standard. In the case of Docker images, what most people do is let you configure the image by passing environment variables. When you start the container, an init script runs, parses all the environment variables, prepares the configuration, and then runs the service. That's the usual way, and it has a pretty nice user experience. Another way to configure it is to prepare the config yourself and mount it inside the container. That's okay for one node, but in a cluster you don't want to SSH into each node and put configs there. The other option is to overlay it: you take the image, add the config, and that's your custom-built service with custom configuration. That's also okay. All of these options have ups and downs, and it's really up to you to pick the right one.

Okay, it's configured; now we need to start it. As I said, you can either start the service directly with the binary, or you can write a script which does something and then starts the service. Those are the options.

So how do we do it the worst way? Regarding configuration, the worst way is just: don't make it configurable. Or make it hard for people to configure. You put in some really complex script, and when people try to configure it, it breaks, and you end up with thousands of issues saying it doesn't work. This actually happened to me recently when I tried the WordPress and MariaDB image, which is an official image on Docker Hub, and it just didn't work. And it's the official image. So this is how you can make things hard for everyone.

Regarding starting the service: let's say our requirement is that whenever the service fails, we want it restarted.
In my opinion, that's the job of the orchestrator, right? For example Kubernetes or OpenShift: they should take care of it. But we can still install supervisor inside and let supervisor handle the service itself, which is another layer of complexity. I think this is the worst thing you can do, but some people do it. We'll see.

Next topic: metadata. For me this is super important, and I'm really sad that no one has figured it out yet, or that there's not much work being done in this area. When you download an image from Docker Hub, the first thing you do is open the web page, read the README file, and see what the options are and how to use it. So why isn't this information already baked into the image? Why would I need to go to the internet to see how to use this image? I want to have this data inside the image, and no one does this. When I download an image, how do I know what's inside and how I can use it? It's impossible; you need to Google it and find some documentation. It would be great if you could just run man on this image and a man page would open with all the data.

So what's the worst thing we can do? Obviously, we can just not write anything. It's up to you to figure out what's inside. If you're in a company and you maintain some images, and then you leave the company without filling in this metadata, the next guy comes in and it's like, oh my god, what is this, how do I use it? And is there anything which could be even worse? How about we fill in wrong data? Okay, this is an nginx image, but actually it looks like it's a database and httpd and runs on this port. It can happen: you have a bug in your infrastructure, or you just had a bad day and you think you're in this repo but you're in another repo. But filling in wrong data is really bad.

Okay, what's the next topic? Image size.
Yeah, I could talk about this the whole day. It's a huge topic. But first we need to start with some theory: layers. This is the technology behind how images work. You have all these layers, say one image is composed of 12 layers, and copy-on-write technology is used to stack them on top of each other, and you get a container. You can imagine this very easily with Lego. You get 10 Lego pieces. On their own, each is just one piece, and what do I do with it? I can't do anything. But if you want to build a tower from Lego, you put one piece down, another piece on top of it, and you get your tower, and that's our container. So that's the theory.

The most important thing is that each layer is immutable. You can't change it. If you put something into a layer, it will stay there forever and you can't do anything about it. If you have an image, push it to a registry, and pull it somewhere else, all the layers are downloaded, and the data in them is there forever. That's super important. The only way to change it is to actually rebuild the image, and we'll get to that later.

So what affects the image size is these layers. If you put in something which shouldn't be there, it will be there forever, and your image just gets bigger and bigger. Say it's five gigabytes and you have ten versions of it, so that's 50 gigabytes on your disk. That's definitely not what you want. So what is this data which usually resides inside?
For example, say you want an image and you want to compile the service yourself. To do that, you need to put all the compilers and dependencies inside the image and then compile it, and all these dependencies will stay there forever. That's not good. Why do I have GCC in my production nginx image? I don't need that. So this is all redundant data which doesn't need to be there. And it's not just build tools; it's also documentation, locales, that kind of stuff. Do you need it in production? Well, maybe, I don't know. Then there are redundant services. As you can see, I installed supervisord inside, and do I really need it? It's just occupying space and maybe making things more complex. I think it shouldn't be there.

So how do we make this bad? For example, we can do something like this. We need to install several packages, and our developers love zsh and tmux, so they need to have them in all the images. So we install them, and we do every installation in its own instruction, which means all the package metadata stays there, and then there's the next RUN and more new metadata, and you end up with lots of metadata which is just useless and shouldn't be there.

Let's continue. Okay, security. Yes or no? I guess no.

So this is the example for security. We have this git repository on Amazon, and the content for nginx is there. It only listens on SSH, because our security people are crazy, and we need to get it inside the image, but we don't really know how. So okay, we just put all the SSH keys inside the image and clone the repository. But we actually care about security, so we remove the
So we remove the We remove the keys so like no one can like bridge or get inside And as you could see in the previous example, I was also installing SSH there But like if you have SSH inside, you'll need to like authenticate somehow So like you can either have like certificate there or just set like password for route and just like easy and convenient So let's just do that. So yeah, this is pretty much our image now Okay, and I have a set demo prepared for you And the reason which is set is not like which the reason it is set. It's not that it's like worst It's other reason. So so when I was preparing this talk, I had an idea like okay So I want to do like a really great demo which would like around laugh. So I was thinking about it Like how do I do it? So I realized like okay, so I can create this worst image which has some like Vulnerabilities and let all of you like pull the image and like find the vulnerability and like hack the system I want you to do and then I could like make a competition and so I brought these like prices for all the people So I tried it yesterday like okay So I just try to pull the image like if it works and like have the cool demo for you And I realized that the Wi-Fi in here is really horrible and you can't like pull anything It just times out and so I realized I won't be having my super cool demo then So yeah, it's a set demo because I will just show you like why it's so bad, but I wanted to like had it Like way better, huh? 
Yeah, definitely. Yes, so just ask a very good question.

Okay, so let's proceed to the demo. I will show you the image, and I would like to show you why it's insecure. I imagine most of you know what's wrong, but I would like to show you how trivial it really is, if you have some vulnerability in your system, to break in and get access to the systems.

Okay, I need to put my terminal on the other screen. Can you see it? Bigger? Yeah, that's going to be a problem, actually, because I'm running this super cool terminal emulator, st, if you know it, and you change its configuration by recompiling it. So just give me 20 minutes. Okay, that wasn't 20 minutes. I can close this small one and use the bigger one. Is it better? Okay.

I already built the image, so we don't have to spend 20 minutes here watching an endless build. This is the image; it's here. You can see it's nine hundred megabytes, just with nginx. If we compare it, for example, to the Fedora 25 image, that on its own is just about 200. So yes, image size definitely matters.

Okay, so we know that maybe in some layer there is the SSH key, and we want to get it. So what do we do?
First, we have the image, we've pulled it, and we need to export it as an archive. We do that with the docker save command. What's happening right now is that Docker is getting all the layers, making a tarball from them, and putting it into the tar file. It takes some time because it's nine hundred megabytes, right? Make your images small.

Now that we have it, I'll show you the actual layers. I'll open Midnight Commander, and here is our image. As you can see, there are some weird directories here, and these are the actual layers. There are several of them, and then there is also some metadata. So as you can see, a Docker image is really just a list of root file systems and some metadata. It's not rocket science; it's actually really simple.

One of these directories has our keys, so let's look. Each layer is again a root file system and some metadata, so we can open it. You can see this one is probably the base image: that's the whole file system structure. That's not what we are looking for. So what's inside here? Well, that looks way better. Wow. But these are not our files. Actually, these are whiteouts. You saw that I was doing rm on the keys, and what really happens is that Docker creates these files, these whiteouts, which just tell the engine that these files shouldn't be available in the final container, so hide them. So our files are still there, but they are hidden.

Okay, this looks like a dangling symlink. How about this one? This is probably one of the packages which were installed. Let's go even deeper. Still not it, probably. How about this one? Yep. Here are our keys. So it wasn't that hard, right? It took about 40 seconds.
So let's copy them and... oops, yeah. Okay, so let's just reset our terminal. So, it works, actually.

And so let's try to SSH to the system. As you could see in the Dockerfile... okay, I'll do it the better way. The way we get the hostname is with docker history, which shows us all the traces of how the image was built. We also pass the option so it doesn't truncate any output, so we have everything. You can see all the commands which were used. You can also see that this is probably a Fedora image, since Adam is listed here as the maintainer. And that's actually what my random image was: I just took the oldest Fedora image I was able to find and made it the random image.

But we need to find the hostname of the system, so let's grep. We know it was EC2-something. Yeah, it's in here. So we can do ssh -i with our key, as root, I guess. Let's try it. Yeah, we are in. Two minutes, and you can hack any system which has its SSH keys inside an image that's available online. And I can't forget to tear down the VM, so no one else can hack it.

So that was my demo. It could have been way better; I'm sorry about that. Let's get back to the presentation.

Now we are actually getting to the best practices, so let's see what was so wrong with these weird examples. In the beginning we used some obscure image, found a script, and curled it during every build and piped it into a shell. So what's wrong with this approach?
The first thing is that we built our infrastructure on top of something which we don't know and probably don't even trust. So when you start figuring out what the base image for your services is going to be, first pick something you know. If you are using, I don't know, Ubuntu, Debian, Fedora, pick that. You know it, and you probably trust it since you are already using it. So pick that one, please.

Next, installing software. Doing the curl means you do it on every build. What happens when someone changes the script and puts some vulnerability inside, for example something which copies your keys and uploads them to their server? So make it reproducible: download the script, actually see what's inside, change it if you need to, whatever, but just don't do this. And the funny thing is that this is actually the installation method for the Docker engine: curl the script, pipe it into a shell, and install it. It's like, wow.
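
As a rough illustration of the alternative to curl piped into bash, a Dockerfile along these lines builds on a known distribution image and installs from its repositories. This is only a sketch, not the Dockerfile from the slides: the fedora:25 tag and the nginx package name are assumptions chosen to match the talk, and install-nginx.sh is a hypothetical file name.

```dockerfile
# Sketch only: base image tag and package name are assumptions.
FROM fedora:25

# Preferred: install from the distribution's own repositories.
RUN dnf install -y nginx && dnf clean all

# If a third-party installer script really is unavoidable, review it once,
# commit that reviewed copy next to the Dockerfile, and copy it in from the
# build context instead of fetching it with curl on every build.
# install-nginx.sh is a hypothetical file name.
# COPY install-nginx.sh /tmp/install-nginx.sh
# RUN bash /tmp/install-nginx.sh
```

The point of the commented-out variant is that the script's content is fixed in version control, so a change upstream cannot silently alter the next build.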
It's like wow So yeah, this is how you how you should do it like base it on something you know And then when you install software install it from like the trusted repository so for example for me that's like official repositories for my distribution and Also, if you are planning to use it in production like you don't have to install like all the documentation that kind of stuff Like for development, it's great But for production like you don't probably need it You still have the man pages on your computer and it's like probably the same thing runs inside the container Yeah, and regarding configuration So as I said like there are different methods to do configuration So if you are in clustered words, it probably makes sense to over overlay the images with your configuration Which is stored in your git repository somewhere like because it's reproducible you can easily change it and And you don't have any dependencies on the nose like you probably don't want to host files Inside the note. It's like not probably your pet. So yeah overlaying is fine But and also I mean on the other hand You can use the way approach with environment variables like that's most of the people are using it has very nice Experience you just set some variables and it works So that's also fine Yeah metadata so obviously like put the correct metadata in there It's also very nice to put like information like when your Docker files are stored in git repository It's great to like put the information about like which git repository what was the commit? 
Then you can easily trace back to the time when the image was built and see the whole build environment which was used for the build itself.

What else? In the other part of the example: chain the instructions. You could see that every instruction created a new layer, which means that for DNF, it downloads metadata and that stays in the layer. If you chain the commands, you get all the work done within one layer, and no files linger in the final image. The same goes for the bad security example: you can download the thing, unpack it, install it, and then remove the archive, all in one layer, because you don't need the archive in the final image.

So this is the preferred way: put in the correct metadata, don't install useless software which just doesn't make sense, and if you need to do several actions, do them within one layer.

And finally, I wasn't talking much about what's wrong with this. What Docker does when you have this in your Dockerfile is that it creates a shell, and this shell invokes the script. It's not an exec. That means that in your container you have a parent process, which is probably something from the container engine, then you have the shell, and then you have your service. So all the signals are passed to the shell and not to your service, which is not what you want, right? You have your service; you didn't mean to run a shell inside.
The way to solve this is to run the service with exec, which means the exec will eat the shell and only the service will be running inside the container. So please don't forget about it; it's pretty important.

The other thing is reaping zombie processes, which Lukáš was talking about in his workshop on systemd. The PID 1 process should reap the child processes. This still hasn't been solved well. I think that with Docker 1.13 there is a new argument which... okay, I need to drink and think about it. So in 1.13 you can say: use this binary as the parent process for all my containers. This parent binary is a very tiny C program which is able to reap the child processes. I think this is available with Docker 1.13, but until now this wasn't solved. There are multiple tiny init system projects available, and I'm not sure if anyone was using them, or if anyone had issues with zombies in containers.

So definitely, the way to solve this is to invoke the service directly, in the exec form, so you are sure there is no shell running inside your container.

Also, for security: you don't need to run your container as root. Why would you do that? If there's some security vulnerability in the kernel and the user is able to escape from the container to the host, and it's already root inside the container, it's root on the host system too. So definitely change the user to some unprivileged user.

And that's actually it. So, 30 minutes; that was pretty fast. The slides are available on GitHub, and the demo I built is on GitHub too, so if you want to have some fun, you can check it out.
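
Pulling the points above together, a Dockerfile following this advice might look roughly like the sketch below. It is not the Dockerfile from the slides. The label keys, the repository URL, the commit placeholder, and the nginx.conf file are all hypothetical, and the config is assumed to listen on an unprivileged port and to write its pid file and logs somewhere the nginx user can write.

```dockerfile
# Sketch only: every name below is an assumption, not taken from the talk.
FROM fedora:25

# Metadata baked into the image: what it is and where it was built from.
# The label keys, URL, and commit value are hypothetical placeholders.
LABEL name="example/nginx" \
      vcs-url="https://example.com/your/repo.git" \
      vcs-ref="replace-with-commit-hash"

# One chained RUN: install without documentation and clean the package
# metadata in the same layer, so nothing lingers in the final image.
RUN dnf install -y --setopt=tsflags=nodocs nginx \
    && dnf clean all

# Overlay a configuration kept in the same git repository.
# This hypothetical nginx.conf is assumed to listen on port 8080.
COPY nginx.conf /etc/nginx/nginx.conf

# Run as an unprivileged user; assumes the package created an nginx user.
USER nginx
EXPOSE 8080

# Exec form: nginx is started directly, with no shell in between.
CMD ["nginx", "-g", "daemon off;"]
```

With the exec form of CMD, nginx is the container's main process rather than a child of /bin/sh -c, so signals sent when the container is stopped reach it directly.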
If you're interested in more best practices, within Project Atomic we have a guide called Container Best Practices, and it contains way more information than I just provided. It's really a database of best practices for containerization, so definitely check it out. And if you like, look me up; that's my Twitter handle, so just say hi. I'll take questions and hand out some stuff from Red Hat for you. Yes?

So the question was whether it makes sense to specify labels on one line, within one instruction, or in multiple. The answer is yes, it definitely makes sense. That's a good point: you can chain instructions like LABEL or ENV too. But that's not the only answer, because this changed within the Docker engine. In something like 1.10 or 1.9, every instruction caused a new layer, even one only changing metadata, which obviously doesn't make any sense. Then they changed the data model of images, so with the latest versions of the Docker engine, a Docker image is composed of a list of root file systems plus metadata, and changing metadata no longer causes new layers. So it definitely makes sense to do it within one instruction, but it's no longer needed. So yeah, you can pick any prize you want.

Any other questions? Yes?

So the question is that if you have a lot of shell script within one layer, it's not easily readable, right, and whether there's some tool to help with it. What you can do is either put it all on one line, and it will look ugly, or you can put it in an external file, have it nicely formatted there, and just add the file into the container. That's also fine. I'm not aware of any tooling for doing this. So again, pick a pen or...

Yes?

So the question is which component is responsible for reaping zombies in the new version of Docker, right?
Okay, so it's actually a pretty funny story. When they started approaching this problem and tried to solve it, they first wrote it on their own. They created a new project for it, just a little C binary, and all it did was execute a process and then reap children, a very simple thing. Then at some point they completely scrapped it and used a community project instead. It works like this: you have the binary on the system, and you tell Docker, okay, use this binary to invoke all my containers. So it's an external project. If you ping me afterwards, I can point you to the real code.

Any more questions? Yes, please.

I'm sorry, but I'm not sure I follow the question. Okay, thank you. So the question is what the difference is between adding the files within one layer and with multiple layers. The number of layers makes container startup slower, because if you have thousands of layers, the kernel has to merge those thousands of directories into one, and that just makes it slower. So it's definitely good practice to have as few layers as possible. That's probably the main answer.

More questions? Yes?

So the question is, if we bake some files or keys inside, whether we are advised to do it in the last layer, or what the best way to do it is. I tried to dodge this topic in the presentation, because the solution for this problem is being worked on right now.
There was a talk about this, I think it was yesterday, held by Pavel Odvodi and Christian, and I don't remember his last name. They are working on a project for secrets management within containers. They have a separate service which holds all the secrets, like passwords and keys, and they are working on a solution for how to integrate it with container infrastructure. That's the reason I didn't talk about it at all: there was already a talk about this project. So definitely check it out; the project is called Custodia. Then you can just integrate your application with this service, so you don't need to care about putting secret files inside your containers, because your application will be integrated with the service.

Any more? Yes, please.

Thank you for that question; that's a very good question. So the question is why I didn't update the image with the latest packages, right? For a long time, we in Red Hat were not sure what the best way to approach this problem was, having the latest packages inside the image. For quite some time we were updating the image with the latest packages. But at some point we realized it's not very good, because it can actually break your image. You pull an image, you use it, and it works fine, and then you update it, and it's possible that the new update breaks your image or application. So we realized that we need to treat images as artifacts, in a way that we don't update them; the update should be done within the build system for the image. If you need new packages, you pull the new version of the image, because it has been tested as one artifact and the upstream or the vendor is sure that it works. So it delivers it to its customers or users, and everyone is sure that, okay, when I pull this image, it will
work, but if you update the packages, no one has tested that, and it can break. So this is our strategy right now: we don't suggest updating packages within the build process. There's also the possibility that there's a bug and the easiest fix is to update the packages, or just one package. That's a workaround, but we don't suggest updating packages during the build process.

We have ten minutes left, so any more questions? Yes, Langdon.

Okay, so I'll try to repeat that. You can still create a new base image where you do the update, and base your images on top of this new, updated image. What we don't suggest is updating during the build process of some layered image. So you can still create a new base image with updated packages. That was a good point, thank you. Yes?

So the question is which version of Docker I am suggesting to use in production. Well, that's a good question, and honestly, I'm an engineer, not an administrator, so I don't use Docker in production on any server. With this information, would you trust me if I told you to use this version, this one's the best? What I can tell you is that Docker has a pretty quick process of releasing new versions. What happens quite often is that one thing works in this version, then it doesn't work in the next version, then it works again, and then again it doesn't work. What Docker upstream suggests is to use the latest version all the time. So I guess you can either go with that or stick to the version which works for you. I remember back in the day, when I was working on another project and we were building a service on top of Docker, we realized that, okay, with 1.8 it works, and we upgraded to 1.9 and it didn't work. So what are we going to do now?
And then we upgraded to 1.10 and it worked, and something else broke. So with the latest version, hopefully it works, but... yeah.

To elaborate on that: in Fedora and RHEL we have a version of Docker and we try to put in a lot of fixes which are not even in the upstream version yet. Let's say someone fixes some bug in the development branch of Docker; we cherry-pick it into the latest stable version. So I actually think our version of Docker in RHEL and Fedora is even more stable than what Docker has, because we fix the bugs which are in the latest, or whatever, version and backport the fixes into the last stable version which we have in Fedora or RHEL. Yes?

So the question is about the latest tag. That's also a good question. In our best practices, I hope we have it there that we suggest using a custom tag and not latest. The latest tag is really convenient: I want to pull this image, I don't know which version, so I just take latest and hope it works. But in your images, we definitely suggest pinning to a specific version. Then you are sure about the API within the image and you know that it works. And when you see that there are, say, two new versions, you can try the new version, test your software, your whole application, against it, and if it works you can just upgrade, and if not, fix it.

Yes, Christoph?

So Christoph's comment was to use a specific SHA reference to the layer, so you are precisely sure which image you are running, because even tags can be overwritten and break things. That's a good point. That's what we were doing back in that project, pinning to a specific layer, to an ID.

Any more questions? Yes, please.

So the question is about CVEs, known CVEs in images, and how to approach them, right?
I'm sure that in the atomic command, which we have in Fedora and RHEL and use to manage containers, there is a command to inspect an image and see if there are any vulnerabilities inside. So one approach is to scan the image, and it will show you all the errata and CVEs which are inside. Then you can either update it yourself or get the new version of the image. That's one point. And again, this is what I was talking about when I said to pick a base image which you really trust. If you trust Fedora, and Fedora is that good that whenever there is a CVE there are new updates within a day, pick something like that. And not something where, when there are CVEs, they just don't care, or they release new images months later. So you can either scan or pick a trusted vendor.

Christoph's comment on that was that in the Red Hat container registry, sorry, the Red Hat Container Catalog, we mark images as deprecated if they contain some CVEs. And if you find such a thing, you can still rebuild it and get the latest packages, or the latest version of the package where the security vulnerability is.

So we have time for just one last question. Yes?

So the question is what the best way to build images is, whether Dockerfiles are good. Okay, I think you deserve a grand prize for that question. So the question was whether it's a best practice or a worst practice to use Dockerfiles at all. Dockerfiles are used the most; I mean, everyone is using them, everywhere. I hope that over time there will be other options, for example ansible-container, as you mentioned. We'll see; maybe in a year there'll be something better. But right now we have to use what we have.

And that was the last question.
So thank you for attending my presentation.

Yeah, thank you very much, and please don't forget to post your feedback and rating for this session. Thank you.