Let's start. Hopefully it's recording; we'll see. So welcome, and I'm glad there are some people here, quite a lot actually. Thank you for coming. My name is Honza Horák and I work at Red Hat, where I'm involved in things ranging from containers through Python, Ruby, and databases. This talk will be about our experiences with container development, because over the last year and a half or so we containerized a nice set of software collections. Generally I won't talk about Software Collections much in this presentation, but you will see them in the examples. Just to set the scene: those packages started out as software collections, and we needed to produce containers that worked well for services, for platform-as-a-service setups, especially Kubernetes and OpenShift, but at the same time these containers should be usable on bare metal, on a classic machine, or even on the Atomic platform. So this is about sharing the mistakes we made so that you don't repeat them. I would also like to propose some things that could perhaps make it into the official guidelines, and maybe some things that I would like to become a standard way of building containers in Fedora. We'll see.

First we will take a look at a PostgreSQL container, then at a somewhat different kind of container, a Python container, then at some system containers, and at tools containers; you'll see what I mean by that. You may well have other opinions, and I would like to hear them, so if you have any questions during the talk, just ask. When I talk about containers I usually mean Docker in practice, because that's the only experience I have, but I think most of this applies to other implementations as well.
Also, some of the examples in the slides are not fully working, just for the sake of simplicity, but there are links at the end and throughout the slides where you can find working versions: there are a couple of repositories on Docker Hub under the sclorg namespace, where you can find the sources for the containers we have produced so far, and those work fine.

First, some very basic things about containers. Who here has never run the docker command? Is there anybody like that? No? So yeah, we can be quite quick in this section. The most important thing about a container is what's inside, and that's connected to the design of Linux containers. Compared to a virtual machine — I see Dan here; we were talking at DevConf a year and a half ago, standing there repeating that containers are not virtual machines, I still remember that — it is a kind of virtualization, but there is a big difference from the security point of view. If there is a security issue in the application, or in the kernel, or both, then in the virtual machine case the attacker is usually stopped somewhere around the hypervisor, while in the container world it is possible — not easy, but possible — to get through to the host kernel, which is shared with the host machine and all the other containers. So the attacker can not only influence the other containers but also the host itself. A container is not a virtual machine from that point of view.

So let's see what it means to build a container, a very simple container. We first need to have Docker installed and the daemon running, then we can pull some base image. Now some theory about containers: the base image is usually produced by some distro and is a very small variant of that distro, so we find the basic libraries there, glibc and so on, and on top of this layer we build new layers.
So for example a layer with Apache and PHP, then another thin layer on top of it with, say, a general-purpose WordPress application. What's important here, and quite nice, is that if we have glibc, for example, in every image, it is stored only once, because it is shared by all those layers. The layers share those pieces: they are stored only once on disk, and each layer only stores what has changed. Then the user comes, pulls for example the WordPress image from the distro, and creates their own flavor, because the distro won't be able to tailor the application for any specific purpose — configuration, templates, or some initial data. That's something we need to pay attention to. So in the end we have many, many containers produced by the distro, and there will be other containers created by users to fulfill their needs, and it's quite important to realize from the beginning that users will also build their own thin layers.

Back to the example of a very, very simple container: we can run the base image, run some commands to create some content, in this case just one file, and commit the container. This is actually not the correct way to do things, because such a build is not very repeatable or reproducible, so we can do better by writing a Dockerfile. A Dockerfile is something like an RPM spec file for RPMs: it's just the recipe for creating a container. Doing the same as before using a Dockerfile is as simple as this, and docker build actually creates the container image. So let's see something more usable and create a PostgreSQL image. We start with the installation of the necessary files — in the case of PostgreSQL that's of course the postgresql-server package — so we use a dnf call and we can build the container.
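A minimal Dockerfile along those lines might look like this — a sketch; the exact base image tag differs from the real sclorg sources:

```dockerfile
# Minimal, naive sketch of a PostgreSQL Dockerfile
FROM fedora:24

# install the server package -- note that each RUN creates a new layer
RUN dnf install -y postgresql-server
```

Building it is then just `docker build -t my-postgresql .` in the directory containing the Dockerfile.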
We will see a lot like this. What is important, or interesting, here is that every command in the Dockerfile creates its own layer — the intermediate layers we see here and there. So it's good to keep the number of layers as low as possible, so that we end up with small containers: if we imagine it like git commits, where we store only the differences between layers, the image will be much bigger in the end if we create a lot of layers rather than only the necessary ones. What we can do better is to chain all the related commands with &&, so they become a single RUN command — that's one thing. Another thing is that we can use some dnf options, like not installing documentation. I'm not sure how much space we save by that, but why not use it, right? And what is quite important is to clean the dnf cache, because if we do it in a separate step, the cache will still be stored in the resulting image. So this is the correct way to do it.

We can build it once, and build it again with a slightly different command — this is just naming the resulting image something other than the hash. By the way, hashes are the unique way to identify images locally. And we can see that the log looks quite different now, and we see this tricky thing: using the cache. If you are building images yourself, be careful about the cache, because it can happen that you want to pick up a vulnerability fix that is already in the RPM, you rebuild the image, and if you forget about the cache you can end up with a container that is not updated. So either use --no-cache, or use some tool for rebuilding images, like the OpenShift build service, or just OpenShift itself — there are a couple of possibilities. I think Tomas will be mentioning a few tools to build Docker containers tomorrow, and you heard the presentation about OSBS, the OpenShift build service, by Adam Miller today. Another thing we can do is squashing images.
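Those fixes together turn the naive Dockerfile into something like this sketch:

```dockerfile
# Improved sketch: a single RUN layer, documentation skipped, cache cleaned
# in the same layer so it does not end up in the image
FROM fedora:24

RUN dnf install -y --setopt=tsflags=nodocs postgresql-server && \
    dnf clean all
```

And to avoid the stale-cache trap when picking up a security fix, rebuild with `docker build --no-cache -t my-postgresql .` so no previously cached layer with the unfixed package is reused.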
Squashing is again very similar to git: we have a couple of commits and we want to produce the same output with just one commit, and that's the purpose here too. It's connected with those intermediate layers: instead of having several of them, we can produce just one layer coming directly from the base image — the base image here being Fedora 24. How do we do it? We install one package, obviously called docker-squash, and use one command where we specify the base image and the image we want to squash, to produce one thin layer. So that's another way to make containers smaller.

And what do you think — is that all? Well, we have PostgreSQL in the container, but I wouldn't call that a container, or a PostgreSQL container, because when we talk about containers we usually want some service running, a micro-service. So let's do something clever when the user starts the container. There is the concept of the default command: if the user runs the container with no arguments, we expect something meaningful to happen. In this case we will run this command, and it's actually the script that we added one line above. That's the way we do it. The script can look like this in the case of PostgreSQL — this is a simplified version, not everything we really include there now — but in principle we just initialize the database, create some configuration, do some other convenience work, and then exec the postgres daemon itself.

Why exec and not fork? Well, there are a few reasons. First, signals go directly to the process; we also don't have two processes in the container, just one. But it has some consequences: some processes, for example, don't reap zombie processes when they end up as PID 1. For that purpose it is possible to use systemd in the container, so it can look similar to this — but again, such a simple example just doesn't work yet, and I'm not sure why.
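The squashing step itself is just a couple of commands; a sketch, with illustrative image names:

```shell
# install the docker-squash tool (it is also available via pip)
pip install docker-squash

# squash every layer of my-postgresql above the fedora:24 base image
# into a single layer, tagging the result separately
docker-squash -f fedora:24 -t my-postgresql:squashed my-postgresql
```

The original multi-layer image stays untouched; only the new tag points at the squashed single-layer variant.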
It's probably just that systemd is not that container-friendly yet. I don't know much about this, I'm just mentioning that something like that is possible. Why would we want to run systemd inside the container? Exactly as I mentioned: zombies get reaped properly, and we can work better with journald. The negatives are that it's not that easy to use right now, and I think it needs to run as root. I saw some systemd guys here, so maybe they have some updates — what's the status of running systemd in a container, is it something you are working on, or is it not on your plan?

[Audience] Right now you can run it, if you install oci-systemd-hook, or again oci-register-machine. [Speaker] Okay. So the trick is that you need to install some other packages? [Audience] There are two packages, oci-systemd-hook and oci-register-machine. [Speaker] Great. So does that mean it all works in the container? Okay, thank you — that was what I was hoping for.

And I was mentioning root access. It's connected to the design where we share the kernel with the other containers and the whole system: even if there is some problem in the kernel, it's still better to run the process as non-root, because with permissions you don't need, you can only make more trouble. So another thing we should do is not run the container as the root user by default, if possible. We can do it like this — just specify the default user the container runs as. A user can always use a specific user ID to run the container, so it can be overridden, but the default should be non-root. And we can build the container and give it a name when running it, so we don't need to use the hash when working with the container. Then, when we have the container running, we need to know where it is running, right? There is an internal IP address of the container, which is how we can find it, and we can connect to the daemon. So yeah, we are almost there.
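Making the non-root default part of the image is one extra line; a sketch — UID 26 is the conventional postgres UID in Fedora packaging, and the `run-postgresql` start-script name is illustrative:

```dockerfile
FROM fedora:24

RUN dnf install -y --setopt=tsflags=nodocs postgresql-server && \
    dnf clean all

# default to the non-root postgres UID; users can still override it
# at run time with `docker run -u <uid>`
USER 26

# default command: the start script added to the image earlier
CMD ["run-postgresql"]
```

Running it with a name, `docker run --name my-postgres my-postgresql`, then lets you refer to the container without the hash.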
Why almost? What's the password? Anybody have an idea? ... Right, yeah. Actually, one option would be to use some default password, but I wouldn't do that — I know admins, they don't remember to change the password after deploying the service. So this is the way we do it right now: we use an environment variable, which is specified by the user when running the container. This is the start script we saw before; these are just the changes we made. First some more configuration in the PostgreSQL config, then during the container start we start the daemon listening only on the local socket, then we change the password of the postgres user, which is the admin user of the PostgreSQL database, then we stop the locally running daemon and continue normally, and in the end we exec postgres properly, listening on all sockets. So we can run PostgreSQL again, we can connect to the daemon, and now we already know the password because we specified it with the environment variable — so we can connect and it works.
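A simplified sketch of such a start script — the variable names follow the pattern the sclorg images use, but treat the details as illustrative:

```shell
#!/bin/bash
# Simplified sketch of a PostgreSQL container start script.
set -eu

PGDATA=${PGDATA:-/var/lib/pgsql/data}

if [ ! -f "$PGDATA/PG_VERSION" ]; then
    # first start: initialize the data directory
    initdb -D "$PGDATA"

    # start the daemon listening only on the local socket ...
    pg_ctl -D "$PGDATA" -o "-h ''" -w start

    # ... set the admin password from the environment variable ...
    psql --command \
        "ALTER USER postgres WITH PASSWORD '${POSTGRESQL_ADMIN_PASSWORD}';"

    # ... and stop it again before the real start
    pg_ctl -D "$PGDATA" -w stop
fi

# exec, not fork: signals go straight to postgres and it stays PID 1
exec postgres -D "$PGDATA" -h '*'
```

The exec at the end is the point made above: the daemon replaces the shell instead of running as its child.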
Well, another thing I saw recently — and it's quite a tough question — is how to pass the password to the container. If we do it as we saw before, the password is stored in plain text in the Docker daemon, so if we do, for example, docker inspect, we can find out the password, which might sometimes be useful, but sometimes we don't want it, at least not in plain text. So another way to pass the password to the daemon is to pass a file as a bind mount into the container, and obviously the container needs to support reading the password from that file. I'm not sure which way is better from the secrets point of view; supporting both ways and letting the user decide is maybe something we can do. And in the future I hope there will be some more sophisticated way, like a service running somewhere that the container would just ask for, for example, a token. I'm not sure — I'm not a secrets expert here, so I'll defer to these two. [Audience] The better way is to actually use secrets or something through an orchestration system. [Speaker] Okay.

Configuration. If we speak about databases, we often need to configure them, because nobody likes to run a database with the default configuration. For example, configuring the max_connections limit: this is the way we do it — we just grab the value from an environment variable again and store it in the configuration. That's how it looks in the container, and this is how it looks from the user's point of view: the user only needs to specify it if he wants to change it, otherwise it will be the default. This is nice because it can look the same on the command line, on Atomic Host, and in OpenShift, so the experience is the same. As for deciding which options to support this way and which not — well, we can just look at what the most common options are for a particular image, and for the rest we can let users handle it themselves: users should be able to create a thin layer, as we said at the beginning, with their own configuration —
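The environment-variable-to-configuration step is a tiny shell fragment in the start script; a sketch, with illustrative names and a default of 100:

```shell
#!/bin/sh
# Render a configuration option from an environment variable, falling back
# to a default when the user did not specify one (sketch; names illustrative).
POSTGRESQL_MAX_CONNECTIONS=${POSTGRESQL_MAX_CONNECTIONS:-100}

generate_config() {
    echo "max_connections = ${POSTGRESQL_MAX_CONNECTIONS}"
}

# in the real script this line would be appended to postgresql.conf
generate_config
```

From the user's side this is just `docker run -e POSTGRESQL_MAX_CONNECTIONS=42 my-postgresql`, and the same `-e` syntax carries over to Atomic Host and OpenShift.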
for example, some more complicated master-slave replication setup. You won't be able to do it for them, because they will have their own specific needs, so it should just be easy for them to do it themselves. This is the implementation we are thinking about right now: there is a place where a user can bind-mount or add some files in a thin layer, and the path to this place can usually be obtained through an environment variable, so it's not necessary to hard-code it. The file can be as simple as this one-liner, again working with environment variables; of course there can be hard-coded things in the configuration as well — that depends on the user. And using such a simple Dockerfile, the user is able to create a thin layer. That's the idea we had recently. The full working container with the PostgreSQL software collection is available at this address, so feel free to take a look.

Now let's take a look at the Python container, because it's kind of a different kind of content, right? It's not a service, at least not at first sight. Again we can start from a very basic Dockerfile: again FROM Fedora and installing the packages. Did anybody spot something missing here? Yes, exactly — cleaning the cache. And... well, we are already done, right? We have Python inside; we don't have anything like configuration for Python, that doesn't make much sense, and we don't want to just run the Python interpreter — or maybe we do, but what would a user do with such a container? He would probably create another thin layer every time, because he needs to put an application into the container. So what he would need to do — and we can document this — is write a Dockerfile like this; I see four lines there, with some installation script. Well, and... yeah, it can take how long, ten minutes maybe, to write a thin layer and build it? So in case he needs to build ten applications...
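Such a user-created thin layer can be a two-line Dockerfile; a sketch — the image name and the configuration directory are illustrative, the real path would be announced by the image through an environment variable:

```dockerfile
# User-side thin layer: extend the distro PostgreSQL image with custom config
FROM sclorg/postgresql-fedora

# the target directory is part of the image's documented interface;
# this particular path is an assumption for the sketch
ADD my-postgresql.conf /opt/app-root/etc/postgresql.d/
```

Building this takes seconds, and the resulting image behaves exactly like the distro one, just with the extra configuration merged in.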
two hours maybe. So let's help him a bit. The question of how to help users do something like that more efficiently is exactly the question the OpenShift guys were facing, and they produced the tool called source-to-image. The tool does pretty much that: it gets the source of the application into the image. It can be used like this, and this example does pretty much the same as the previous one: it takes the application from the path, gives it to the container based on the Fedora Python image, tags the resulting image, and we can run it right afterwards.

So what's happening inside — how do we produce a Python image that supports this? Because in order to make source-to-image work, we need to do something inside the image. What the source-to-image tool does is fetch the application, whatever it is: it may be a git repository, so it will clone it, or it may be a locally available application, so it will bind-mount it; it will put it into some place, /tmp/src in this case, then run the container with the application in that place, run the assemble script, set another script called run as the default command of the resulting image, and then the container is committed. I mentioned in the beginning that docker commit is not the way to go because it was not reproducible, but in this case we see that we can reproduce it as many times as we want, so here committing the container is fine.

How do the scripts look? First the assemble script: as I said, this script is run inside the container, inside the Python container, and you see that the application is copied from the known directory to the actual working directory, and then something Python-specific is done. That's one thing we should focus on: if users like to use requirements.txt for specifying the requirements of Python packages, let's just use it and install the dependencies during this step,
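Sketched out, the user side and the image side look roughly like this — builder image, application, and tag names are illustrative:

```shell
# user side: build an application image from local sources on top of a
# Python builder image, then run it
s2i build ./guestbook-app fedora/python-35 guestbook
docker run -p 8080:8080 guestbook
```

```shell
#!/bin/bash
# image side: sketch of the "assemble" script run inside the builder
# copy the application sources that s2i placed into /tmp/src
cp -R /tmp/src/. ./

# Python-specific heuristic: install dependencies if the user ships
# a requirements.txt
if [ -f requirements.txt ]; then
    pip install -r requirements.txt
fi
```

The real assemble script in the sclorg repositories does quite a bit more; this is just the principle.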
instead of the user having to specify some install script himself. So there is some heuristic done there — there is actually more happening in the assemble script on GitHub, but this is the principle: we can make some estimations, some guesses. The same goes for the run script: for example, we can look at whether the application is actually a Django application, and if it is, we can find manage.py and run it; otherwise, what is the standard way of running Python — just python and some script. And as I said, this script is run when the resulting container is started: as I previously showed with the guestbook resulting image, when a user runs docker run guestbook, this script runs. It's good to use the frameworks that users like, as we saw with the Django example. And this is the final example, with a git repository — previously there was a locally available application — and it's as easy as this. It's good to think about this for other frameworks and other languages too; you can usually provide something like that. And if you think about databases: we can think of the configuration as the source code of the application, because a database and a configuration together make a particular deployment, for example. So we can support source-to-image for the database containers as well — we would just put the configuration into the resulting image in a similar way. Again, more information is available at the link.

Let's take a look at another kind of container. I call them system containers — I'm not sure whether this name is still free; I mean, if anybody understands it very differently, just tell me, because I might need to change the wording. What I call a system container is a container which is run via systemd on your machine — your laptop or bare metal, it doesn't matter — just instead of the process: we simply replace a process with the container. So why do we need it, or why
do we want to do it? We can slowly move from the traditional system to a containerized system this way, one service after another; or we may just want to try some random daemon without upgrading our system — another reason to use this approach. And how do we do it? We again create the container, but we use docker create in this case, not docker run — the container is just created and stays stopped. Then we create a systemd unit file for the service, and instead of starting the process directly as usual, we just use docker start, and in the end we can work with the container as expected. So again, it's possible and it works quite straightforwardly, but it's quite a lot of stuff the user would need to write.

There is a better way to go; it's called the atomic command. The atomic command is not only for working with containers — it's also used to work with RPM-OSTree systems; it should be the tool number one on that kind of system — but it can help in this case too. What can we do with the atomic command-line tool? We can do all the first steps — creating the container, and creating and enabling the systemd unit — in one command only, so users won't need to write the unit files themselves. It will all be packaged in the container, and using the atomic labels the container can declare support for this kind of invocation, so during the call everything happens automatically. It would take too long to go deeper, but I just wanted to mention that something like this is possible and working quite fine. Do we support the atomic call in our application containers? Well, not yet, and that's because it's not very useful for application containers — that's why this matters in the system containers section. That was our resolution a year ago: we didn't see much point in having users run application containers with the atomic command if the docker command works the
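The manual variant can be sketched like this — container and unit names are illustrative:

```shell
# create, but do not start, the container
docker create --name my-postgresql my-postgresql-image
```

```ini
# /etc/systemd/system/my-postgresql.service -- minimal sketch
[Unit]
Description=PostgreSQL running in a container
Requires=docker.service
After=docker.service

[Service]
# -a attaches, so systemd tracks the container as the service process
ExecStart=/usr/bin/docker start -a my-postgresql
ExecStop=/usr/bin/docker stop my-postgresql

[Install]
WantedBy=multi-user.target
```

After `systemctl daemon-reload`, the container behaves like any other service (`systemctl start my-postgresql`). With the atomic tool, the image instead carries labels such as INSTALL and RUN, and `atomic install` generates and enables the equivalent unit for you.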
same. Well — tools containers. That's not "container tools": these are not tools to run or manage containers, these are containers with tools inside. It can be, for example, tools to manage daemons: obviously the container with the database daemon should only contain what is necessary to run the daemon, but we may want to work with the database more efficiently, so we need more tools, and we can package all the tools in a separate container. And when we split something into two containers, connecting those two containers is always a tough point. In this case it's quite easy, because we have the daemon listening on the network, so we can just connect to it and work with the utility as normal — we can also just run bash there and run the mysql client from bash. So this is quite straightforward.

But there is more interesting stuff you can package into containers, for example a build toolchain. If you want to build something in the environment of the container, you can package all the build tools into a container and then either bind-mount the host into the container so it's accessible, or, done a bit differently, bind-mount just the application directory, so the tools from the container can access it. We can also run some system tools from a container: for example, the performance tools can be packaged as a container. In this case it would be quite challenging to run the container appropriately, because these tools need some special permissions and system capabilities — again, I don't know all the details, but what I know is that such containers, for example with the performance tools, need to run as privileged, maybe even super-privileged, which means running quite a long command line. Or, again, the atomic tool can help here: the container with such a tool can have the atomic label set, and the atomic command itself already supports the
--spc option, so it can help here as well, automatically. This is how it looks as an atomic call; it does the same thing as the previous command — and you can see that that long command line was not even everything. ... I lost my mouse, sorry — did anybody see my mouse? Alright.

Another thing, quite weird at first sight: it's possible to run desktop applications in Docker. I'm not sure whether something as simple as this will work, but — well, I don't say you should run it, but you can try it. For desktop applications there is a better container technology, which you might have already heard about today; it's designed specifically for running desktop applications in containers, so rather than hacking something like this, I would suggest you look at that.

And some unsorted ideas at the end, because while building containers we realized that it's not only about writing the Dockerfile itself, it's also about the infrastructure. When I talked about the design — say there is some security issue in glibc. That never happens, right? Except last February, I guess. What happened was that we needed to include the security fix for glibc in the base image. It was quite easy to rebuild that image, but since all the other layers share glibc with the base image, we needed to rebuild all the upper layers — and the users' containers too: they should also rebuild their containers after there is an update in the base image. You can imagine that that is a lot of content, and it actually is the case. But still we should do it, because once we forget about something, it not only leaves that one container broken — it obviously means all the upper layers are broken as well, or at least still vulnerable. So we should automate it. That's a thing we learned quite fast, that it's not possible to do this manually, so we should automate the building; OSBS can help with this,
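A sketch of what such a super-privileged invocation looks like — the image name is illustrative, and the exact flags implied by --spc may differ:

```shell
# with atomic: one short command, driven by labels baked into the image
atomic run --spc my-perf-tools

# roughly equivalent to a long docker invocation along these lines:
docker run -it --rm --privileged \
    --ipc=host --net=host --pid=host \
    -v /:/host \
    my-perf-tools
```

The point is that the image author encodes the long command line once, in a label, instead of every user retyping it.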
because it will support automatic rebuilding. But again, as I mentioned, it shouldn't stop at the level of the distro images: we should also have some utilities or some service to rebuild even the users' containers — it might be Copr, it might be something else, I don't know. And when we do something automatically, we should also test the result. That's also one thing that we learned, and I'm very happy about having some automatic testing there, because in the Docker world it's quite easy to test the resulting image when we rebuild something — Docker itself is quite a fine tool for running the resulting image. It's not like in the RPM world, where it's not easy to install the RPM on the same machine as we build it, for example because we need root for it; with Docker it's quite fine to run it.

This is an example of the repository as I would like to see it one day in Fedora, in dist-git for example: besides the Dockerfile and some supporting files, it would always have some tests together with the image source. Because if we stored them somewhere else, it would be quite tough to keep them in sync: if we change something, we have to change the test. And running the tests would be as simple as this: right after the build we could run the test script. The test script is not hard to write; this is a very short, very simple example of how to write such a test, for example for the MySQL daemon. In principle we just run the resulting container, find out the network address, and then, in a loop — because the daemon's start-up procedure takes a few seconds — we wait until something reasonable happens, and if not, we just print the logs, because it's quite common that you want at least some idea of what is happening when it doesn't work. What else can we check in the script — or maybe generally somewhere else, since this doesn't necessarily need to be
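The waiting loop at the heart of such a test script can be factored into a tiny helper; a sketch — the function name is mine, and the commented usage line shows how it would be applied to a daemon container:

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the attempts run out, the way
# a container test waits for a daemon to finish its start-up procedure.
wait_for() {
    attempts=$1; shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# in a real container test, something like:
#   wait_for 30 mysql -h "$CONTAINER_IP" -u root -e 'SELECT 1' \
#       || docker logs "$CONTAINER_ID"
```

On failure the test prints the container logs, which is exactly the "at least some idea of what is happening" mentioned above.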
in the container-specific tests: we can check whether all the packages are signed; we can check, for example, labels, if we expect them to be present; we can check whether the image uses the latest base image; we can check the documentation and errors reported during the build; in the case of software collections we check whether the software collection is enabled by default; and we can also check the API.

Now, what do I mean by the API? Again, I'm not sure whether everybody understands "container API" the same way I do. What I understand as the container API is everything that other users or consumers of the image can depend on. We identified that it can obviously be things they directly use for creating their own containers: for example the path to the script that extends the original container, the paths to volumes, because they use them in their scripts or configuration; it can be the ports, or at least the default port the service is running on; or the default command, or actually all the commands available in the image. And we should not only keep those unchanged, but think about the API, and if some changes are needed, at least think about what the consequences are for users.

We should also try to make all the images consistent, because, for example, the MySQL database and the MariaDB database are usually run the same way, at least in the RPM world, so it makes sense to use the same approaches here as well. Things we identified and try to keep the same in all the containers we produce: for example, using /usr instead of /usr/local for paths — we try to use the paths that are expected from the RPM world; the same goes for the default port of the service, and the same for the default user. For example, for the Python container there is obviously no default user, there is nothing like a python user in the system, so we just started to use UID 1001, for some reason I don't recall — but it doesn't really matter which value is actually chosen; what's good is
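In Dockerfile terms, these shared conventions come down to a few lines kept consistent across all the images; a sketch, with illustrative values:

```dockerfile
FROM fedora:24

# the default port users may depend on -- part of the image "API"
EXPOSE 5432

# a stable, documented UID, even where no system user exists for it
USER 1001

# paths and the extension point announced via environment variables,
# so thin layers don't need to hard-code them (names are assumptions)
ENV APP_DATA=/opt/app-root \
    CONTAINER_SCRIPTS_PATH=/usr/share/container-scripts/postgresql
```

A container test can then assert on exactly these values: the exposed port, the default user, and the presence of the announced paths.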
to use the same value consistently in all the services. For that purpose the Container Best Practices guide was created; the source of the guide is on GitHub, so it's very open and waiting for pull requests, and you can at least take a look at what we currently consider best practice when creating containers.

So that's all I wanted to share. Just a quick check: does anybody remember at least one of the tips? [Audience] Don't run as root. [Speaker] Yeah, don't run as root — thanks. A few that I like, and that I would like you to remember if you remember only some of them: content is important; don't use root; allow users to extend the images, because they will do it anyway; support the atomic command-line tool; and automate.

[Audience] There's one issue with not running processes as root inside containers — volumes. Can you expand on that? Only root inside the container actually has full permissions on volumes that you bind-mount into the container, and if you don't start the container as root, you can't change those permissions, right? [Speaker] Yeah. [Audience] So if the process you want to run as non-root requires specific permissions on a directory within a bind-mounted volume, it doesn't work, right? You still have to change the UID of the files you're mounting into it — for example chown everything to UID 1001. [Speaker] Well, we have to do that on the outside, on the host — yeah, that's the way we use it. And there are user namespaces available now, so user namespaces could solve it. [Audience] What people usually do is have an entry point that dynamically changes the UID of the user running inside the container, based on something passed as an environment variable, and they use a special tool like gosu so that no extra process is created, as it would be with an ordinary su — except that we don't have gosu available as an RPM or so. [Speaker] Okay. [Audience] If you use the atomic
install command, you can let it change the UID outside, on the bind mount — you can chown it during the install. Handling volumes is never going to be perfectly clean. [Speaker] Yeah, but in many cases it's not necessary. [Audience] In many cases you can; there is a patch to change the UID, and it's been stuck in the Docker world for years. [Speaker] Okay, thanks for the comments. If you want to download the presentation, it's already available — I just forgot to show you the URL. Any other questions?

[Audience] Sorry, I still have a question: why would you want the tests in the image sources? [Speaker] I probably explained it wrong — not in the resulting images, but in the same directory as the image sources; they should be included in the build process, but you don't put them into the container. [Audience] So a better tip would be to figure out a way to share the tests separately, so you don't always have to move them together with the image sources. [Speaker] I think it's important that they live together, because, for example, for the Python case we run source-to-image from the host, so we need to be able to run things on the host; the thing is, we don't have just one mechanism for starting the tests. [Audience] How do you find the image? [Speaker] Usually I have an image name and a naming convention — it's always the same in our case, so it's quite easy. The thing is, different containers use different kinds of tests: for the source-to-image containers we use separate tests, for the other stuff we use others; we actually keep them separate precisely so that they don't end up in the image, so it's...