Welcome. I hope you had a great lunch. My name is Honza, this is Peter, and we will tell you something from our everyday life, something that you might find interesting, and maybe handy if you ever get into the position that we are in. The goal is basically to share some of our experience, and we will see.

So what is this container maintainer thing? Let me ask first: who of you is a package maintainer, an RPM package maintainer of some sort? Yeah, some people. So the world is changing a bit, right? You have probably heard about containers. Companies have started to use containers even in production, and distributions have started to work on containers too; we just had a Container SIG meeting where we discussed a lot of issues and ideas. So even in distributions, those folks who maintain some package in Fedora or another distribution will probably need to work with containers at some point as well. And when we speak about container development, it usually consists of two things: first you need to prepare the container, and then you need to take care of the container for some time. This talk is mostly about the second part, and we will see what the challenges are.

So first, more about what the container maintainer role really is. Let's start from what we already know, the RPM packager role. The packager role is mostly about taking the source from upstream and productizing it: writing the spec file, fixing bugs, and basically taking care of the package, making sure that it works with the rest of the system and behaves as expected. Surprisingly, the container maintainer role is quite similar, but there are some differences. We also take care of bugs that are reported to us, and we also productize software stacks or software packages and try to make them as nice as possible. Instead of an RPM spec, we have something we call a Dockerfile. I didn't see Dan here, because I would like to ask him what we should call a Dockerfile if not a Dockerfile, because we are not supposed to call Docker images Docker images, they are container images, so that's something we still need to solve.

One more important thing is that there is usually no upstream source for the container. And when I speak about container sources, I don't mean only the Dockerfile but also some scripts, because making, for example, a PostgreSQL database work reasonably, and I mean reasonably in an environment like Kubernetes or OpenShift, requires some scripting. We have a couple of scripts for each image that we maintain. These scripts together with the Dockerfile are what I refer to when I talk about the sources, and there is usually no upstream for such sources. You can argue that upstreams often maintain some Dockerfile as well, but they usually have a different use case.

So what our team does: we maintain these beautiful images, mainly for OpenShift, which is our main platform. We maintain the scripts and sources for several software stacks, Rails, Node.js, and so on, and for each of them there are multiple versions, which we also call streams; these are usually major versions, for example PostgreSQL 10, 11, and so on. All our sources are available on GitHub, so you can take a look. And now more about the challenges that we meet.
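To make the idea of container "sources" concrete, here is a minimal sketch of how the sources for one stream might be laid out; the file and directory names are illustrative, not the exact layout of the repositories mentioned in the talk:

    postgresql-container/10/              # one stream: the PostgreSQL 10 major version
        Dockerfile                        # the build recipe, the rough counterpart of an RPM spec file
        root/usr/bin/run-postgresql       # entrypoint script: prepares the data directory and config, then starts the server
        root/usr/share/container-scripts/postgresql/common.sh   # shared helper functions used by the scripts
        test/run                          # tests that exercise the built image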
So when we speak about container images that are part of a distribution, which basically solve the problem of shipping some software to users, these are a bit different from the container images that somebody would produce, for example, in some company. If you produce a container image for your specific needs, you can write all the scripts exactly as you need them and cover just the specific use cases of your specific application. But in the case of distribution images, we don't have one specific use case for the image, because the image can then be used in many, many use cases. So the role of the developer of such an image is to make sure that the image is flexible, which means it has to be configurable. That is surprisingly not that easy in a Kubernetes environment, for example. We should also ensure that the image can be changed, ideally without forking the whole source code, so ideally the user would build a thin layer on top of it that just adds some new files. We can also think about the source-to-image strategy in OpenShift, which is very popular, so we can support extending the image with that strategy as well. As I said, this talk is not about how to write such an image, so if you want to know more, there is a nice write-up by Alishka about how to create a flexible container.

Another challenge we need to solve, and it's something I already mentioned, is that we don't have an upstream for these sources. The solution is obvious: if you don't have an upstream, create one, so that's what we did. We could have created one repository for each stream and each platform; we did not, because it would mean a lot of duplication. We also don't use branches, we have only the master branch for the sources, because branches would still require us, for example, to change one script in three places if we have three streams, and we want to limit this duplication. So we have one repo with one master branch for each stack. For example, for MariaDB we have one GitHub repository, and the sources for all the streams and platforms are there. We also have CI running for each pull request, and it tests all the combinations at one point. So you can see that, for example, CentOS is one item here, but it really means that it is testing all the streams on the CentOS platform.

And now the most tricky part: the duplication and how to get rid of it. We have basically two solutions that we use; some maintainers prefer one, some the other. The first one uses symlinks, and the idea is that all the scripts are in the repository just once, and we symlink them from the directories for the individual streams. For the platforms, we have separate copies of the Dockerfile. So you can see that there is still some duplication: the Dockerfile is duplicated, and there is usually also a README file that is platform and version specific, so we have duplication of that file as well. But all the other sources can be there only once. In cases where we need to do something different for some version, we just use a condition in the script, and the variables that enable such conditions are defined in the Dockerfile. So the first part on the slide is the environment specification in the Dockerfile, and the second part is an example of a script that says: oh, there is something special to be done on MariaDB newer than version 10.0.
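As a minimal sketch of that pattern (the variable name and the version test are illustrative, the exact scripts differ per image):

    # Dockerfile for one stream: the stream-specific value is baked in as an environment variable
    ENV MYSQL_VERSION=10.2

    # shared script, symlinked into every stream directory, branching on that variable
    # (true only when MYSQL_VERSION sorts after 10.0, i.e. a newer MariaDB stream)
    if [ "$(printf '%s\n' 10.0 "$MYSQL_VERSION" | sort -V | tail -n1)" != "10.0" ]; then
        echo "applying a setting that only exists in MariaDB newer than 10.0"
    fi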
This has, as I mentioned, some disadvantages; there is still some duplication. So from the engineering point of view, the clean solution is using a small tool written in Python that uses Jinja templates. With it we can have only one copy of everything, and all the things that differ between platforms or streams are generated by the tool. I will try my luck and show you a very quick demo of this. So we have this container repository. You can see that there are no platform-specific Dockerfiles; there is just one copy of the Dockerfile, which includes these Jinja templates, these Jinja statements or however we should call them. And if I run make generate, it generates all the combinations, and now we have all the sources without any duplication and without any conditions in the code.

So far it was all about the upstream. Now over to Peter, who will tell us how we do it downstream.

So now that we are moving downstream, the first thing we need to do is actually get the upstream sources into the downstream dist-git repositories. Honza has shown you the two ways we lay out the code upstream, and since the dist-git synchronization slightly differs between those two layouts, we first need to check what to do in each case. In the case of symlinks, it's quite easy. You have three Dockerfiles there; they are not generated in this case, they are written by the maintainer, and each of them is for a different platform. So you have a Dockerfile for CentOS, Dockerfile.rhel7 for RHEL 7, and Dockerfile.fedora for Fedora, and there are symlinks pointing to the root of the Git repository. So what you need to do is not just copy the files into dist-git; you actually have to remove the Dockerfiles that you don't need, rename, for example, Dockerfile.fedora to Dockerfile, and of course follow the symlinks, unless you want dangling symlinks, which, believe me, you don't. That's the symlink approach. If we use the generator approach, it's slightly more difficult, since you first need to generate the sources, which Honza has shown you in the demo. Once you have that, it's quite easy; this time it's really just a copy of the generated upstream content into dist-git. Everything should be in the expected places, unless you generate symlinks, which, please don't.

And once we have the sources in dist-git, all that is left is to run fedpkg container-build, and that's all of our maintenance work done, right? Well, no, because container images are static bundles. If you leave them hanging for too long without being rebuilt, they will end up with CVEs, and when you have CVEs, you have users complaining about having CVEs in the container images they run. So eventually you need to rebuild them. How often do you want to rebuild your images? Well, the answer is quite easy: as often as humanly possible. But that's not really that easy when you maintain multiple images. For example, in RHEL 7 we maintain 38 images, and in Fedora we have something like 15 images per Fedora release, so it really doesn't scale with higher image numbers. So let's talk about when we rebuild them. As far as I know, there are no policies regarding this in Fedora, or at least I haven't found any; if there are, please let me know. In RHEL, we have two cases when we rebuild images.
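To make those dist-git synchronization steps concrete, here is a minimal sketch for the symlink layout; the repository paths and stream names are illustrative:

    # copy one stream into dist-git, following symlinks (-L) so nothing dangles
    cp -rL mariadb-container/10.2/. ~/dist-git/mariadb/
    cd ~/dist-git/mariadb

    # keep only the Dockerfile for the platform being built
    mv -f Dockerfile.fedora Dockerfile    # the Fedora variant becomes the Dockerfile dist-git builds
    rm -f Dockerfile.rhel7                # drop the variants not needed here

    # with the generator layout, you would instead run `make generate` upstream first
    # and then plain-copy the generated stream directory

    # finally, kick off the container build
    fedpkg container-build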
One is CVE rebuilds, which we try to get out as soon as possible, since the CVEs are often very urgent and our users complain loudly when we have CVEs in our images. The other case is basically a more specific variant of the first: when the base image changes and fixes CVEs in the base image. The base image changes every six weeks, and we usually run the CVE rebuilds every three weeks, including the base image change, so in RHEL we rebuild the images basically every three weeks. In Fedora, all the images used to be rebuilt every two weeks, but that is no longer the case since we got Bodhi integration; now every image needs to go through a Bodhi update, testing, and the regular workflow, same as for RPMs. But we still need to get some automation in there.

So what automation do we have available? Well, there isn't really that much automation in Fedora right now, but the future is looking bright. One automation tool is Freshmaker, which is basically a tool that watches for RPM changes, and when there are RPM changes important enough for your container to be rebuilt, Freshmaker runs the build and gets the build submitted into a Bodhi update. Once we have Bodhi updates integrated with CI, and the Bodhi updates get pushed on success, the rebuilt image will be shipped to our users.

The other useful feature is chained auto-rebuilds in OSBS. OSBS, the OpenShift Build Service, is the system that actually runs the image builds in Fedora behind Koji. If you want to learn more about image builds in OSBS, please take a look at one of the presentations happening in a few hours; I'm not sure which room it is, but it's definitely happening today. What chained auto-rebuilds in OSBS mean is that whenever a base image that you use in your Dockerfile gets rebuilt, OSBS will trigger a rebuild of your image, and if there are any images depending on your image, then once your image gets rebuilt, all of those images get rebuilt too, and eventually all the layers of images get rebuilt. So you don't have to trigger all of the image builds yourself.

The last bit of automation we have available is through the userspace containerization bots. You might have seen some of the bots already, since there was a talk, I believe this morning, about Zdravomil. Here we will be talking about Betka, which is the bot that handles GitHub to dist-git synchronization. Basically, there will be a fedmsg event coming from, let's say, GitHub, and Betka will pick up that there is a new commit in the upstream repository. The bot will then check what the changes are and whether they are for an image that is monitored, and it will open a pull request downstream, and the maintainer can take a look at it, merge it, or cancel it if they are not interested in that specific update. Unfortunately, Betka is not yet open source, but last I heard it will be open source soon, so once it is, please check it out.
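A rough sketch of what a chained auto-rebuild looks like in practice; the image names below are only an example of a typical layering, not an exhaustive list:

    fedora:29                     <- base image rebuilt, for example because of a CVE fix
      -> s2i-core                 <- OSBS sees its FROM line points at the base image and rebuilds it
        -> s2i-base               <- rebuilt next, because it is built FROM s2i-core
          -> python3, nodejs, ... <- and so on, until every layered image has been rebuilt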
Now, the last bit of tooling is for the cases where we don't have any automation, basically when you need to trigger the first builds or when you just need to check the build status, and you want to do it for a lot of images. When we are doing these manual steps for all of our 38 images in RHEL 7, or the 15 images in Fedora, we do it through this tool, since it makes our work slightly easier. The main feature it has is that it provides the dist-git content synchronization I talked about at the start, so it doesn't matter whether you are using symlinks or the generator inside your upstream repositories; the tool will handle it, and the downstream repositories will be as clean as you want them. It also has some bindings for creating image builds and getting image information from Koji, and a future feature will be Bodhi updates if needed, but I hope that will be handled automatically by one of the automation tools I mentioned previously. And that's it from me; I guess I'm on time, so we have some space for questions.

And I have one question for you: how should we call the Dockerfiles, if we want to avoid the Docker thing? What? Container image files. Container spec files, maybe. So, any questions from you? Yeah.

Yeah, so that was a longer question; I should probably repeat it, but I will repeat a shorter version. Two ways were mentioned for how to extend the image, or maybe configure the image, PostgreSQL was the example. One was the source-to-image strategy from the slide, the other was configuring it using Kubernetes ConfigMaps, and the question was which one to use when, and what the differences are. The concept described in the blog post by Alishka mentions, I think, both ways, and both use the same approach. To put it very simply, we have some directory inside the image where we expect some files, either scripts or configuration files, and we do something with them: if they are scripts, we basically run them at some point, and if they are configuration files, we use them where they are needed. For configuration files it's probably more common to use ConfigMaps, and for scripts I would say you would more often prefer the other way. If you want to build a thin layer on top of our image and distribute it, for example to more machines, then it would probably be easier to build an image using the source-to-image strategy. And maybe the real question is whether you want to produce a new image based on ours or not, because both ways are possible, both for configuration and for extending the scripts, and it depends on your use case. If you just want to run the image, building a new one is maybe not necessary; if you want to run the image on a thousand machines, maybe even in different clusters, then you would probably prefer a thin layer and build a new image using the source-to-image strategy, I think.

Any more questions? If not, then thank you very much. Thanks.
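To illustrate the two options discussed in the answer above, here is a minimal sketch; the ConfigMap name, the mounted file, and the image name are all illustrative, and each image documents its own drop-in directories:

    # Option 1: keep the stock image and inject configuration from a ConfigMap,
    # mounted into the directory the image watches for extra configuration files
    oc create configmap my-pg-tuning --from-file=tuning.conf
    # (then mount the ConfigMap into the deployment at the image's config drop-in path)

    # Option 2: build a thin derived image with source-to-image and distribute that instead;
    # s2i builds a new image from a local directory of config/scripts on top of the builder image
    s2i build ./my-postgresql-cfg centos/postgresql-10-centos7 my-postgresql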