Hi, I'm Valentin. I'm an engineer in the container engines team at Red Hat. When proposing this talk, I thought that the community already knows Podman. At least we're blogging a lot about it, and it has also been officially in RHEL for some time. So I wanted to dedicate this talk to the things that we usually don't present, because usually we introduce the tools and talk about how new users can approach them, or how users coming from Docker or other container tools can migrate.

The scheduling of the talks is not ideal, in the sense that the follow-up talk will be how to replace Docker with Podman. I'm going to give the next two talks as well because, unfortunately, Dan Walsh didn't make it. He wanted to fly via India, but for one reason or another his visa didn't work out, so all the follow-up flights didn't work either. So I'm replacing him. I hope this doesn't scare you off. I'm not as entertaining as Dan, but I'm a little bit taller, with more hair, so I'm not sure if that counts for something.

I briefly mentioned this already: at Red Hat or Fedora, as much as everywhere else, containerization started with Docker. This is really a great contribution, and they've done an awesome job at making it approachable and usable by others. Over time the requirements grew; customers and the community wanted more use cases to be met. So I compare it a little bit to a Swiss Army knife, and the one displayed here actually exists. I think it costs over a thousand euros and weighs 1.1 or 2 kilograms. What I'm pointing at is: the more features you support with one tool, the bigger it gets. Supporting everything comes at a cost, and at Red Hat the team, especially Dan, reflected a lot on this. When you look at Docker now, the situation has improved a little bit, but it's being used everywhere, for all use cases: on the desktop, on the server, in Kubernetes.
Which is not necessarily true for all deployments anymore, because there's containerd, which is more targeted, so this argument may not count anymore. But it's basically meant to support everything. So Dan especially sat down and assembled the team to reflect on what the future can look like and what we need.

Basically, the philosophy of Red Hat is to have no one-size-fits-all solution, but a set of dedicated and specialized tools which are based on open standards, namely those of the Open Container Initiative, while still being backwards compatible with what predates the OCI, meaning Docker, for instance the Docker schema 2 image format. Everything should be developed in the open, right? Red Hat is an open source company, and all these projects are really open for contributions from outside. This is actually how I joined the team: I came from another company, and I liked the team so much that I said, this is where I want to work. We also want a certain degree of interoperability among these tools. If we really want to have something like this, but with a smaller set of tools, then users and we ourselves need to be able to compose them in a way that really fits all the needs.

So, CRI-O. I'm not going to talk much about CRI-O today because it's intentionally something boring. Why boring? Because the only use case is Kubernetes, nothing more, nothing less. So this is a really good example of a specialized tool dedicated to one use case. What is it? It's basically a container runtime for Kubernetes. If you're a Kubernetes user, you don't have to worry about it; you actually shouldn't care.
That's why we intentionally advertise it as being something really boring. This year in April it joined the CNCF, which was a really good thing for the project because it gains a lot of visibility. The CNCF is a great platform that supports the growth of the tools, and it's also a certain compliment, because it shows a certain maturity of the project and the commitment of the core maintainers to the overall community.

It supports all OCI-compatible container images, including the older Docker formats that I mentioned before. It can talk to basically any container registry that is out there. It also supports different container runtimes, and here you see a certain redundancy in the term runtime, because CRI-O is already a runtime. So what is a container runtime when it's already a runtime? This is why Dan especially advocates the term container engine, which is what CRI-O, Podman and Docker actually are. They basically take care of the images and instruct other tools to do all the heavy lifting. A container runtime is something like runc or crun; then, I think from Google, there's runsc, and a bunch of others, which take a bundle and execute the real container. So this tool actually executes the processes that we see on the host.

There are over a hundred contributors and more than 90 releases; I guess by now we are actually over 110 releases. There's something missing on the slide: the 15K per PR means tests. So we run a lot of tests. There's a huge amount of unit tests and mocking tests, and we run the Kubernetes upstream end-to-end tests to make sure we're not regressing on something. In fact, this week we found a regression in the Kubernetes upstream tests. So this is really cool.
This is also contributing something back to what we consume, so there's a really good thing going on there. And although it's sometimes advertised on Twitter as being only a Red Hat thing (not necessarily by Red Hat, but by competitors), there's really a collaboration across the industry. SUSE actually has a core maintainer, and there's Intel, IBM and Lyft. And nowadays, since IBM acquired Red Hat, there are a lot of good things and collaboration going on. I guess I need to hurry a little bit more. Right, I like talking about these tools.

But there are more tools, because there are way more use cases, and those are the tools that I'm going to present today. There's Skopeo, which is meant for distributing and managing images; it also allows for converting them between different formats. I'm going to talk about Skopeo at the end of the talk. Then there's Podman, which really is a container engine, and it's responsible for containers and pods. Pods come from Kubernetes; a pod is more or less a glorified group of containers sharing certain resources and namespaces, maybe the PID or network namespace. And then we have Buildah, which is responsible for building container images.

All of them share the same libraries and are developed upstream in the github.com/containers project. By sharing the libraries we achieve, in particular, the interoperability that I was referring to before. Also here, it's a collaboration across the industry.
Most maintainers are from Red Hat, but there are highly non-trivial contributions from outside, and more and more people show interest in helping to maintain the tools, also across Linux distributions. I guess the hardest thing is getting a package into Debian, because packaging Go tools there is inherently different from what Fedora or openSUSE are doing.

Who's familiar with Go, or who is not familiar with Go? Okay, that's good, only few don't know this pain. In Go there are different ways to compile a program, but conventionally all dependencies are put into a folder in the root of the project, which is called vendor. This means that there's a lot of redundancy, which makes sense, right? Podman vendors containers/image (the image library) and the storage library, Buildah vendors the same, and all of the code is statically compiled. From a traditional point of view, from a Linux distribution's point of view, this is a little bit terrifying because, ideally, we have dynamically linked binaries and just update the library that might have a vulnerability or a bug. This makes upgrades very easy from the distribution perspective, and it's something worth doing. In the Go ecosystem you can't enforce this, mostly because of how Go works. Go comes from Google, and they just deploy everything at once, so it's inherently different from what we are doing.

So, Podman. I was talking a little bit about it already. It's a container engine for managing containers and pods. The CLI, the command-line interface, is almost identical to the one of Docker. Why? Because it's a de facto standard for managing containers. Most of the people who are working with containers already know it, so there was no need to introduce yet another thing, which makes migrations way more complex.
So adhering to the CLI of Docker makes the transition of users and also scripts much easier. It's developed in the containers/libpod project and uses the image library for image management. The storage library is for local storage: basically, when you explode a container image on your hard drive, it's most likely being stored there. It supports different kinds of drivers, for example overlay, Btrfs, VFS and so on, also OSTree. And it uses Buildah for building images. But I've been talking enough. In the talk after this one I'm going to get into the more basic tasks, but now I really want to show you some of the features that I love but that we usually don't talk about much.

I want to start with the podman mount and unmount feature. What it does is mount the rootfs of a container on the host. So you do a podman mount, it spits out a path on your host, and this is a mount point where you can access the entire rootfs of the container. Some folks ask: why not just use a volume? A volume allows for sharing files, directories and entire paths among containers and also with the host. Because, in my opinion, volumes are annoying to use when you want a lot of sharing. Just sharing root usually doesn't work, because then you cannot mount anymore. A volume is not really useful for operating on the rootfs of the container because it's meant the other way around: you're mounting something into the container. So we come from a different angle here. Volumes are generally useful for sharing data, but not for altering data in a generic way.

So why not just use copy in this case? Because copying would just be a workaround: to edit, we first have to copy the data from the container to the host, edit it, and then copy it back. With podman mount, it can be as easy as displayed here, right?
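
The slide itself is not part of the transcript. The following is a minimal sketch of the workflow being described, assuming a rootless setup and an already running container; the function name and the container name `demo` are made up for illustration. It is wrapped in a function so that nothing runs until it is called.

```shell
# Sketch only: read a container's os-release from the host through its
# mounted root filesystem. The function name is made up for illustration.
show_os_release() (
    ctr="$1"
    # Rootless users lack the rights to mount, so the mount happens inside
    # Podman's user namespace; podman mount prints the mount point.
    podman unshare sh -c '
        mnt=$(podman mount "$1") || exit 1
        grep PRETTY_NAME "$mnt/etc/os-release"
        podman umount "$1" >/dev/null
    ' sh "$ctr"
)

# Usage (assumes the fedora:30 image can be pulled):
#   podman run -d --name demo fedora:30 sleep infinity
#   show_os_release demo
```

Because the file is read with the host's `grep`, the same function should work no matter which distribution is inside the container, which is the point made in the talk.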
We run an exemplary container, Fedora 30, and then we unshare. Why do we need to unshare? Because we are running rootless, and with overlay, the default, we don't have enough rights to mount. By unsharing, we create a new user namespace where we then have the rights to do it. Then we do a podman mount and get back the path, which is then in the mnt variable, and if I grep in os-release you can see that, well, it's a full Fedora container image. This is really cool for specific use cases, for instance if you have certain scripts that you don't want to adjust to the specific container images that you're running. If you want something generic and not specialized, because you might want to run a Fedora container, a Debian container, an Ubuntu container and an Arch container, and you don't want to find out about all these specifics, but you know exactly the host system it's running on, this is something really nice.

I forgot the etc there in the path, that's right. So I adjusted the code, yeah, and the etc is missing. We really mount the entire rootfs here in this example. I just screwed it up while copying it into the program which generates these nice images. But thanks for noticing; I will update it.

The next topic is managing container images. On my local system I once had really a lot of things going on, because I use it for development and for running specific tools. Things can get messy. Oftentimes I want to clean up, but I don't want to remove everything; I want to find out:
Okay, how did I come to this? There's a pretty cool feature which has been contributed by NTT, which is a company in Japan with a very good team that is contributing a lot to the containers ecosystem. It's called podman image tree. Podman image tree analyzes the layers of an image and spits them out in a tree format. If we look here, we first download WordPress, then PHP 7.2 Apache, and if we then run podman image tree on the WordPress image it spits out a few things. I'm trying to use this, I'm really bad... Okay, I don't know how to use the laser pointer, apparently.

Well, what you can see is the image ID and the tags, so all tags that are associated with the top layer of this image. One image can have multiple tags; those would be displayed here in a list. It spits out the size, so you know how much disk space it consumes. And then it lists the layers below, and there we can see that the WordPress image is based on the PHP Apache one. This is really useful to figure out which layers my image actually uses, because quite often, if you download something from the web, the dependencies and where it comes from are not trivial to find out. But it can also go the other way around: if we want to figure out which layers require my image, we can pass the whatrequires option, and then it goes in the other direction.

This is actually really helpful, or at least for me; I'm not sure if you use it for the same use cases. I was working on a feature for the containers image library some time ago. Some tests were not passing anymore, and I knew there was something wrong with the layers, and with podman image tree
I really understood what was wrong. So also during development it can be really helpful.

In a nutshell, podman image tree prints the layer hierarchy of an image in a tree format. It can show you which layers my image consists of, so basically explode everything; there we could find out that WordPress is using the PHP Apache one. And it can show which layers actually use the image I'm currently looking at. It also matches layers to tags. When you look here, you see the "Top Layer of" line at the end, but this only works when the image is pulled. If we don't know the tag, we may have downloaded the layer from the registry, but to know the top layer we also need to download the image; only then do we have the knowledge. In case you want to use it and ask, okay, I downloaded it, Valentin said this is going to be there: you really need both images. This is why we pull both here, too. It can help in understanding the dependencies among images, and it can also really help in image builds. Let's assume we have a complex Dockerfile and want to see what the image is really made of, or how the layer hierarchy looks, without inspecting the image and trying to extract it from there. And as I've mentioned before, it already helped me debug during development.

The next feature is one of my favorites. I love all of them, but if I had to choose, I may love this one a little bit more. Some people think I'm insane because of it, but I really see a beauty in it. So let's get straight to it. First we see the Dockerfile, right? We have FROM fedora, so we base our image on Fedora 30, or 31, and then I'm adding a label here. So we say LABEL, echo label: this is basically the key, and everything that follows is the value of the label: podman run image echo
Hello Flock. Then we build it with podman build, and when I execute podman container runlabel and specify a label, like a key to look for, and an image, Podman will parse the config of the image, look up the label that we specified, and then execute the value of the label on the host. As we can see here, in this case it's running the image, and it runs echo Hello Flock there.

The problem is that runlabel can execute any command on the host, and as you may think, this is not necessarily a good thing to do. But I don't suggest abandoning common sense. In any case, we shouldn't pull random images from the web or from some registry and just blindly trust them, and this is especially true for runlabel. There we really need to trust the image, and we really need to know what is going to happen.

Why do I like it so much? Because it somehow lifts what an image can do. The image specification of the Docker format, or also the OCI format, is a little bit limited, and this is intentional, because the OCI standard is meant to be the (sorry, I'm not a native speaker) smallest common denominator, right? It should be something that everybody can use and that is as unspecialized as possible, to not close doors at the beginning. When we create an image, we can specify the command inside, we can specify environment variables, we can add a lot of metadata, as we do for instance with the labels, and a bunch of security switches, for instance running privileged; I'm actually not sure about the AppArmor profile. But it's very limited in that sense. With runlabel, we can add a lot more information to the image.
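
The Dockerfile and commands from the slide are not in the transcript. A minimal sketch of the runlabel workflow described above might look like this; the image name `flock-demo` and the label key `hello` are made up, and `IMAGE` is the placeholder that Podman is assumed to replace with the image name when it executes the label.

```shell
# Sketch only. The image name "flock-demo" and the label key "hello" are
# made up; IMAGE is the placeholder Podman is assumed to substitute.
write_dockerfile() {
    cat > "$1/Dockerfile" <<'EOF'
FROM fedora:30
LABEL hello="podman run --rm IMAGE echo Hello Flock"
EOF
}

build_and_run_label() (
    dir=$(mktemp -d) || exit 1
    write_dockerfile "$dir"
    podman build -t flock-demo "$dir" || exit 1
    # Look before you trust: print the label without executing anything.
    podman image inspect --format '{{ index .Config.Labels "hello" }}' flock-demo
    # Execute the label's value on the host.
    podman container runlabel hello flock-demo
)
```

Since the label's value runs on the host, reading it first is the cautious order of operations. `podman container runlabel --display` should also print the resolved command without running it.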
So if we know that a certain image must be executed by the container engine in a specific way (the container engines, Podman or Docker, have a lot of switches that we may need to specify: mount points, maybe an AppArmor profile, an SELinux label, seccomp profiles, things like that), or if the image has certain requirements on the host where it's being executed, maybe we need to install a certain package on the host for whatever reason, then runlabel is a really great way of doing it, especially in automation. If you have a convention for using a specific label, the developers or creators of the image know how the new version must be executed. They just update the run label, and your servers basically just need to execute the run label and everything will work. But again, this is something that should be used with care.

Yeah. So you can inspect the image. If you want to have a look at it beforehand, you can do a podman inspect and then use the format filter, or just look at the entire JSON output, and then you can inspect it. So it's really something transparent. Nothing is hidden; the metadata is obvious at that point.

Yeah, definitely. In the end there's no magic behind it. In theory, you can come up with a convention and use it with container engines other than Podman, for instance. This is just a way to automate it and to have a semi-standard, at least for Podman.

All right, any more questions on runlabel? I'm happy that there was a question because I like it so much.

Yeah, that's again a copy thing. I copied it from a blog post I wrote for a German IT news website, and I wanted to relabel it to Flock to make it a little bit more appealing. But yeah, in this case it should have thrown an error.
Yeah, all right. So I saw a few people running around with "everybody loves systemd" T-shirts. Well, this is something maybe not everybody does, but we all have to live with it and we cannot hide from it. One big problem in the containers ecosystem for a long while was systemd support, and we need to look at it from various angles.

We can use systemd on the host and use, for instance, a unit file as displayed below to execute a container. With Podman this works really well, because the way Podman runs a container is inherently different from Docker. Podman follows the fork-exec model, so all containers are really children of the Podman process, which makes things quite a bit more appealing. It's easier for service management, as we can see here, because systemd then really has full control over it. Take cgroups, for instance: if we specify certain limits, those limits will be applied to the container. If we do it with Docker, the limits will only be applied to the Docker client; the Docker client sends a remote procedure call to the Docker daemon, and in this case it won't work. Also, it can send sd_notify messages, so if another unit file depends on the successful start of the container, this works as well. Whereas with the Docker client, we're not sure if the container, the service, is actually running at the point the client terminates or exits. So, for instance, here is an exemplary systemd unit file for a Redis container.
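
The unit file from the slide is not in the transcript. A hand-written sketch of such a unit might look like the following, assuming a container named `redis` has already been created; the file name, description and stop timeout are illustrative.

```shell
# Sketch only: write a unit for a container that already exists and is
# named "redis" (an assumption). File name and timeout are illustrative.
write_redis_unit() {
    cat > "$1/redis-container.service" <<'EOF'
[Unit]
Description=Redis in a Podman container

[Service]
Restart=always
ExecStart=/usr/bin/podman start -a redis
ExecStop=/usr/bin/podman stop -t 10 redis

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   write_redis_unit /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable --now redis-container.service
```

`podman start -a` stays attached, so the process systemd supervises lives as long as the container does, which is what lets `Restart=always` react when the container exits.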
It's pretty straightforward. We can implement a Restart=always one. When you look at Docker, for instance, Docker does all these things in the daemon, or has to do all these things in the daemon. That's just a consequence of the architectural choices the developers made, which is perfectly fine, but it has certain limitations when we want to use it in a systemd unit file.

To make things a little bit easier, Podman allows for generating those systemd unit files. Here we call podman generate systemd flock. flock is a container I started before, and it spits out a systemd unit file that we can use directly or as a template for further extension. If you look at the third-last line, what we see first is that it's a rootless container, because the container storage is in my home directory. And one thing I lost by trimming the line is that there is a conmon.pid file.

In case you ask what this is: conmon stands for container monitor, and it's a process sitting between Podman and the runtime. Podman is not calling runc directly; it instructs conmon to run runc. Conmon then double forks, in order to be able to run in the background and not have problems if Podman exits or is killed. It provides a socket that we can use for attaching: if you do a podman exec, Podman will actually attach to the socket of conmon, and conmon will stream everything out. Conmon is also used for logging. There are two drivers supported at the moment: you can either log to a file on disk or use the systemd journal directly, so then you can do your journalctl yada yada and look up what's going on. It also keeps a bunch of file descriptors and ports open; basically, we have to do this, otherwise the container cannot access the ports, for instance. And it also records the container's exit time and code.
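
Returning to `podman generate systemd` from earlier in this passage: a minimal sketch of generating a unit for an existing rootless container and registering it with the user's systemd instance could look like this. The function name, the unit file name and the install location are assumptions.

```shell
# Sketch only: generate a unit for an existing container and register it
# with the user's systemd instance. Names and paths are assumptions.
install_generated_unit() (
    ctr="$1"
    unit_dir="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
    mkdir -p "$unit_dir" || exit 1
    # --name makes the unit refer to the container by name instead of ID.
    podman generate systemd --name "$ctr" \
        > "$unit_dir/container-$ctr.service" || exit 1
    systemctl --user daemon-reload
    systemctl --user enable "container-$ctr.service"
)

# Usage:
#   install_generated_unit flock
```

The generated file is meant as a starting point; as said in the talk, it can be used directly or extended as a template.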
So, conmon. When we say Podman is not a daemon, it's factually true, but only because there is conmon. Some process needs to watch the containers, right? To see what's happening, record the exit code, keep the sockets open and all these things. Conmon is basically what prevents Podman from being a daemon. But to be fair, at least we believe it's the smallest possible daemon. It's 74 or 76 K now, in the 1.0.0 release, so it's pretty small. It's also written in C, not in Go.

The next use case for systemd is to have it in containers, right? Say we have a Dockerfile with yum install httpd. This requires systemd, because httpd is started by systemd unit files. systemd is a little bit special in the requirements it has for the mount points, for instance: it wants /run, /run/lock, /tmp and /var/log/journal as a tmpfs, and it needs a bind mount of /sys/fs/cgroup/systemd so that systemd can actually talk. If we don't have systemd in a container because the container engine doesn't support this, then we have a problem, and this problem existed for a long, long while. Basically, everybody was left alone with the task of writing init scripts. Everybody had to write some Bash scripts to start the services instead of just doing systemctl enable, maybe even in the build of the container, so that whenever you execute the container the service starts automatically.

For Docker, the team wrote the systemd OCI hook a few years ago. An OCI hook is a standardized way of telling OCI-compatible runtimes to execute certain binaries at a specific point of execution. You can compare it to certain steps in a compiler, certain passes in a compiler, and this is happening here as well. So at start and at stop the systemd OCI hook was being executed.
It was setting up the mount points, which Podman now does implicitly, and then did some cleanups as well. As I was already saying, Podman now has built-in support for this, because this is something we need. We don't want people to manually write init scripts, because this is a problem that has been solved already, and it would throw us back by a decade or so. And we don't want a container to be something special that people need to treat or think about differently. It's really just a glorified process on the host. This is why, and I already see the cool T-shirt in the back, containers are Linux, and this pretty much nails it. So there are no workarounds needed anymore; we can just install the packages. This is something I really find nice. It's not something amazing in the sense that it's innovative. It's just something normal, and this is the nice point of it.

A common feature that is being asked for by the community, especially by folks migrating from Docker, is whether Podman supports Docker Compose. Docker Compose is a declarative way to start a set of containers that basically compose a bigger service, right? You may want to start your web server independently from your database, because it makes sense from the microservice point of view, and this is a nice way to do it. But we don't. Why? Because Red Hat and the core team really believe that Kubernetes is now the de facto standard way to do such things. Compose does a great job, but we don't want to invest resources into supporting it. We're not closing doors, so if people want to contribute upstream, they are more than welcome to do it. But so far the need apparently hasn't been big enough to invest resources into supporting it officially in Podman.
However, there is a Python wrapper by Muayyad Alsadi, who is very active in the Fedora community as well, and apparently this works well. It has been received super positively by the community.

The alternative that Podman offers is that it supports Kubernetes YAML files. What you usually throw at kubectl you can now throw at Podman, but you don't need a Kubernetes cluster to run it; you can really use it locally. I find this amazing because sometimes we have to debug things, or I want to run a certain Kubernetes YAML file but don't want to spin up a big cluster, or maybe I don't have access to one. Then I can just use this. So it's a bridge between local development and the cloud-native world, and I find that really nice. If people have existing deployments that are based on Compose, there is another way to migrate to Kubernetes YAML, which is using Kompose, a tool under the official Kubernetes umbrella which converts Docker Compose files into Kubernetes YAML files. So there is a migration path between the two worlds.

So here we thought: if we can already read the file, why not generate the file based on an existing container? You may already have some containers running locally, and then you say, okay, I want to push this thing into Kubernetes now. You can do a podman generate kube on the container or on the pod, depending on what you have, and Podman will spit out a Kubernetes YAML file that you can use. Here I had to trim it a little bit. It's really just standard Kubernetes YAML. We have a pod, certain metadata, then a spec of it. We can see that the command is sleep with an infinity argument. We have the environment list. The image is Fedora 31.
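
The YAML from the slide is not in the transcript. The following hand-written, trimmed sketch resembles what is being described; the pod and container names and the fully qualified image reference are assumptions, and the environment list is left out. The two small helper functions (made-up names) show both directions: replaying YAML locally and generating it from an existing container or pod.

```shell
# Sketch only: a hand-written, trimmed Pod definition. Names and the
# fully qualified image reference are assumptions.
write_pod_yaml() {
    cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: demo
    image: registry.fedoraproject.org/fedora:31
    command: ["sleep", "infinity"]
    securityContext:
      privileged: false
EOF
}

# Replay a YAML file locally; no cluster is involved.
play_pod() { podman play kube "$1"; }

# The other direction: emit YAML for an existing container or pod.
export_pod() { podman generate kube "$1" > "$2"; }

# Usage:
#   write_pod_yaml demo.yaml && play_pod demo.yaml
#   export_pod some-container some-container.yaml
```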
We have a name, and it's an unprivileged container. I was hiding a lot of information that was not necessarily important for what I'm trying to say here, but this is really an easy way. As the first line says, it's still under development. This is not an easy thing to do, but we plan to make this a really stable thing. At the moment it will work for most things. Please don't nail me to it; you know, famous last words. But this is how the team and the maintainers envision the future and how they want to support different kinds of needs. The cool thing about Kubernetes YAML is that it's also a nice way to declare what you're trying to execute. You don't need a shell script anymore, or maybe even podman container runlabel, because everything is specified here. You can just put everything into a Kubernetes YAML file and do the same thing.

Podman checkpoint and restore. This is a feature that has been supported since the Podman 1.4 release. What you can do is: you have a container, you checkpoint it, and you restore it on another machine. So it allows for migrating containers among machines. Personally, I cannot go into all the technical details because I didn't implement it at all. It's a very complex thing, and Adrian Reber did this. We're using CRIU, which allows for migrating processes; a lot of things happen there in user space. What we can do here, as shown in the example, is: we run a container, we checkpoint the container, and we export it into a tar archive. We copy it to another machine; in this case it was just a virtual machine running on my notebook here. Then we restore it by importing the tar archive, and it will just start.

Yes, the use case... you mean a container export and import? All right. Here everything is frozen.
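
The migration example from the slide is not in the transcript. A minimal sketch of the checkpoint, copy and restore steps described above, assuming root privileges and CRIU on both machines, could look like this; the function names, the archive path and the host name are made up.

```shell
# Sketch only. Assumes root privileges and CRIU on both machines; the
# function names are made up for illustration.
checkpoint_to_archive() {
    # $1: container name or ID, $2: path of the archive to create
    podman container checkpoint --export="$2" "$1"
}

restore_from_archive() {
    # $1: path of an archive produced by checkpoint_to_archive
    podman container restore --import="$1"
}

# Usage (host name is hypothetical):
#   checkpoint_to_archive demo /tmp/demo.tar.gz
#   scp /tmp/demo.tar.gz other-host:/tmp/
#   ssh other-host podman container restore --import=/tmp/demo.tar.gz
```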
So the container, the processes, continue to execute where they were frozen before. Yes, you can also export a running container into a tar archive, but what is happening there? It's a very good question; I will remember it. What usually happens when you do an export, like a docker export or podman export, is that it looks at the current rootfs, commits it to a layer and makes an image out of it. So when you start the image, it will create a new process: it will execute the entry point and then the command. Checkpoint and restore really freezes the process, and on restore it continues executing at the point where it was frozen before. Does this answer your question? Okay, perfect. Thanks.

So much for Podman. There are still a few other tools, and I seem to talk very slowly. Some resources, if you're interested: upstream development and the community are on github.com/containers/libpod. There is a channel on Freenode, #podman. There's also a mailing list, which was introduced I guess a month ago or so, podman@lists.podman.io, and also the website podman.io. We try to blog there regularly and share resources from other pages as well.

And it's available on most Linux distributions. I cannot say all because I don't know all distributions, but I think you will see some major ones here for sure. There's Red Hat Enterprise Linux and Fedora. There's also openSUSE; I think our friends at SUSE are also planning to support it in SUSE Linux Enterprise, which is a really cool thing. It's on Manjaro, Gentoo, Arch Linux, Ubuntu and on Debian. But as I've said before, Debian is a really Herculean task, because all the dependencies that others have in the vendor folder have to be put into separate, dedicated Debian source packages. This is really hard because many tools share the dependencies.
So Docker, Podman, Skopeo, Buildah and whoever else will certainly have different versions. So you have to package different versions, and this is really a tough problem.

Now let's come to Buildah, or, as Dan would say, "builder". But I can't imitate his Boston accent so well. Buildah is dedicated to building container images. Parts of the source code are actually used in podman build. So now I'm trying to tell you why and when you should use Buildah instead of Podman: because it goes beyond the functionality that a docker build or podman build has, which solely work with the Dockerfile. It's meant to be used as a really low-level, coreutils-like tool for building container images. It's also developed upstream in the GitHub containers project and shares the image and the storage library with the other tools. Buildah supports using Dockerfiles with buildah build-using-dockerfile or, for the lazy ones, buildah bud. Like Podman, it can run rootless. It has the same architecture as Podman: no daemon besides conmon. It focuses on OCI standards and on open development as much as the other tools, and it's also targeted towards Kubernetes, or the build pipeline specifically. Buildah is offered on Quay, where you can download a Buildah image, and in a later talk today I will present how we can use it, or basically how Red Hat is using it internally for build pipelines. There are a lot of cool things we can do with it to speed up builds and secure them. So a common question is: does Buildah have a scripting language, perhaps a Buildahfile? And here I'm shamelessly copying Dan's joke.
Yes, there is: Bash. This is Buildah's ultimate scripting language. What I mean by that: we do a buildah from. It could be any image; here we build a new one from scratch. Then we mount the container, which is the same thing we've done previously for Podman. It's the same concept: we get the mount point of the working container that we use to create an image, and then we can do whatever we want on it. We can run dnf with an install root on this mount point and use the host's dnf and all the repositories on the host to install stuff there. So in theory you can create images without rpm, without yum, without dnf, with a very minimal footprint, or whatever you want to do on the container. Then we can unmount it and do a buildah commit to create a new image based on the current state of this container. So this is really envisioned as a way to create tools that are more complex and have specific needs, to use Buildah at a very low, maybe the lowest possible, level and build more complex things around it. Another cool feature that I like a lot, and that we constantly fail to advertise, is that Buildah supports including other Dockerfiles.
This is a feature that has been asked for, I guess, since 2013 or 2014 upstream at Docker, but the maintainers are very reluctant, which I understand: the Dockerfile syntax doesn't support include, and if they introduce it now, older versions of Docker will break. Dealing with backwards and forwards compatibility is a very tough thing to do. So we were thinking about what we can do. People still want to use Dockerfiles, and we still want to use Dockerfiles, but we also want to be able to include another Dockerfile. Why? Because we have a lot of boilerplate code. If you work with a lot of Dockerfiles, you might know it: there's a lot of dnf update, dnf install, then cleaning the cache, and most of them install common packages. You might not want to base them on one another, because maybe you want to make one layer out of them instead of squashing all the layers, for instance. So what we are doing here is we use the C preprocessor. This way we can include any file, because in line three you see the include directive, which is basically exactly what we want to do. The C preprocessor, among many other nasty things, does textual replacement: it takes the contents of the file that we're including and copies it in at that point. Yes? Yes, you can. This is why some people say we shouldn't do it, but it's really up to the user. So if you like CPP macros, you're free to do it.
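The textual replacement described here can be sketched in a few lines of Python. This is only an illustration of the idea, not how the C preprocessor or Buildah is implemented: the function `expand_includes`, the file names `Dockerfile.in` and `common.inc`, and the file contents are all hypothetical, and only the `#include "file"` form is handled.

```python
import re

# Matches a line of the form: #include "some/file"
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def expand_includes(name, files, _stack=()):
    """Return files[name] with every '#include "x"' line replaced by
    the (recursively expanded) contents of files[x].

    `files` is a plain dict standing in for the file system.
    """
    if name in _stack:
        raise ValueError(f"circular include: {name}")
    out = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            # Copy the included text in at exactly this point.
            included = expand_includes(match.group(1), files, _stack + (name,))
            out.extend(included.splitlines())
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical example files: shared boilerplate lives in common.inc.
files = {
    "Dockerfile.in": 'FROM fedora\n#include "common.inc"\nCMD ["bash"]',
    "common.inc": "RUN dnf -y update\nRUN dnf clean all",
}
expanded = expand_includes("Dockerfile.in", files)
print(expanded)
```

The boilerplate lines end up in the middle of the resulting Dockerfile, exactly where the include directive stood, which is the whole point of the approach.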
So we're doing it. This is something that anybody can do: you can run the C preprocessor on your system and basically use this and throw the result at Docker as well. What Buildah is doing, or also Podman in this case, is that whenever a file has the .in suffix, we preprocess it first. And this is a way to achieve it. Again, it's the same philosophy behind it: we don't want to reinvent something, we use something that's already there, and the C preprocessor is basically as old as Unix. You can even install it on Windows. So this is something really nice, and I like it because it's a very approachable way of achieving the task.

For Buildah, the same as for Podman: it's upstream on GitHub in containers, we're on Freenode, it has its own website and also a mailing list. Feel free, and invited, to join. It's available on the same Linux distros as Podman.

Last but not least, Skopeo. Skopeo is a tool for managing and distributing container images. It's basically the first tool of the github.com/containers family, and I think it's the most widely used one, not only due to its age, because it's older than all our other tools, but because it seems to address a really serious problem. It's used in many non-Docker pipelines to push images. For instance, in the Open Build Service of SUSE and openSUSE, they use umoci for building container images, but then they also need to push them to a registry, right?
We need to make them available, and Skopeo does a very good job at that. Originally, Skopeo was born from the desire to inspect remote images. I guess it was in 2014 or so that Antonio Murdaca, a colleague from Red Hat, opened a pull request at Docker adding a docker inspect command for that. What he wanted to do is contact the registry, download the config and metadata, and display it. The maintainers liked the idea but still rejected the pull request, because they said, well, sorry, the command line is getting more and more complex, and we understand this. But they said, well, a container registry is nothing but a web server, so in theory you can curl everything. And then Antonio sat down and said, okay, cool, I'm gonna do it, and this is how Skopeo was born.

So here's an example. We do a skopeo inspect docker://, where the docker prefix means we're talking to a Docker registry. There are different transports that I'm gonna present in the next slide. We run it on the fedora latest one, and then it spits out a bunch of information that we can use for post-processing or just for exploring what's going on. How does the image look? We also see the layers. So in theory you can write a Bash script around skopeo inspect which does something like a podman image tree. Here you see the degree of interoperability I was referring to before, because all the tools share the same libraries.

Skopeo supports multiple so-called transports. When you do a podman pull, it uses the docker transport for pulling the image into the containers-storage one. Containers storage is the local container storage. We support different drivers, overlay or Btrfs. In fact, back then it was a fork of the Docker code. We can't use the storage library of Docker or containerd directly because our tools are not daemons. The tools are daemonless, besides CRI-O. So when we have to synchronize, we cannot use in-memory mechanisms like a semaphore or a mutex.
We really have to go down to the file system and use file locks for it. So there was a lot of refactoring going on to support use cases where the tools run in parallel. This is the price you have to pay when you're not a daemon. It also supports a directory transport, which is a non-standardized way to explode an image into a specific directory. There you can explore it; you can check out the manifest of the image and things like this. Besides docker, we also support OCI, which is basically an implementation of the OCI image specification, and it can also be compressed into a tar archive, like docker save or podman save. And last but not least, there's also support for OSTree. So the different transports give a lot of flexibility.

It works rootless where possible, so you don't need root for Skopeo. Certainly there are limitations: if you want to talk to the Docker daemon and copy an image from there, you need root, because the Docker daemon in this case requires root. And it's a non-opinionated way of managing images. There's copy, inspect and delete: very limited functionality, so users can build something more complex around it. I'm repeating myself quite a lot with this, but they all share the same library. So if you do a podman pull, it's basically the same as a skopeo copy docker://yada-yada containers-storage:yada-yada. And it's easy to integrate into tool chains. Here you can inspect, for instance, the Fedora Rawhide image and just use jq to inspect the fields of the JSON. Same as before, but Skopeo does not have a dedicated website.
Everything is upstream on GitHub. If you want to reach out to the developers, you can use the containers channel on Freenode, and it's available on basically the same Linux distributions, but Debian already has it in the main repositories.

So I started the talk with the big, huge Swiss Army knife, which does things well, but it has some side effects or some consequences. Maybe it's security; maybe you just don't have root on your system. So the philosophy of Red Hat there is to have smaller, more specialized tools, so you can really choose based on your use case. And that's pretty much it. Do you have questions?

Yes? Yeah, that's pretty straightforward. All of the different drivers use the same interfaces, and it's no rocket science. There are a few things when it comes to layers: you have to store a layer, you have to compute the diff between two layers, you have to apply it, you have to extract it. Sometimes, depending on what you want to do, you may want to compress or decompress it. But extending it for new drivers is something I can only encourage. If you have a use case, we would be happy to know it. The same applies to the different transports in containers/image, for instance. It's a fairly stable interface that we have there. Sometimes we need to change it, but if it's upstream, we commit to maintaining it. So basically it's the same thing as with the Linux kernel: as soon as you get it upstream, the maintainers will take care of it. Any more questions?

All right, so I guess we are going to hang out for the next hour as well, because unfortunately Dan Walsh is not here. So after that, if you see him, please remind him he owes me one or two beers. Oh, we still have a few minutes; if you want to use the restrooms or refresh yourself or get some drinks, I'm gonna wait for you. All right.
I think the talk was supposed to start three minutes ago. So, as you see, I am a taller, younger version of Dan Walsh with a little bit more hair. Unfortunately, he didn't make it. He had some problems traveling because he wanted to travel to DevConf in India, and something went wrong with the visa, and now he's stuck in Boston. But I'm Valentin. I'm working in Dan's team on Podman, Buildah, CRI-O, basically a little bit on all the things. I'm more of a generalist. So I hope to replace Dan. If you see him, remind him he owes me a beer; he will know why. Oh, I'm gonna throw that at his head. I will have a lot of fun.

So in the talk before, I was talking about the somehow untold features of Podman, which are mostly things that Docker doesn't support. Podman and Docker share a lot of features on the CLI. When Dan's team came up with the idea of creating a daemonless container engine, the decision was clear to also basically imitate Docker on the CLI. Why? Because we're all used to it. They did a great job, everybody knows it, scripts already shell out to it. So just sticking to it makes sense; it's a de facto standard CLI. In this talk, Dan wants to show how Podman works and how you can migrate from Docker to Podman, explaining a few of the technical details of Podman, explaining the architecture, and how Podman uses user namespaces to implement rootless containers. So if you don't like the talk, you have to complain to Dan.

So let's get the demo started. Let's first execute everything as root, as we're used to. Here we see the version.
We have a remote API version. We see the Go version, which is basically also used a little bit for debugging, and the OS and architecture that are being used by the Go compiler. The remote API is implemented in varlink, and we're really committed to making this a stable thing for Podman. Cockpit, which is also used in Fedora, is basically using the varlink API. So this one, excuse me, I have to make it a little bit shorter. So now, yeah, there we can do it.

If we do a podman info, it's similar to docker info: it displays basically most things that Podman uses for execution, and also most things we need to understand in bug reports. What we can see here is conmon. Those who attended the talk before know conmon already. Conmon stands for container monitor, and it's a small, small binary sitting between Podman and the container runtime, for instance runc. What it does is keep a socket open that Podman can attach to; for instance, when we do a podman exec, it uses this socket. It also keeps a bunch of file descriptors open, for instance to keep ports open, and it records the exit code and puts it into a file, so Podman actually knows what the exit code of the container was. In the case of Docker, for instance, containerd does this and then reports it back to Docker. It also does logging. So when saying Podman is daemonless, it's factually true, but a little lie. Just a very little one, because from a technical point of view we need some process to monitor the containers, and this is what conmon is, only this. However, we believe conmon is the smallest daemon possible for this task; it's 76K big, so there's not much to it.

A lot of other things: we have the OCI runtime, the path to it, also the version and the commit. Unfortunately, there's still no stable, no 1.0 release of runc, because things are popping up all the time and are blocking the release. So Docker is using a different version of runc
than Podman, and maybe CRI-O is using another one, because things are changing rather quickly. When something is changing quickly, you may need to pin to a specific commit where you know it is just working. containerd does the same. So this is why we display this information as well. And a lot of other things: the uptime. Yesterday it looked nicer, I had around 10 days, but I rebooted this morning.

It also shows a bunch of registries and search registries. Search registries are something I like and hate at the same time. I like them because they're nice for users, and I hate them because it's really painful to develop and maintain the code in the background. When you do a docker pull alpine, Docker is doing a lot of things for you, because it's resolving the name into docker.io/library/alpine:latest. We wanted to have the same thing also for other registries. So when you do a podman pull yada-yada and we don't find yada-yada on docker.io, Podman, or basically the containers/image library, will iterate over all items in the search list you see here and contact the registries one after another. So it will ask the Fedora registry, do you have yada-yada, and then keep iterating. It's something you can configure, so it's really nice to use, but it adds a lot of complexity in the code, because there's a lot of special casing, and docker.io is always something special and will always be something special. Then we have a bunch of config switches and options for the container storage.
So there's a containers storage.conf that can be configured for all the tools that are based on the containers storage library. There you can alter, for instance, the paths where images are being stored and where containers are being stored. You can control which storage backend driver the tools should be using, maybe Btrfs if you're running on openSUSE, maybe overlay if you're on XFS or ext4, and you can also point it to the mount program that is being used, which is something we need for rootless. I'm gonna talk a little bit more about this later, in the following talk, where we go into the details of Buildah. Same for the image store. Well, I guess you get the idea of what you can do and what you can see.

So now I'm gonna increase the font a little bit. Here we have a Dockerfile: FROM alpine, we can set the environment or a label, so add some metadata to it, and the important part is the bud here. Everything before that you can more or less ignore for now; this is just to run the demo. But the bud basically instructs Podman to build the Dockerfile that we're seeing above. I was talking too long, so I need to re-enter my password. Here we can see that the Dockerfile is parsed and everything is run. We're pulling at the moment, and, well, we have a new image. The last digest we are seeing here is the image ID of the container image that we were just building. Can you still read it? Okay, perfect.
Then I might just keep this font size. Next one is images. Like docker images, we have podman images to have a look at the container images that we have here. rmi stands for remove image, and one thing that has been asked for upstream at Docker for a long time is: please give us a switch to remove everything, because it's a common need, because things get messy quickly. But the maintainers didn't want it. Podman supports it. It's trivial: you list the images, you iterate over them, and then you remove them. So it was no rocket science, but something that really improves the usability of the tool. Here we see the two images we had before, the localhost my-image one and the alpine one, and we see that both have been removed.

Now Dan wants to do more cleanup. podman rm does the same, but for containers. We remove all the containers. We had one for building, and we executed a few before, so basically these are the leftovers we have.

Now the nice part. Everybody likes Podman because we can execute it as non-root, and lately we have been approached a lot by the HPC community, high-performance computing. They can't really use Docker because they don't get root. In a big HPC environment, without root you can do nothing, and for Docker, at least for a long while, you needed root; by now it also allows per-user execution. Nonetheless, there's a lot of interest in rootless, and it's really cool to see the different use cases they're having. So let's dive a little bit into it. We can do the same: we pull an image. I hope you're not downloading too much here. Okay, I already had it here. You can see in the output that the pull, or the download, has been skipped because the image already exists in the destination transport, which in this case is containers storage. This year we have been working a lot on improvements for pull.
It's easy to compare performance, especially when you migrate, and, let's say, internal customers at Red Hat said: well, it's working, but the pulls are really slow, and that's a bottleneck. What we were doing before is that all layers were pulled one after another. Sorry, I'm missing my English at the moment. So the obvious thing was to parallelize everything, but this was a rather non-trivial change, because the tools, Podman, Buildah, CRI-O and Skopeo, all share the containers/image library, and, well, there was a lot of synchronization going on. But now we're faster than Docker, which is cool, also considering that it's a daemonless architecture, so the process has to be initialized first. And yeah, that's really nice.

So, rootless. We can also run a container. We can now show all non-privileged containers. If you run podman ps, you can list the containers; -a will list all of them, so also the ones that are not running anymore. You can't, however, see containers from root. It's different storages; we don't have access there. So this is something that is different: if you pull an image for root and pull the same image for non-root, for a user, you have to pull it twice, because it's just different storages. Same goes here for podman images. Now we do a sudo podman images, and we see, well, they're not having the same images. Well, okay, they're close to the same, because I'm running a lot of root and rootless with the demos here.

So now Dan wants to show a little bit behind the curtain. How is Podman using user namespaces? How is this working? I'm gonna read a little bit: the demo will now unshare the user namespace of a rootless container using the buildah unshare command. What buildah unshare does is create a new user namespace where we have more privileges than in the current one. Podman now supports the same. Actually, [name unclear], you added podman unshare, right? That's cool.
So first we have a look outside the container and look at /etc/subuid. What we're seeing here is the UID map that the user, in this case valentin, which is me, has been assigned. The first item in the colon-separated list is the UID or the user name. The second one is the start of the UID range that we can use in user namespaces. So when we create a new user namespace, this is basically the starting ID that we can use, and the last item, the 65k, is the range, or the number of UIDs that can be assigned. So zero will be 100k, one will be 100k plus one, and this goes up to 165k, and so on.

Now we create a new user namespace. We're leaving the current one to create a new one, and if we now have a look at the UID map that is assigned to me, we can see that zero in the user namespace, so root in the user namespace, is UID 1000 outside the user namespace, which happens to be me. So what we're basically seeing here is that inside the user namespace the process is root and has root privileges inside this user namespace, but outside, in the parent user namespace, it has user ID 1000. So even if we manage to break out of the user namespace, we can't do many nasty things, or only as much as the user can do in any case. All right, and then we have the ID one, which starts exactly the way I described before: one starts at 100k, and then all the following 65k.

So I think now Dan wants me to exit. We clear, and now we're gonna look at the Podman user namespace support. What we're seeing here is something we can also control on the CLI. Here we have another UID map, a new range starting at 100,000 with a length, or range, of 5,000. We do a sleep and create a new container, and now I may regret it, because I should have cleaned up my processes, so I'm not sure if the next call will actually succeed. But what we're doing now: we use the podman top command on the latest container.
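The mapping between IDs inside the user namespace and IDs on the host can be written down as a small calculation. The following Python sketch assumes the layout shown after entering the namespace: root (0) maps to the user's own UID, and container IDs from 1 upwards map onto the subordinate range from /etc/subuid. The entry `valentin:100000:65536` is an assumed example (the talk only says "100k" and "65k"), and the helper names are hypothetical.

```python
def parse_subuid(line):
    """Split one /etc/subuid entry 'name:start:count' into its fields."""
    name, start, count = line.strip().split(":")
    return name, int(start), int(count)


def host_uid(container_uid, user_uid, sub_start, sub_count):
    """Translate a UID inside a rootless user namespace to the host UID.

    Assumed layout:
      0             -> user_uid (the unprivileged user on the host)
      1..sub_count  -> sub_start..sub_start + sub_count - 1
    """
    if container_uid == 0:
        return user_uid
    if 1 <= container_uid <= sub_count:
        return sub_start + container_uid - 1
    raise ValueError(f"UID {container_uid} is not mapped")


# Assumed example entry; the exact numbers on the slide may differ.
name, start, count = parse_subuid("valentin:100000:65536")

print(host_uid(0, 1000, start, count))      # root inside is the user outside
print(host_uid(1, 1000, start, count))      # first subordinate ID
print(host_uid(count, 1000, start, count))  # last mapped ID
```

Anything outside the mapped range simply has no counterpart on the host, which is why the sketch raises an error for it.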
So on the last created container, we're going to display the user and the host user, huser, and grep a little bit on it. Oh nice. Okay, we see that the host user is 100k, which is exactly what we've been specifying in the UID map above, but it is root inside. Later I'm gonna show a little bit more about podman top. Podman top has been extended quite extensively to make it easier to explore what's going on in the container. And if we do the same on the host, we see the process running on the host. This is the ID, here we have the PID, and this is basically meant to illustrate how a UID map works. We can do the same starting at 200k, same as we've done before, and here we go. Any questions? Perfect.

All right, let's dive a little bit into Podman's fork-exec model. This is basically the biggest differentiator between Docker and Podman. Docker has a client-server model. Whenever you execute docker run, you're not only talking to the docker binary. In some sense yes, but there's another one called dockerd, the Docker daemon. What the Docker client does is send a remote procedure call to the daemon, and the daemon will do all the heavy lifting for it. In one sense this is great, because everything is centralized in the daemon, all state is there. It makes development faster, and it also removes a lot of potential sources of bugs, mainly because Podman and the others have to synchronize on disk, right? We can't use mutexes, we can't use semaphores, all the cool things that Docker can do, we can't. However, it comes at a price.
It makes everything a little bit slower, and it's harder to integrate into existing Linux systems. For instance, when you want to use it in systemd unit files, you cannot really use the cgroup restrictions on it, because they will only be applied to the client process, but not to the daemon and hence not to the container processes. So this is a little bit hairy. Podman has a fork-exec model, which allows for an easier and smoother integration into the system. The same applies to audit, for instance. I guess Dan will show it in a few minutes here in the demo. So all containers executed by Podman are child processes of Podman itself, and this changes everything. The loginuid will be following them, so we can see who has been executing what on the system, it's easier to use in systemd, and so forth.

So, yeah, Dan will tell it now. If we look at the loginuid, it has the ID 1000. The loginuid is something that is set once for a process and can never be changed. It's actually in the proc structure of each process, and the Linux kernel will make sure it will never be changed again. And this is following me as soon as I log in. So whenever I log in to my machine and execute something, this loginuid is attached to all processes; everything I do is attached there. And this is what I'm going to show now. If I run a Fedora container with Podman and print the loginuid, well, it's the same. If I do the same with Docker... Ooh, now I should make sure to also start Docker. If I do the same with Docker, it's like, holy crap, what's that? Well, the parent of the Docker daemon is systemd, and systemd is the init process. So there is no login attached to it, and the value we see here is basically an overflow. This is something that can never happen on the system, and this is what I mean.
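The strange value shown for the Docker container can be made concrete with a short Python sketch. It assumes a Linux kernel with audit support, which exposes the login UID in /proc/<pid>/loginuid, and it relies on the fact that an unset login UID is reported as 4294967295, that is (uid_t)-1, the largest 32-bit unsigned value. The function names are hypothetical.

```python
UNSET_LOGINUID = 2**32 - 1  # 4294967295, i.e. (uid_t)-1: no login attached


def interpret_loginuid(raw):
    """Return the login UID as an int, or None if it was never set."""
    value = int(raw.strip())
    return None if value == UNSET_LOGINUID else value


def read_loginuid(pid="self"):
    """Read the login UID of a process from procfs (Linux with audit only)."""
    with open(f"/proc/{pid}/loginuid") as handle:
        return interpret_loginuid(handle.read())


print(interpret_loginuid("1000\n"))       # a process started from a login session
print(interpret_loginuid("4294967295"))   # a process started by init

try:
    print(read_loginuid())
except OSError:
    # Not on Linux, or the kernel was built without audit support.
    print("loginuid is not available on this system")
```

A process forked from a login shell inherits the first kind of value, while a process started by a daemon whose parent is init reports the second, which is what the audit example builds on.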
It comes at a price to have a daemon. So, for instance, in the audit subsystem, let's say we want to secure or watch /etc/shadow, right? There's a lot of sensitive information there, and we might want to know what's going on in /etc/shadow, because if somebody has write access to it, well, then they can log in to the system and do whatever they want. So let's put it under audit control, so the audit subsystem will watch it. Now we're going to create a Fedora container, volume-mount the root of my host into a path in the container, and touch /etc/shadow. When we now use ausearch and check what has recently happened on the /etc/shadow file, well, we see: oh, valentin has done something there. So it's being logged. The system can map it to my user, and as an admin, if I want to see what has happened on my system, I know who did it. So now it's me to blame. If I do the same with Docker and check who has done it: it's unset. Well, this is a problem, and this is actually a blocker for using Docker in certain environments, because certain security certifications actually require that everything is auditable.

Yes, now we come to the top features that I was mentioning before. Let's first start a container, and now we use podman top to display the PID in the container. Maybe we want to do some debugging or understand what a container actually is; then it's really helpful to use podman top with hpid. We map it to the corresponding PID on the host, which is basically the PID in the parent PID namespace. This is pretty nice. We can also list the SELinux label. We can check if a seccomp filter is enabled for the process, which can be useful to see if the seccomp profile that I'm writing and attaching to the container is actually effective. We can also list the capabilities.
So there are a lot of different capabilities that a process can have. One of them, which is the biggest one, is CAP_SYS_ADMIN, which is close to being something like root. In the next talk I'm gonna talk about how we support overlay for rootless containers, because mounting overlay requires CAP_SYS_ADMIN. All right, this is a demo, so something must have gone wrong here. But one thing I want to look at is podman top. Yeah, we still have a lot of time to look a little bit at what podman top supports.

Here it's a little bit funny, because podman top is actually a ps, but podman ps shows the containers. So, yeah, well, it's an inheritance from the Docker CLI, but I wouldn't have known any other way. In case you wonder why I'm now comparing it to ps(1): that is actually what was happening before. docker top, or runc, basically execute ps and do some parsing and try to map the PID in the container to the PID on the host and things like this. ps is nice, but it's really old, and it's not meant to be used in the way we're using it here, because it just prints things, right? It prints the output in a nice tabular form, but the columns are just split by white spaces. So it's very ambiguous, in a way that we cannot figure out where the border of a column is, which is pretty much a blocker for certain combinations. For instance, the command, and especially the arguments, can certainly have white spaces, right? So we cannot just split at white spaces, because otherwise we have a pretty ugly table in the end.
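The ambiguity can be shown with a tiny Python sketch. It compares a whitespace-separated, ps-style row with the NUL-separated form that Linux uses in /proc/<pid>/cmdline. The sample row and both helper functions are made up for illustration and are not taken from any of the tools mentioned.

```python
def split_on_whitespace(row):
    """Naive column split, as one would do with plain ps output."""
    return row.split()


def split_cmdline(raw):
    """Split NUL-separated arguments, the format of /proc/<pid>/cmdline."""
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the terminator after the last argument
    return [arg.decode() for arg in raw.split(b"\0")]


# Hypothetical process: USER=root, PID=1, argv = ["sh", "-c", "echo hello world"]
ps_row = "root 1 sh -c echo hello world"
cmdline = b"sh\0-c\0echo hello world\0"

fields = split_on_whitespace(ps_row)
args = split_cmdline(cmdline)

# Seven fields: nothing says that the last three belonged to one argument.
print(len(fields), fields)
# Three arguments, with the embedded spaces preserved.
print(len(args), args)
```

With the whitespace split there is no way to recover where one column ends and the argument list begins, whereas the NUL-separated form keeps each argument intact.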
With other tools, ls for instance, you can split everything by null bytes, which is nice, which is unique and very easy for formatting. Long story short, we had to implement our own ps, and now we're parsing everything in procfs, which takes a little bit of time, but there's no other way on Linux than parsing procfs. But it allowed us to write the library, the psgo library, with containers in mind. So it's aware of what a container is; it will join the necessary namespaces to extract the data, and so on and so forth. At the beginning we have a list of all supported descriptors. Here we have the arguments of the command, the bounding capabilities, the effective capabilities, the inherited capabilities and the permitted capabilities. Don't ask me the difference. I always have to read the man pages, and every time I feel stupid because I don't understand them at the beginning. I somehow always look at the effective ones, because those are the ones that are actually effective for the process. We have the command, we have some time fields, the group, the host group, labels and so on. You can use them in different combinations. If you're running on Ubuntu or on Debian, you can also inspect the AppArmor profile that is currently attached. If I recall correctly, this should also be, yes, that's also hidden behind label. This is why we say security attributes and not SELinux, for instance.

So Podman: the name Podman actually stands for pod manager. What is a pod? A pod is a concept that we inherited from Kubernetes. It's a group of containers that share certain resources, most likely, sorry, the PID namespace, the network namespace, and then, depending on what we want, whatever else is to be shared. Podman has supported this concept basically from day one. Here we can create a new pod, which is called podtest. Now we create a new container and attach it to the pod. I think this is actually wrong.
This should just be a podman create. I guess that's a typo in the echo statement here. So if we do a podman create --pod, we can tell Podman to create a new container and put it into this pod. We do the same again, and if we do podman ps we will see that... oh, right, there's no container running, because we just created the pod. We created the containers, but we didn't start them. So now if we start it... I don't know what's wrong in the script. If we do a ps now, we would actually see that two containers are running. I replicated it locally and it's running, so there must be a bug in the demo script. Just ignore it. Yeah, now you have to trust me that the two containers are starting.
So the concept of pods is really nice. Many users approach us asking: how can containers communicate? How can I name containers? Because Docker supports creating new networks and attaching names to them, and it's able to map a container name to basically the IP of the container on the network, so you can talk to, or ping, the container name, colon, and then the port. This is something Podman does not yet support. Why? Because CNI, the Container Network Interface, doesn't support it yet. It's an open standard, but we're working hard at the moment, and it's prioritized very highly, to get it working. Long story short, our answer is always: put them in a pod and then talk to localhost and then the port on localhost. So this is a little bit hacky. Well, it's not really hacky, it's just a different way of letting containers talk to one another. I would argue that if you need containers to talk to one another, then it may be wise to put them in the same pod as well. Kubernetes does the same.
And now we can stop it. Well, maybe I was talking too long and the pod doesn't exist anymore, but here you can see it really doesn't exist anymore.
We remove it, relist, and there is no pod anymore, and this is the end of the demo. I'm not Dan, so I'm not sure what else he wanted to talk about, but maybe you have some questions.
All right, the question was whether I can also create a new pod at container creation. In the demo I was first creating a pod and then creating containers, which were then part of this pod. Yes, you can. podman create and podman run basically share the same options, the same flags, so you can also do it for podman run. So if I do now a podman run... Usually I don't want to do demos live, because I'm not as much of a cowboy as Dan is. But we're in this together, so you can help me. Right, so here I run... it should be like this. We run a new container, and a run is basically a create and a start at the same time. We say we want to put it into the pod which is called live. If I do a podman pod list we see bar and foo, but live doesn't exist. And we run it, we detach from the container immediately, and run top. All right, I thought it would work. I was certain it would work. Either I was wrong, or we added support for it at a later point. Let me try out something. So this is currently not possible, apparently. I was convinced we support it. Any other questions? Yeah.
So the question was how much work is going on in the Linux kernel to better support rootless containers. Honestly, I'm not able to mention all of it, because I just have a limited focus, and I'm using the kernel, I'm not working much on it lately. But there is a lot of stuff going on. A big problem always seems to be file systems. Overlay, for instance, requires CAP_SYS_ADMIN, so it's not usable by a non-root user, which is why we have an overlay implementation in FUSE, in user space. So there are a lot of workarounds going on. Then one big topic that is also going on is: what is a container?
Right. There is no such thing as a container in the Linux kernel, which is also why there's the cool t-shirt, I'm not sure if it's still there, with the "containers are Linux" logo. A container is nothing special. In the end every process is in some namespace and in some cgroup, but the ones for containers are just a little bit different. However, it would be nice to know which processes are running within a specific container. So there is the idea of introducing the concept of a container ID: when you create the init process of the container, so the first process in this namespace, you attach an ID to it, similar to the login UID, which cannot be altered anymore. There's a lot of work going on from Red Hat engineers, but also from Canonical and from SUSE. So there are a lot of people interested in it, but it's hard to find consensus on how to do it and where to do it.
So I think, Divyansh, I guess it's tomorrow or on Saturday, the Google Summer of Code presentations? On Saturday. So here in the front row is sitting Divyansh. He was working with Dan, Giuseppe Scrivano, and me this year on a Google Summer of Code project, and the idea we had was basically to add another feature to Podman. Podman already allows you to generate systemd unit files, and it allows you to generate Kubernetes YAML files. And Dan is all about security, right? So we thought: how can we make Podman more secure, how can we help users secure their containers and lock them down a little bit more? So we also wanted to generate a seccomp profile.
So what seccomp does: it's basically a filter mechanism in the kernel where you can configure system calls to be allowed or forbidden, and this can go down to a very fine-grained level, also down to the arguments. This is basically what every container engine out there does, and the most portable way is to have a whitelist approach. So this list includes all system calls that a container can execute. There's a default one that was created a few years back by Jessie Frazelle; at the time she was working at Docker. This was a really herculean task, because, as you can imagine, this standard whitelist is currently used for basically all containers out there. So finding a set of system calls that every container can execute without breaking is really, really tough. All other system calls are implicitly forbidden, so it's a way of having yet another onion layer of security around the containers. But for sure, to support all containers, or nearly all workloads out there, it's very loose, right? Do you remember how many system calls are enabled? It's a hundred eighty or something like this? All right, so it blocks 44. It really depends on the kernel version that you're using. But it's obvious that we can erase many of the two or three hundred system calls, depending on the kernel version. We don't need everything. If you really want to lock things down more, it would be nice to have an automated way of doing it, because it's non-trivial to figure out which system calls might be executed within the container.
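As a rough illustration of the whitelist idea, here is a small Python sketch that turns a recorded set of system call names into a profile in the JSON layout used by container engines (a default action plus a list of allowed names). The helper name build_allowlist_profile and the observed list are hypothetical, the architecture entry is an assumption for x86-64, and this is not the code from the Google Summer of Code project.

```python
import json


def build_allowlist_profile(observed_syscalls):
    """Return a seccomp profile dict that allows only the given syscalls.

    Hypothetical helper for illustration: everything not listed falls
    through to the default action and is rejected with an error.
    """
    return {
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": ["SCMP_ARCH_X86_64"],  # assumption: x86-64 host
        "syscalls": [
            {
                # Deduplicate and sort so the output is stable.
                "names": sorted(set(observed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Made-up trace of syscalls seen while a workload was running.
observed = ["read", "write", "openat", "close", "exit_group", "read"]
profile = build_allowlist_profile(observed)
print(json.dumps(profile, indent=2))
```

The hard part is not writing the profile but collecting a complete list of observed calls, since any call missed during recording is blocked later.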
Yes, you can strace, but I guarantee you will miss something, because there's still runc, and runc will create the process, and there are certain system calls that are required for runc to then create the init process. Long story short, we were working on that with Divyansh, and Divyansh really did an amazing job there. We have an open pull request at the moment that we're working on to support exactly that. And to come back to the concept of having a container ID in the kernel: this would be amazing to have, because of how Divyansh solved it. We were looking at a few mechanisms to do it. We could use ptrace, but this is slow and has an impact on performance, which may have some side effects on the control path and the execution paths within the container, so we might miss things. And then Divyansh said, why don't we use eBPF? Everybody's talking about it, so let's do this. So we were looking at that, and in eBPF we have access to a lot of data of a process in the kernel. And we were trying to figure out: now we have this eBPF filter, and we want to log the system calls that are executed by a given process, but which one is inside the container?
Having an ID in the proc structure of a process in Linux would be ideal, because it's one comparison. It's super cheap, and we can be certain that this is a process in the container. There can be dozens or hundreds, at least in theory, of processes running within the container, and we need to figure out: is this process on the host, in another container, or something completely different? So we had to work around this a little bit and do an approximation based on the namespaces and the IDs. So if processes inside the container create new namespaces, sorry, we might be missing information. But this is as close as we can get.
So this is, from our perspective, a very recent story, and it's still ongoing at the moment, where work is going on in the kernel. It's not yet finished, and it's hard to find consensus, because there are many people with different views, different use cases, different opinions, and finding consensus there takes long. I guess the first article on LWN is already three or four years old. And we were full of hope, like, okay, maybe perfect timing and this gets into a kernel, I guess it was 5.2, but then Eric Biederman... sorry, no, some folks from Canonical, from LXD, they said no, they want to rethink how they're going to do it, which is great. They're doing a lot of great stuff.
A lot of other work is going on with respect to security. At SUSE, for instance, there's Aleksa Sarai, who is working a lot on security, in particular for containers. And a lot of attacks recently came by escaping via the file system. So you escape the container at specific execution times, for instance via symlinks. Or there was another one.
I had to read up on how exactly it works. Earlier, after the Chaos Communication Congress, there was a CVE released. The problem was that runc was dynamically linked, and you still see runc in the container, and if you handcraft a container image with a malicious libc, then with a few nasty tricks you can basically escape the container and then have access to the host. So there's a lot of work going on in locking things down more. I would say in the kernel there are two big things: rootless, how to get this done, and then there's also security.
So the question was... well, there was a lot of information, actually. I think last week we were in a call. I didn't see the face, but I remember your voice. But I was silent, because Giuseppe is doing all the cgroup work at the moment. So if I rephrase, and please tell me if I rephrase it correctly, it is: which kernel APIs are available, which things can you expect to be available in a container? I would say everything and none, because it's really hard to say what's inside a container and what is a container. It depends on what we're talking about. If it depends on the mount points, this is something we can more or less control. If you're looking at the syscalls, then it depends on the kernel version that we're running and on the seccomp filter that we're using at the moment. And with systemd, I'm not entirely up to date with the recent work on cgroups, but as far as I understood, one problem is that cgroups v1, as you mentioned, doesn't allow delegation, which is something that cgroups v2 supports, but then we have the problem that runc doesn't support it. However, with Fedora 31 we will enable cgroups v2, and then things have to change. How it looks is that runc will be too late for Fedora 31 to support cgroups v2.
So we will switch over to crun, which is an implementation also by Giuseppe, who wrote an OCI-compliant container runtime, but fully written in C. runc is written in Go, and Go has a lot of limitations, in the sense that it doesn't allow for fork and things like that. Then the runtime has some implications, because there are some routines running in the background, so you cannot unshare in all circumstances. So there are some weird tricks needed for runc, to execute runc and then exit and then do basically a double fork, and things like this, which are not that hard to do in C. So with respect to systemd support in the containers, not everything is supported, as you just mentioned, because there are just some limitations. But with the cgroups v2 work, I guess things will improve dramatically. I'm not working on it, so I cannot tell you for sure. Giuseppe is definitely the person to talk to. My gut tells me I would be surprised if we wouldn't use systemd for it. So there are basically two ways of configuring cgroups, or two managers, as we call them. One is cgroupfs and the other one is systemd, and I'm pretty sure we will use systemd also for all the delegation there.
All right, any other questions? I guess you need coffee as much as I do. So I hope to see you in 35 minutes, then I'll be talking a little bit about Buildah. Thanks.