Okay, next up we've got Ralph Haferkamp, who's going to be talking about Kata Containers on openSUSE.

Yeah, hello and welcome. Nice to see so many folks here. My name is Ralph Haferkamp, I'm working at SUSE. I recently joined the containers core team there, and I've brought with me Dario Faggioli, who works on the virtualization team at SUSE. Together we are going to talk about Kata Containers and what we are doing with them at SUSE, or how we integrated them into openSUSE. But first we'll give a short overview of what Kata Containers actually is and how it works.

If you go to the Kata Containers website you will see a phrase like this: it's a container runtime that provides stronger isolation by using hardware virtualization technologies. The sentence goes on with some more words, but this part is the most important one. To show you what that means, let's first take a look at how containers normally work. In the normal case you have your host with the Linux kernel, and every container is basically just a single process which is isolated from the host system by utilizing things like cgroups, seccomp, AppArmor or SELinux, namespaces and things like that. But they are all still sharing the same kernel, they are all running on the same host, so it's not as well separated as it could be.

In the Kata Containers case, Kata adds another layer of security around that by running all these containers inside a small virtual machine. So every container is a really tiny virtual machine with a separate kernel. Inside that virtual machine the containers still use the standard security mechanisms that containers usually rely on, like namespaces and so on, but as an additional layer you have the hardware virtualization, which isolates things even more from the host system. Dario will have some details about what additional security that actually provides.

Yes, so very quickly, as Ralph said, Kata Containers adds virtualization to the container picture. Why do that? Why add virtualization and not use just containers? Well, if we think of a scenario where potentially untrusted code is running inside your containers, then one reason is attack surface. In a standard container setup, the attack surface from inside the container towards the host, if malicious code running in the container wants to attack the host, is the system call interface of the shared kernel, which is huge. Of course you can restrict it, but it is still one kernel shared among all the containers. With virtualization, on the other hand, the attack surface is that of the hypervisor, and even if you add the device model to that, so all the abstractions needed to create virtual devices, the attack surface is still smaller. Another point is defense in depth: with virtualization in the picture, if something running inside a Kata container wants to escape and do harm at the host level, it needs to escape two layers, containers and virtualization.
So, as a matter of fact, what you get is improved isolation, as Ralph already said. For example, if someone manages to escape from the container and to exploit a vulnerability that crashes the host, then in a standard container setup that actually crashes the host and causes a denial of service for all the other containers running there, while with virtualization, so in a Kata Containers setup, it can only crash the virtual machine it is running inside.

Of course this all comes at a price in terms of overhead, mostly CPU and memory overhead, and we want that to be as small as possible, even at the price of reduced functionality compared to traditional virtualization. We don't need all the features; we want a small and fast kernel inside our little VMs, and we also want a small and fast virtual machine monitor. To do that, you configure the guest kernel so that it is tiny and only includes the features that you need, and as far as virtual machine monitors go, you can use QEMU, which is the most common choice in traditional virtualization, but you can also use others which have different properties. In openSUSE right now, as you will see later in our demo, we are using a smaller kernel than the one we provide by default, the kvmsmall flavor of our kernel, although this should be regarded as a temporary solution and we are investigating what the best long-term approach would be. As for the virtual machine monitor, right now with Kata Containers on openSUSE you get plain standard QEMU, but here too we are investigating better approaches, including Firecracker and QEMU with the microvm machine type, which is being actively developed these days. And I'm done.

Another feature of Kata Containers is that it actually implements the OCI runtime specification. The advantage is that you can just replace your current runtime, in most cases runc I guess, with Kata Containers, and all the tools that use runc will seamlessly continue to work, but now leveraging the enhanced security. So you can use Podman, or Kubernetes with CRI-O, or even Docker, and run Kata Containers with them with only very small changes in how you use them. It is also important to mention what Kata Containers is not: it is not a tool to run your normal virtual machine workloads inside Kubernetes or something like that, it is really about running containers.

This is a small overview of the internals of Kata Containers. What's outside the box here could also be Podman, for example. Podman calls conmon, which calls kata-runtime in the same manner as it would call runc. kata-runtime then creates the sandbox, the virtual machine, by calling QEMU and talking to it over its management socket, and that launches the virtual machine. The virtual machine image starts the kata-agent as its single process, which is then talked to via gRPC over a vsock or a virtio serial interface, depending a little on the configuration, to launch the containers inside the sandbox. Standard input and standard output also go over that vsock connection. I think that should be enough for how the architecture looks; this basically summarizes it.
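To make the combination of a small guest kernel and a standard QEMU a bit more concrete: which hypervisor binary, guest kernel and initrd kata-runtime boots is driven by its configuration file, in Kata 1.x typically /usr/share/defaults/kata-containers/configuration.toml, with local overrides possible under /etc/kata-containers/. The excerpt below is only an illustrative sketch of that format; the exact paths, file names and defaults shipped on openSUSE may differ.

```toml
# Illustrative excerpt of a Kata 1.x configuration.toml; values are examples,
# not the exact openSUSE defaults.
[hypervisor.qemu]
path = "/usr/bin/qemu-system-x86_64"                              # VMM used for the sandbox VM
kernel = "/usr/share/kata-containers/vmlinuz.container"           # small guest kernel (e.g. a kvmsmall build)
initrd = "/usr/share/kata-containers/kata-containers-initrd.img"  # minimal initrd containing the kata-agent
machine_type = "pc"       # a microvm-style machine type is one of the options being investigated
default_vcpus = 1
default_memory = 2048     # MiB given to each sandbox VM
use_vsock = true          # talk to the agent over vsock instead of a virtio serial port
```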
Some words about how storage is implemented in Kata Containers. The container root file system and the volumes that you attach to your container are currently shared with, or injected into, the container using the 9pfs file system. I don't know if you have ever used 9pfs, but it has some performance issues, it's pretty slow, and there is work underway to replace it with virtio-fs, which I think recently landed in the kernel, in 5.4 or 5.3, I'm not entirely sure. virtio-fs support is also going to be part of the next QEMU release, and that will give much better performance than 9pfs.

Some words about how you can use it. If you are on openSUSE Tumbleweed it's easy: it's available in the standard repositories, you can just zypper install the Kata Containers packages and they will be there. For other distributions we provide it via the devel:kubic project in the Open Build Service, where it is available for various Leap releases and also for SLES, if you want to play around with it. There are actually two packages: the Kata Containers package, which contains the runtime, and another package which contains the specific kernel and initrd for running the virtual machine.

Now for a short demo. The first thing I'm going to show is how to use it with Podman, and as I didn't sacrifice enough chickens to the demo gods, I'm just playing a video; I hope it's big enough for everyone to see. Just to prove it, this is a Tumbleweed system which has the Kata Containers packages installed, basically the latest packages. There is one small configuration change that needs to be made if you want to run Kata Containers rootless, with your normal user instead of root: in the local configuration you need to adapt the temp directory to your user's run path. Additionally, kata-runtime needs to be enabled as a supported runtime for Podman. There is a runtimes section in libpod.conf which lists all the available runtimes, and the standard one is runc; the kata-runtime entry needs to be added there so that Podman can find the runtime. If the entry is left empty it will look for the binary in the standard paths like /usr/bin, otherwise you can also specify a concrete path where it should look (a rough sketch of this setup follows after the demo).

For this demo I prepared and configured a small Nextcloud image, so I will just launch it now. You can see that once the container is launched, if you run a command in it, in this case uname, it reports the kvmsmall kernel, which is different from the host kernel, just as proof that it is actually running a separate kernel under virtualization. And here you see the application in action; I guess most of you have seen this before. Now let's take a look at how it looks from the process side, looking for the conmon process, seeing what it launches, and taking a look at its child processes to see the whole architecture in action. You can see that conmon is talking to the kata-shim, which is talking to the agent inside the virtual machine, collecting standard in and standard out and handling the signals for the container, and there you see the virtual machine running inside QEMU. That's basically it for the Podman demo. Not sure, do we have enough time for...
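To summarize the Podman setup from the demo, the configuration and commands involved might look roughly like the following sketch; the file location, the option names and the image are illustrative and may differ between Podman and Kata versions, and the rootless temp-directory tweak mentioned above is not shown.

```toml
# /etc/containers/libpod.conf (Podman 1.x): make kata-runtime a known runtime.
# With only the name given, Podman searches the standard paths such as /usr/bin;
# a full path to the binary can be listed here instead.
[runtimes]
runc = ["/usr/bin/runc"]
kata-runtime = ["/usr/bin/kata-runtime"]
```

```sh
# Run a container with the Kata runtime and check which kernel it sees.
# Depending on the Podman version, --runtime takes the name configured above or a full path.
podman run --runtime kata-runtime -d --name nextcloud docker.io/library/nextcloud
podman exec nextcloud uname -r   # reports the guest kernel (the kvmsmall flavor), not the host kernel
```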
OK, yeah, let's try the other one. I have also prepared something to show how to run it in Kubernetes. The Kubernetes cluster I'm using is based on Kubic, so it's a simple one with just a single master and three workers. All nodes need to have the packages installed, of course. There is a small configuration change that needs to be made to the CRI-O configuration, similar to what we did for libpod.conf: you need to enable the runtime. It looks slightly different here, but it's essentially the same thing, you add an additional runtime; in this case the runtime will be called kata-runtime and CRI-O will look for that binary. Then you need to restart CRI-O and also the kubelet service to pick up the changes, and you need to do all that on all nodes of the cluster, because of course you don't know where your workload is going to be scheduled.

Then you need to define a new RuntimeClass for your cluster. RuntimeClass is a feature that was introduced in a not-too-old version of Kubernetes and lets you define alternative runtimes, so that you can switch a workload to a different runtime: you can have one container or one pod using the standard runc runtime and another one using Kata, in parallel on the same machine, for example. To do that you need to have this RuntimeClass defined and uploaded to the cluster. I'll skip a bit here; now there is a RuntimeClass called kata.

The application I want to run is a sample application from the Kubernetes repository, a very simple thing which launches three Redis nodes and a simple PHP application that acts as a guestbook. In this case I run the Redis nodes using the standard runc runtime, and for the frontend application, the PHP thing, I decided to put it into the runtime class kata. You can see it here; that's all you need to change in the YAML file to have it run using Kata instead of the standard runtime (a rough sketch of these pieces follows at the end of this demo). So this application just needs to be loaded into the cluster, and then you wait until it's running. I'll skip over this. This is proving that the application is actually running, and here I execute the same command as earlier with Podman, inside the container, and you can see that it is also running on a different kernel than the host. I think that's basically it. Are there any questions?
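The Kubernetes pieces from this demo, sketched roughly with illustrative names and paths; the exact CRI-O configuration syntax and the RuntimeClass API version depend on the CRI-O and Kubernetes versions in use, and CRI-O and the kubelet need to be restarted on every node after the change (for example with systemctl restart crio kubelet).

```toml
# /etc/crio/crio.conf: register kata-runtime as an additional OCI runtime (on every node).
[crio.runtime.runtimes.kata-runtime]
runtime_path = "/usr/bin/kata-runtime"
```

```yaml
# RuntimeClass mapping the name "kata" to the handler registered in crio.conf.
apiVersion: node.k8s.io/v1beta1   # node.k8s.io/v1 on newer Kubernetes releases
kind: RuntimeClass
metadata:
  name: kata
handler: kata-runtime
---
# Excerpt of the frontend Deployment: the runtimeClassName line in the pod template is
# the only change needed to run these pods inside Kata sandboxes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: guestbook-frontend
  template:
    metadata:
      labels:
        app: guestbook-frontend
    spec:
      runtimeClassName: kata
      containers:
      - name: php-frontend
        image: gcr.io/google-samples/gb-frontend:v4   # illustrative guestbook image
```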
Oh yeah, what about startup time? Sorry? What about startup time? It is of course a little longer, but the project is trying to optimize for that, and that's why they reduced the initrd to the absolute minimum: the initrd basically just contains the kata-agent. In our case it takes about two and a half seconds to boot the virtual machine, but it can be made even faster. That's what we were saying about using a smaller, tailored kernel and a special-purpose virtual machine monitor: right now what you get is a smaller kernel but a pretty standard QEMU as the machine monitor. That's why we would want to use either a different machine monitor than QEMU or, as I said, QEMU configured in a special way, which would reduce the startup time a lot, for example, and also the memory footprint. This is something which is not there yet if you install the openSUSE packages, but it is getting there.

So, hi, I was about to ask the same question, and the follow-up was: have you tried some sort of VM reuse for a different container, or is Kata always just starting from scratch?

Currently it is always starting from scratch. I think there was once an idea to start the virtual machines from a snapshot, so taking a running machine, taking a snapshot, cloning that and starting it up again. I don't know if that is still being pursued or how it really worked out, but things like that can be done. There is currently nothing like pre-starting, keeping a couple of virtual machines running in the background or something like that. I'm not sure, but I think there are things like doing work in parallel, for example preparing the storage while the virtual machine is booting, to parallelize things. For example, it boots up a very small machine and later adds memory to it by hotplugging, and the network device is hotplugged while the kernel is already booting, things like that.

Okay, we're out of time. Thank you very much.