Hi, everyone. I'm Mrunal Patel. I'm a CRI-O maintainer and I work for Red Hat. I also maintain runc and the OCI runtime spec, and participate in Kubernetes SIG Node. Hello, everyone. My name is Urvashi Mohnani, and I'm a senior software engineer at Red Hat, as well as a CRI-O maintainer. Hey, folks. I'm Sascha, one of the maintainers of CRI-O, and it's a pleasure to be here today. Hey, everyone. My name is Peter Hunt. I'm a software engineer working at Red Hat, primarily on CRI-O and other container-related technologies. Today, we're going to talk about what is new in CRI-O since the last time we met at KubeCon EU. First, we'll go over the steps of setting up Kubernetes with CRI-O. Then we will talk about the improvements we have in place for CNI handling, as well as the different types of workloads. Finally, we will talk about the enhancements made to metrics. Let's look at some of the basics of running Kubernetes with CRI-O. The first thing we need to do is install CRI-O on the host where we plan to run our Kubernetes cluster. We have packaged versions of CRI-O available for multiple operating systems. As you can see here, we have Fedora, openSUSE, CentOS 8, Debian, Raspbian, and Ubuntu. Packaging for CRI-O is largely driven by requests, so if there is a version or operating system that is missing, please feel free to open an issue for us. You can also build your own binaries from source if you would prefer to do that. One example would be to get the latest and greatest CRI-O that hasn't been packaged yet but is available upstream. We also have support for building static binaries, and those binaries can be found in our Google Cloud Storage bucket. So once you have installed CRI-O on your system, either through the packaged versions or by building your own binary, the next step is to enable and start CRI-O on the host so that we can run Kubernetes with it. CRI-O versions work in lockstep with Kubernetes versions.
So what this means is that CRI-O 1.18 is guaranteed to work with Kubernetes 1.18, the same for 1.19, and so forth. So make sure the versions of Kubernetes and CRI-O you're running match. Once this is all done, we just need to point the runtime endpoint for Kubernetes to the CRI-O socket, and our cluster should come up successfully. I have a small demo showing this. Let's take a quick look. All right, I've already installed and started CRI-O on my system. We can check this by using the systemctl status command, and as we see here, CRI-O is up and running. Now, I have a simple script that I use to bring the cluster up with CRI-O. The main thing here is that I'm setting the container runtime endpoint to the CRI-O socket, and the cgroup driver to systemd. Everything else is just a few configurations that I would like, for example, the IP setup for networking, and getting a dashboard up. All right, so let's just call the script to get the cluster going. This will take a few seconds to start up. Oh, and we're using the local-up-cluster script that you get from the Kubernetes repo; we're just changing a few configurations and pointing the endpoint to CRI-O. All right, our cluster is up and running now. Let's grab the kubeconfig and take a look at the cluster. All right, so let's export the kubeconfig here. Let's take a look at the node, and as you can see, the node is up and running. Let's run a simple pod. I'm just going to call it mypod. So it's creating mypod. Let's look at the status of the pod. And as we can see, the pod is up and running, and my cluster is running with CRI-O successfully. We can also use the crictl tool to look at the containers created by CRI-O. So if I do sudo crictl ps, I'll see all the containers running. Earlier I ran the Fedora container, and as you can see, that's the first one in the list. And it's that simple to run Kubernetes with CRI-O. We can also use similar steps to run kubeadm as well as kind with CRI-O.
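The environment used in the demo can be sketched as a few environment variables set before invoking Kubernetes' local-up-cluster script. The variable names here follow the conventions of hack/local-up-cluster.sh, but verify them against your Kubernetes checkout before relying on them:

```shell
# Configuration for running Kubernetes' hack/local-up-cluster.sh against
# CRI-O, as in the demo: remote runtime, CRI-O socket, systemd cgroups.
export CGROUP_DRIVER=systemd
export CONTAINER_RUNTIME=remote
export CONTAINER_RUNTIME_ENDPOINT=unix:///var/run/crio/crio.sock

# Then, from the root of a Kubernetes source tree:
# ./hack/local-up-cluster.sh

echo "$CONTAINER_RUNTIME_ENDPOINT"  # prints unix:///var/run/crio/crio.sock
```

The same endpoint setting is what you would hand to kubeadm or kind when pointing them at CRI-O.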
It should be pretty straightforward, and you can take a look at our installation and setup guides on the GitHub repo. They have much more detailed instructions on how to get started with CRI-O and Kubernetes. Hello, everybody. Next up, I'm going to talk a little bit about CRI-O and CNI and some improvements we've made to their relationship, specifically some better handling around stateful, pod-based CNI plugins. For those of you that don't know, CNI is a common interface for container runtimes to create networking resources. It includes a set of commonly used plugins and defines an interface for new plugins to be created. CRI-O has supported CNI since its inception, and CNI is supported by many other container managers. But CRI-O really only loves CNI. Let's talk a little bit about that relationship. So it's age-old, you know, the lifetime of CRI-O, but even so, the relationship between Kubernetes, CRI-O, and CNI is a little bit hacky. CNI is a plugin-based architecture where there's a binary on the system that CRI-O execs to provision the resources. This is fine and dandy for stateless plugins that are known at installation time, like a simple bridge plugin between host and container that's installed through the distribution's package manager. But what about more complex, stateful network provisioning schemes, or more dynamic ones, where we don't really know what the plugin is at installation time, but we do know at runtime? Often, the way that more complex CNI plugins end up working is a pod-based strategy. Calico, OVN-Kubernetes, and OpenShift SDN all work this way. Basically, there's a privileged host-network CNI pod that's scheduled onto the node. This pod is responsible for setting up some state, and eventually a networking plugin binary is put in the correct location by the pod. This binary is a pass-through communication tool for the networking pod itself.
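For reference, a stateless bridge plugin of the kind mentioned above is configured with a JSON file dropped into /etc/cni/net.d. The snippet below is a sketch of such a configuration; the name, subnet, and cniVersion are illustrative values, not taken from the talk:

```json
{
  "cniVersion": "0.4.0",
  "name": "crio-bridge",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "routes": [{ "dst": "0.0.0.0/0" }],
    "ranges": [[{ "subnet": "10.85.0.0/16" }]]
  }
}
```

The stateful, pod-based plugins discussed next work differently: nothing like this exists at installation time, and the binary referenced by `type` only appears once the CNI pod has set it up.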
When CRI-O creates a non-host-network pod, it execs the binary, that request is passed through to the networking pod, the networking pod returns the network information, and then CRI-O can continue to create the pod. In addition to the architectural complexity there, there are a couple of pitfalls with this approach. If the networking resources are provisioned by a pod, what happens before that pod is created, and what happens when the node is rebooted and the pod is restarted? We're going to go into a couple of improvements that CRI-O has made in these two areas. To start off, we'll take the former: what happens before that pod is created? In the old situation, CRI-O would attempt to create a non-host-network pod as soon as the request came in from the Kubelet. On node startup, there's a stampede of pod creation requests, totally unordered. The Kubelet is not very good at ordering pod creation: it creates all the static pods, and then it creates all the pods that it gets from the API server. The CNI pod is a host-network pod that is kept in the API server, rather than being a static pod, along with all the private-network pods, so we used to have this situation: the Kubelet asks CRI-O to create a private-network pod, and at the same time it asks to create the host-network pod. CRI-O fails the first request because the host-network pod, that is, the CNI pod, is not yet up, and so the networking plugin is not set up. CRI-O then processes the host-network pod request, the CNI pod comes up and returns from its creation, the Kubelet re-requests the private-network pod, and CRI-O, now with the CNI pod up, is able to complete the request. This is okay; the pod does eventually come up. We're emphasizing the eventual part of eventual consistency, but it is wasteful. We spend a whole cycle not creating the pod.
It would be better, and it is better now that we've improved it, to wait until the CNI pod is up. The Kubelet asks to create a private-network pod. CRI-O notices that the CNI pod is not up yet: it attempts to exec the binary, and that fails for whatever reason. CRI-O stalls the Kubelet, saying, hold on, wait, I'm not ready yet. The Kubelet goes forward requesting all the pods as it does, and eventually it'll get to all the host-network pods. CRI-O is able to provision host-network pods because they don't rely on the CNI pod, and eventually one of those host-network pods will be the CNI pod, and then everything goes as expected. The CNI pod is created and returned to the Kubelet. At this point, CRI-O realizes it can unblock all of the other pod creation requests, as the CNI pod is now up. So it asks for a network for all the other pods, gets one, and returns it, and everything's hunky-dory. The next situation, if you remember: what happens when the node is rebooted and the pod is restarted? On node boot, CRI-O first checks the disk for any containers that it used to have, and then it attempts to recreate them. Unfortunately, after a reboot this will fail, because all of the containers have some information in a tmpfs, and that tmpfs is no longer there because of the reboot. So CRI-O cleans up the information of the containers, which ends up being the container storage, but that isn't everything the containers provisioned. In this case, if a CNI pod provisioned some resources that weren't cleaned up after the reboot, CRI-O would not attempt to call CNI DEL, the command to delete all of the CNI state, and so that state would leak. This is obviously unfortunate; we don't like leaks. So we've since improved CRI-O to attempt to clean up the networking resources after node reboots. So we see this reboot here.
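The stall-and-poll behavior described above can be sketched as a loop that simply waits for the plugin binary to appear. This is a hypothetical illustration, not CRI-O's actual implementation (which is in Go); the function name and the plugin path in the usage line are made up:

```shell
# Sketch: poll until a pod-installed CNI plugin binary shows up on disk,
# giving up after a number of attempts. While it is absent, pod creation
# is stalled rather than failed outright.
wait_for_plugin() {
    plugin_path="$1"
    attempts="$2"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ -x "$plugin_path" ]; then
            echo "plugin ready: $plugin_path"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "plugin $plugin_path still not ready; stalling pod creation" >&2
    return 1
}
```

Usage would look like `wait_for_plugin /opt/cni/bin/calico 30`: host-network pods never hit this wait, which is what lets the CNI pod itself come up and break the cycle.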
CRI-O attempts to clean up the stale pods as it did before, and this time it attempts to ask the CNI plugin to clean up the resources. But the CNI plugin isn't up yet, so that actually fails. So CRI-O will keep retrying, continually polling to see when the CNI pod is up. Eventually, the Kubelet will start up and ask to create all the pods. CRI-O will prioritize the host-network pods, and one of those host-network pods will, hopefully, be the CNI pod. CRI-O will ask for the CNI pod to be created, and it will be. At that point, CRI-O realizes that the stale CNI resources can be cleaned up, and it does so. So we've now shored up that leak, and there's no opportunity for the CNI resources, the IPAM entries, or the veth entries to stick around; they're all going to be cleaned up when they need to be. All in all, CRI-O, by focusing its networking stack on CNI and Kubernetes, is able to tailor its behavior not only to the Kubelet, but also to the idiomatic way that stateful network provisioning strategies work, where it's often pod-based and often behaves in these very particular ways, and CRI-O is able to account for those. This shows that CRI-O, along with CNI and Kubernetes, is a very good choice for the CRI implementation in your Kubernetes cluster. Thank you. Next, I'm going to talk about a new concept that we introduced in CRI-O called workload types. We came across a problem that motivated us to introduce this concept. The problem was: is there a way to run all of my cluster components, that is, the control-plane pods and per-node pods, only on specific reserved CPU cores, so there's a clear separation between the cluster components and the end-user workloads? This feature allows you to add annotations to your pods to specify their workload type, and CRI-O then matches on those annotations to modify those pods. These modifications are declaratively configurable in crio.conf.
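A workload-type drop-in of the kind described here might look like the following. This is a hedged reconstruction, not the exact slide: the workload name and annotation values are illustrative, and the table and field names (activation_annotation, annotation_prefix, resources.cpuset) should be checked against the crio.conf documentation for your CRI-O version:

```toml
# /etc/crio/crio.conf.d/01-workloads.conf (illustrative path)
[crio.runtime.workloads.management]
# Pods carrying this annotation are matched to this workload type.
activation_annotation = "io.crio.workload/management"
# Annotations under this prefix can override the defaults below per pod.
annotation_prefix = "io.crio.workload-config"

[crio.runtime.workloads.management.resources]
# Default modification: pin matching pods to CPUs 0 and 1.
cpuset = "0-1"
```

With this in place, any pod annotated with the activation annotation gets its CPU set restricted, while unannotated pods are untouched.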
Another use case for this feature is user namespace support. CRI-O has experimental user namespace support that uses runtime classes today, but it fits more cleanly into the workload types concept, and we are working on refactoring it over. Let us work through an example of configuring workload types. This slide has a snippet that one can drop into a file under /etc/crio/crio.conf.d or add directly to crio.conf to configure a workload type. The activation annotation is the annotation that needs to be added to a pod to activate this workload type. The resources section has the resources that are modified for the pods that match this workload type. In this example, the pods get their CPU set modified to run only on CPUs 0 and 1. The annotation prefix can be used to override the default configured resource settings from the pod. Let us move on to a demo of this feature. I am going to demo the workload types feature here. I have a drop-in here under /etc/crio/crio.conf.d that uses the same example from the slides. We have the activation annotation, the annotation prefix, and the CPU set that we are trying to pin to. I have a local-up cluster with Kubernetes and CRI-O 1.22 running here. We will run three pods: one without any of the annotations, and the other two demonstrating workload types. I launched a simple httpd pod. We will wait for it to run. It is running now. We will use crictl ps to get the ID of the container. Then we will examine the CPU set of that container. You can see that nothing is set here, which means that this pod can run on any of the CPUs on this machine. Next, we are going to run this pod, which has the activation annotation. It is running as well. We see here that it gets pinned to 0 and 1, as we declared in the configuration file. The final example that we have here is a pod that overrides the CPU set. It uses the annotation prefix along with the container name.
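A pod carrying the activation annotation might look like the following sketch. The annotation key must match whatever activation annotation your workload drop-in configures, so treat these keys and values as placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod
  annotations:
    # Must match the activation_annotation in the CRI-O workload config.
    io.crio.workload/management: "true"
spec:
  containers:
  - name: app
    image: registry.fedoraproject.org/fedora:latest
    command: ["sleep", "3600"]
```

For the per-pod override, the demo adds a second annotation formed from the configured annotation prefix plus the container name, whose value carries the replacement setting (a CPU set of 2-3 in the demo); the exact value format depends on the CRI-O version, so check its documentation rather than copying this sketch verbatim.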
We say here that we want to change the CPU set to 2 and 3 instead of the 0 and 1 from the default CRI-O configuration. It is running, and we successfully overrode the setting: we can see that this particular pod is running on 2 and 3. This shows the workload types feature in practice, and we will try to support more fields in the resources section. Please let us know what more you would like to see added there. Next, let's speak about the metrics enhancements we added in CRI-O 1.22. The first thing I would like to point out is that we added a new configuration option called metrics collectors, which allows us to enable or disable certain metrics at runtime. By default, all our metrics are enabled, but this option gives administrators more fine-grained control over which metrics they actually want to collect and which they don't. We also secured the metrics endpoint via TLS, and this includes certificate creation as well as validation of a certificate. This means if a certificate is not available on disk, CRI-O is able to create it automatically, and later on we can pick it up and exchange it. And the third point is that we are also able to rotate those TLS certificates. So if anything changes on disk, like the key or the certificate itself, then CRI-O will pick up those changes and serve the new certificates. In case of any error, CRI-O will just report that to the user and will not do anything at all. Let me demonstrate that to you. To be able to run this demo, we have to ensure that we are running CRI-O 1.22, and we can verify that by running sudo crio --version. And here we can see that we are on 1.22.0, and that should allow us to use the new configuration option. Since CRI-O 1.22 we are now able to enable or disable certain metrics. This means if we run crio and look at the CLI arguments, then we can see the new option --metrics-collectors, which defaults to all metrics being enabled. So now let's just limit the metrics to only the image pull successes for CRI-O.
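The metrics options discussed here can also be set persistently in a crio.conf drop-in rather than on the command line. The following is a sketch based on the options named in the talk; the file path and certificate paths are placeholders, and the exact key names should be confirmed against your CRI-O version's crio.conf reference:

```toml
# /etc/crio/crio.conf.d/02-metrics.conf (illustrative path)
[crio.metrics]
enable_metrics = true
# Collect only image pull successes; all other collectors are skipped.
metrics_collectors = ["image_pulls_successes"]
# Serve the endpoint via TLS; if these files do not exist, CRI-O
# generates a key pair itself and watches the files for changes.
metrics_cert = "/etc/crio/tls/crio-metrics.crt"
metrics_key = "/etc/crio/tls/crio-metrics.key"
```

Leaving metrics_collectors unset corresponds to the default of all metrics being enabled.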
And if we run crio with metrics enabled and only this single metrics collector selected, then we can see in the CRI-O logs that we are skipping a bunch of metrics, that we are serving metrics, and that the one metric we want is enabled. If we now pull a container image, for example the Fedora minimal image, then this will take a certain amount of time. But in the end, we can check our metrics, grep for the CRI-O image pulls, and see that we successfully downloaded the image and that the image counter is correctly at one. Now let's clean up CRI-O again. CRI-O's metrics endpoint can also be secured by using TLS. If we have a valid certificate and key on disk, then we can specify them via --metrics-cert and --metrics-key. And if we do that, then we can see that CRI-O is now serving the metrics via HTTPS. To look at our metrics, we can just use curl, and we have to skip certificate verification because it's just a self-signed one. And then we can see that we get all the metrics. If the certificate doesn't exist at all, then CRI-O will automatically generate a default key pair for us. So if we, for example, specify a metrics certificate and key which don't exist on the local disk, then we can see that CRI-O creates the certificates for us and that it also watches those files for changes. This means if we now modify such a file, for example by putting an invalid certificate into it, then we can see that CRI-O picks those changes up, tries to validate the certificate, and this fails in our case. So it will not do anything at all. There are also a couple of new metrics which have been added to CRI-O. Every metric contains the prefix container_runtime_crio, which allows us to easily identify the metrics served by the container runtime.
The first metric we added is containers_oom_total, which gives us the total number of containers killed because they ran out of memory. The second metric is containers_oom_count_total, which counts the out-of-memory-killed containers by their name label. We also have a metric for defunct processes, so now we can identify zombie processes on a node. And besides that, we also have the image_pulls_layer_size histogram metric, which gives us the bytes transferred by CRI-O image pulls per layer as a histogram. I would like to demonstrate that to you. The first thing we have to do is start CRI-O with metrics enabled. We can do that by running crio --enable-metrics, and here in the logs we can see that we have all our metrics in place. Now I'd like to show you how CRI-O can count containers which get killed by an out-of-memory event. For that, we run a special container and set the Linux resources memory limit in bytes to 25 megabytes. We prepared a special container image for that, and if we run the container in the sandbox we prepared locally, then we can see that CRI-O runs the container successfully, and once that is done, the container gets killed immediately after startup. So if we run sudo crictl ps and list our containers, then we see that this out-of-memory container has already exited. The metrics now indicate that the container has been killed because of an out-of-memory event. And if we grep our local metrics and look for the containers OOM metric, then we can see that we now have a containers out-of-memory total of one. CRI-O is also able to count the number of zombie processes on the node, and to demonstrate that we actually have to spawn a bunch of zombies, but for that we have to create a little helper tool. How do we create a zombie process? We can just run something like exec.Command("ls") and then start the command.
And after the start, if you don't call cmd.Wait(), then the process will remain in the zombie state. And this is what we do in this program. And if we run the little tool, then we can see that the CRI-O metrics now indicate that we have a defunct process. CRI-O determines the number of defunct processes on the node directly when we request the metrics. This means that we also get a warning that we now have a defunct process, ls, on the local system, and I prepared it so that we have two of them, which we can distinguish by PID. So we now have two defunct processes on the node, and CRI-O is able to collect them. CRI-O also features the new histogram metric for the transferred bytes during image pulls. For example, if you pull an image which contains many different layer sizes, then this usually takes a certain amount of time. But CRI-O is able to log the progress percentage for pulled container images, and we can see that we are almost done here. Then CRI-O applies the images to the local system. And once this is done, we can investigate the actually pulled bytes via the summary count and the buckets. If we grep for the image_pulls_layer_size metric, then we see that we have, on the one hand, the sum of all the bytes which have been transferred; we have the number of layers, which is 20 here; and we have the distinct buckets for the histogram. So we have the different layer sizes which are lower than a certain number of bytes, and the histogram shows how the transferred bytes were distributed overall. So that is what's new in CRI-O. We continue to make improvements and will have more updates for you the next time we meet. Check out our installation and setup guides on our GitHub and get started today. Issues and PRs are always welcome from anyone and everyone. You can also find us in the #crio channel on the Kubernetes Slack. And now we can take any questions you have. Thank you.