Good morning, everyone. Thank you so much for joining this session. I hope you're enjoying KubeCon. My name is Sohan Kunkerkar. I work at Red Hat as a senior software engineer, working on CRI-O and SIG Node specific projects. Today, along with me, we have...

I'm Julien Ropé. I'm working for Red Hat as a software developer. I'm also a contributor to CRI-O, and I am involved in the Kata Containers and Confidential Containers projects, too. Today we want to talk about what has happened in CRI-O in the past months following our graduation within the CNCF.

First, we will talk about how the project behaves at the community level. This slide shows the number of stars we have on the GitHub project. As you can see, it is a steady incline over the years, showing growing interest in the project, and it follows all the work we do to keep attracting more contributors and contributions. As part of that, the work we are doing to get more contributors includes mentorship programs, like the LFX mentorship, and we have had successful contributions following that. So we see a steady number of commits and contributors at the community level, and we can say the project is in good shape and seeing traction.

Next, a glimpse of what we are going to ship in the next release, 1.30. This is just a small subset of the things going into that release. First, we are going to have an easier way to deploy seccomp profiles using OCI artifacts. We're going to release support for the s390x architecture, and the enablement of the split image filesystem, which is something Sohan is going to talk about in more detail later on. We also added the ability to give time zone information to pods and containers, so they can work with the local time. More internally, there is instrumentation of calls with the NRI plugin; it's a way to give visibility into internal processes within the CRI-O components.
So that's it for the overview. Now for the more technical topics. We want to talk about what we did to integrate Confidential Containers support into CRI-O. Confidential Containers is about running your workloads in a trusted environment. We do that by using a virtual machine, running on a hypervisor with the ability to encrypt the memory while it is in use. We already encrypt disks, and we can already encrypt network connections; with Confidential Containers, we can also encrypt the memory while it is in use. So even an administrator of the host cannot dump the memory and see what's happening in there. That's the whole purpose, and that's why I wrote it there: we don't even trust the host. We have to build that trust.

On the next slide, we can see how we build that trust. This drawing shows, at a high level, how it works. On the left-hand side, you have your regular worker node with the kubelet and CRI-O. At the bottom, in place of the runc or crun runtime, we have the Kata runtime, which is responsible for creating a VM in which your container is going to run. At the center, you see the trusted VM. It runs on a hypervisor that has hardware support for attestation, typically TDX for Intel or SEV-SNP for AMD. When the VM starts, an attestation agent runs and verifies that everything from the host CPU up to the VM image itself is genuine and has not been altered. This is done through attestation. Separate from it, you have the relying party: the Key Broker Service, which helps you validate the certificate the CPU is giving you. So the first thing the attestation agent does is verify all of that stack, and if anything is wrong, it just shuts down the VM. At that point your sensitive workload has not even been downloaded yet, so it is safe.
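As a rough sketch of how a pod is steered to the Kata runtime in that drawing, a cluster typically defines a RuntimeClass and the pod selects it. The class and handler names below are illustrative assumptions, not names taken from the talk:

```yaml
# Sketch: selecting a Kata-based confidential runtime via RuntimeClass.
# The name "kata-coco" is a hypothetical example.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-coco
handler: kata-coco            # must match a runtime configured in CRI-O
---
apiVersion: v1
kind: Pod
metadata:
  name: confidential-workload
spec:
  runtimeClassName: kata-coco # CRI-O hands this pod to the Kata runtime
  containers:
  - name: app
    image: registry.example.com/sensitive-app:latest
```

Pods without the runtimeClassName keep going through the regular runc or crun path.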
If something goes wrong, the workload is never touched. If everything is validated, then image management kicks in: the agent talks to the registry, pulls the image, validates it again with certificates and attestation, then decrypts it and runs it inside the VM. That is how Confidential Containers works.

The way you see it here, you could say, well, the VM is autonomous, it doesn't need anything from the cluster, so why do we need to change CRI-O at all? The answer is that in order to download the image, the agent has to know which image to download, and that is information only the cluster knows. So we need a way for the cluster to give that information to the agent somehow. That brings us to the next slide.

How do we do that? Our first attempt was to relay the pull image request from Kubernetes directly to the agent inside the VM. We tried that in a proof-of-concept implementation for both containerd and CRI-O, but we found it was very invasive. The reason is that, even at the CRI level, there is a clear separation of workflows between image management and runtime management, and that has an effect on the implementation: in CRI-O you have clearly separated code paths, image handling on one side and runtime handling on the other. When we tried to relay the pull image request, we had to build bridges between parts of the code that were not designed to talk to each other, and we were modifying the workflow even for regular containers, which was a fairly big change for a feature like this. So we got some pushback and changed our approach. Now we are doing something different: we modify the create container request by adding some additional options to its mount options, and we do it while the request is being processed. The idea is to add a small set of information; as you can see here, it is very simple.
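A sketch of what that small set of information might look like inside the create container request; the key names and paths here are illustrative assumptions, not the exact strings used by the implementation:

```yaml
# Sketch of the extra mount entry CRI-O injects into the create container
# request for Kata VM-based runtimes (field names are hypothetical):
mounts:
- container_path: /run/image-guest-pull      # hypothetical target path
  options:
  - "image_guest_pull=true"                  # first line: the agent must pull the image itself
  - "image_ref=registry.example.com/app:v1"  # second line: which image to pull
  # further optional entries (credentials, policy, ...) may be used later
```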
The first line just tells the agent that it actually has to pull the image itself. The second one gives it the reference of the image to download. The rest is optional information we may use later, but for this feature, the first two lines are the only ones we need.

This has been implemented in containerd using the nydus snapshotter. Snapshotter plugins are a way for containerd to prepare a snapshot for the container before pulling the image. In the confidential containers situation, the nydus snapshotter pretends there is a snapshot, so containerd doesn't pull the image; instead of preparing the snapshot, it just adds that information. From containerd's point of view, there is no change inside the code; it is just the plugin that modifies the behavior.

Now, how do we do that in CRI-O? The fact is, CRI-O doesn't have snapshotter plugins, so we can't just take the snapshotter plugin from containerd and drop it into CRI-O. What we did instead is modify the create container request as we are processing it. This is a much smaller change compared to relaying the pull image request, because we have a clear separation of code paths between the regular runtimes and the Kata Containers runtimes in CRI-O. So the change we made just modifies the behavior for Kata VM-based runtimes. That change is already done; it was released with CRI-O 1.28.

The other thing we need to do, which is a work in progress, is to tell CRI-O not to download the image on the host. CRI-O still receives the pull image request from Kubernetes, so it wants to download the image, but that is a waste of resources because we are not going to use that image anyway.
It can also cause a real issue: with confidential containers we may have encrypted container images, and we don't want to share the decryption key with the host, because we don't trust it. In that case CRI-O will pull the image, will not be able to prepare it, and will fail. And if the pull image request fails, Kubernetes will not go further and we will never create the container. So we have to teach CRI-O not to pull the image. This is still in progress.

What's next on the Confidential Containers side for CRI-O? We have an alternative way of dealing with images. As I explained before, each VM has to download its own copy of the image, which means that if you run the same container multiple times, you download the image once for each of them. That is, again, a lot of resources being used. So we want to provide a way to scale that, and it is always a trade-off with security: here we trust the host just a little bit, by letting the host download the image and share it via a volume with the various VMs. We will have to put some encryption in place, and we have to make sure the image can still be validated as genuine. This is being worked on at the Confidential Containers level, and when it is ready, we will integrate it into CRI-O so both options are available.

And a word about KEPs. The longer-term idea is to write a Kubernetes Enhancement Proposal to integrate this workflow a little better into Kubernetes, because today we are more or less hacking into the create container request, and we would like to make it cleaner. I also have this other KEP noted here; it has nothing to do with confidential containers directly. The reason I am mentioning it is that it introduces a link between the image spec, the image identification, and the runtime that is going to use it.
That modification actually helps close the gap I was talking about before, because now there is a link between the image and the runtime even at the Kubernetes level. So if this KEP goes forward, it will help us on the Confidential Containers side to do what we have been trying to do, and it's probably something we want to build on in the future when we write our own KEP to be better integrated.

Okay, speaking of KEPs: this is a SIG Node KEP we are working on about CRI stats and metrics. We started this some time ago, and it is about modifying the CRI implementation so that the pod and container metrics reported by Kubernetes all come from the CRI rather than from cAdvisor. As it is today, the metrics the kubelet reports come from both sources, some from cAdvisor and some from the CRI. First, that is a duplication of effort, and it is also confusing as to where the metrics are coming from. It also requires cAdvisor to be aware of the runtime, so there is a dependency of cAdvisor on the runtime being used, which is counterintuitive: cAdvisor is a higher-level component, while containerd and CRI-O sit right above the runtime. They are the ones running it, so it makes much more sense for them to report the metrics. Hence this proposal.

This is how it is today: at the top you have the cAdvisor, resource, and summary APIs, which give metrics and stats for pods and containers, some of them coming from cAdvisor and some from the CRI. At the bottom, you have cAdvisor being used by the kubelet for node-level stats and the eviction manager. What we want to do is keep the same APIs, because a lot of people rely on them and we don't want to change them. Their content will be exactly the same, but their source will be different: everything will come from the CRI.
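As a side note, while this work is gated, opting in on a node means flipping the corresponding kubelet feature gate. The gate name comes from the KEP; the rest of the fragment is an illustrative kubelet configuration sketch:

```yaml
# KubeletConfiguration fragment (sketch): source pod and container stats
# from the CRI instead of cAdvisor while the feature is gated.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodAndContainerStatsFromCRI: true
```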
And at the bottom is what we are not changing: everything for node-level stats and the eviction manager still relies on cAdvisor. We just want the pod and container metrics to be provided by the CRI.

A status update on that: we want to have this implemented in the next release, 1.30. We still need support on the containerd side, and when we have that, we can move the KEP to beta. That basically means the feature is there and available; it is feature-gated, but people can start using it, and hopefully they will see no change, because the goal is that the same metrics are available in the same places. That's it on my side. Now Sohan is going to continue with other SIG Node initiatives and an idea of what is coming next in the future of CRI-O.

Thanks, Julien, for setting the stage. Before we delve into the specifics of the split image filesystem update, I want to talk about a common pain point in the world of Kubernetes: the constant battle against running out of disk space. When provisioning a node, we intend to give an adequate amount of storage for containers and for storing container images. Traditionally, the container runtime stores everything in one directory, either on a separate filesystem or within the root filesystem. And this is where the problem comes in: you might run into a scarcity of disk space. In this context, the split image filesystem in Kubernetes 1.30 is a game changer. It is currently in alpha, and it uses an approach that separates the read-only image layers from the writable layers. It basically isolates the container data and, at the same time, addresses the perennial issue of disk capacity. However, there is currently one limitation. When you separate the container runtime filesystem, there is still a writable layer on that filesystem; but if a container writes to a volume or to ephemeral storage, that write goes to the node filesystem.
So in that case, you have two writable layers on two filesystems, and this KEP doesn't address separating the writable layers from the node filesystem; it is predominantly meant for the image filesystem.

Now let's discuss the container runtime filesystem. It has two kinds of layers: read-only and writable. The read-only part contains the image layers, from which you can spin up containers without altering the images. The writable layers are where container data gets written to disk. Combined, they form the image filesystem. If you want to use this feature, you configure /etc/containers/storage.conf, as you can see in the configuration on the slide. For the temporary location, you use runroot, and for persisting data, you go with graphroot. There is one caveat, though: for graphroot, you may find the data doesn't persist, and you might need to change the SELinux label to match what is on /var/lib/containers/storage. Why is this important? As I described on the previous slide, it not only solves the disk space problem, it also provides flexibility in how you configure your Kubernetes cluster. Looking ahead, we are trying to improve the eviction policies, and at the same time we want clearer filesystem storage reporting. Since this is going beta in 1.31, we are also planning to add some runtime configuration options. There is a blog post written by one of our Red Hatters if you want to learn more about the technical aspects.

Now, let's shift gears and talk about some new frontiers in the container runtime. We are not just talking about incremental improvements; we are talking about unlocking new potential. Let's start with Wasm. Wasm, or WebAssembly, started as a web browser technology, and it's a powerful technology that has gone beyond web browsers to things like client and server applications.
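To make the storage.conf discussion from a moment ago concrete, a fragment along these lines splits the image store onto its own filesystem. The paths are common defaults and the imagestore option name is an assumption to be checked against the containers/storage documentation:

```toml
# /etc/containers/storage.conf fragment (sketch).
[storage]
driver = "overlay"
runroot = "/run/containers/storage"       # temporary, non-persistent state
graphroot = "/var/lib/containers/storage" # persistent writable layers
# Read-only image layers on a separately provisioned filesystem (assumed name):
imagestore = "/mnt/containers/images"
```

Remember the SELinux caveat mentioned above: the separate filesystem may need relabeling to match /var/lib/containers/storage.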
WebAssembly in a nutshell is a fast, secure, stack-based virtual machine designed to run binary code without knowing the underlying host and OS resources. Here, I want to look at the benefits of Wasm with CRI-O in Kubernetes, and give you some real-world use cases we might want to target in the future. Since Kubernetes is a popular platform for addressing edge computing use cases, with Wasm integration we are hoping for more agility in enabling multiple architectures. At the same time, Wasm binaries are compact, so we are thinking about lighter deployments. Imagine a lightweight microservice deployed in CRI-O containers, within the Kubernetes world, on different kinds of devices.

The second point is dynamic scaling. Kubernetes provides dynamic scaling with the help of the VPA and HPA. Wasm binaries are compact, which uses less disk space, and they have rapid startup times, which can be used to optimize resource efficiency.

Since we talked about microservices, security is one of the important aspects, so let's talk about security-enhanced microservices. Kubernetes, as I mentioned, provides security for microservices together with CRI-O's container security. Wasm is like the cherry on top: it not only provides a sandboxed execution environment, but also gives you module signing for code, to maintain integrity and authenticity. At the same time, it allows users to tweak the runtime configuration without knowing the underlying OS.

Last but not least, polyglot microservices architecture support. Kubernetes clusters run hundreds of microservices. With Wasm, we see the potential of adding polyglot programming language support, which not only lets developers build microservices in different programming languages, but also enables the developer toolkits for them.
Now we know the benefits, so our journey started toward integrating Wasm into CRI-O. Every journey has some hurdles, so let's talk about the roadblocks. CRI-O as a high-level runtime supports two OCI runtimes, crun and runc. runc, being the popular lower-level runtime in Kubernetes, doesn't support Wasm workloads; it has no native integration with a Wasm runtime. That is where crun comes into the picture. CRI-O does support crun, and crun has a way to run Wasm workloads with the help of an image annotation, as you can see on the left-hand side of the picture. However, there is a caveat: passing an image annotation down to the OCI runtime is not ideal. When you pass an OCI image annotation down to the runtime, it can open security loopholes, where the image might try to configure the runtime in a way that grants it extra privileges. To address this problem, we treat Wasm images as Wasm directly.

Before that, I want to talk about the right-hand side: the problems we faced while integrating this with CRI-O. In CRI-O, when the pod gets created, we try to assign the runtime class. For a normal workload this works fine, but for Wasm, if the platform of the image is unknown, that causes problems. That is where we came up with the implementation where we treat wasi/wasm images as Wasm by default, and we also introduced a field called platform_runtime_paths under the CRI-O runtime config, where you can map a platform architecture to the corresponding runtime. Just a note here: for crun with Wasm, under the hood we are using WasmEdge as the Wasm runtime.

With this, I want to show you a demo. This is still a work in progress; there are a couple of patches that are not yet upstream, but I want to give you an overview of what we can achieve with the Wasm integration.
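The platform_runtime_paths idea just described might be configured roughly like this; the binary paths and the exact platform key are illustrative assumptions, not copied from the talk:

```toml
# crio.conf fragment (sketch): routing Wasm images to a Wasm-capable runtime.
[crio.runtime.runtimes.crun-wasm]
runtime_path = "/usr/bin/crun"
# Map image platforms to runtime binaries; wasi/wasm images go to a crun
# build with WasmEdge support (paths are hypothetical):
platform_runtime_paths = {"wasi/wasm" = "/usr/bin/crun-wasm"}
```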
What we are doing here is running AI models within a Kubernetes pod using the Wasm integration. On the right-hand side, I'm going to spin up the CRI-O instance, but before that, I want to show you the CRI-O config. Let me just pause here. You can see in the CRI-O config that I'm using the WasmEdge plugin mount that is required to run AI models. And, as I said, under platform_runtime_paths we have wasi/wasm mapped to the given runtime. With that configuration, I'm going to spin up the CRI-O instance, and I'll be using that runtime to spin up the cluster. I'm fast-forwarding a bit.

Here I want to show you the pod spec, but before that, the image and the Containerfile I'm using. Here's the Containerfile; we usually know it as a Dockerfile. In this case, I'm setting the environment variables required to run AI models: the first one is the plugin path required for the llama plugin, and the second one points to the model that is required. I'm going to run a llama chat application. Now I'll inspect the image I created from that Containerfile, and show the platform and architecture for that image: it's wasi/wasm. Now let's see the pod spec. Here I'm simply using the image, and I'm mounting a volume that will hold the respective AI model. I'll go ahead and create the pod, and the pod got created. Let me check the logs. As you can see, the model started running. I'll attach to the container and start conversing with the pod. The first question I'm going to ask is: what is the capital of North Carolina? Peter, you want to guess? All right, I think we can fast-forward. Under the hood we are using a llama model.
And since it's experimental, you may not get the proper answer, but yeah. All right.

With this, let's transition to our next topic, which is Podman in Kubernetes. What makes Podman so special in Kubernetes is its way of running containers. When we talk about Podman integration, we want to talk about a concept Podman employs: running rootless containers inside containers. To achieve this, we are taking the help of a couple of upstream KEPs that we're talking about here today. With rootless containers, we're not only addressing the security aspect, but also mitigating some issues we have been facing for a long time.

The first KEP I want to talk about here is the ProcMount option. Before that, what is /proc? It gives you system resource information, and the kubelet basically instructs the container runtime to mask certain paths under /proc, so that information about the host is not visible to containers. However, for running unprivileged container engines inside containers, we need to ensure that certain paths under /proc are visible. That is where this option comes into play: the moment you set procMount to Unmasked under the security context, you'll be able to run these unprivileged containers.

The second missing piece of the puzzle is user namespace pods; here's the GitHub issue for that. What a user namespace does is isolate the security identifiers to enhance security. You can run processes inside the pod with a different set of UIDs and GIDs compared to the ones on the host. This helps us run processes that may be privileged inside the pod but are unprivileged on the host.
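The two pieces just described can be sketched together in a single pod spec. The image name is a hypothetical choice, the SELinux type is the experimental label discussed next, and both fields sit behind feature gates:

```yaml
# Sketch: an unprivileged pod meant to run rootless containers inside it.
apiVersion: v1
kind: Pod
metadata:
  name: podman-in-kubernetes
spec:
  hostUsers: false                 # user-namespaced pod (feature-gated)
  containers:
  - name: podman
    image: quay.io/podman/stable   # hypothetical image choice
    securityContext:
      runAsUser: 1000
      procMount: Unmasked          # ProcMount KEP: keep /proc visible for nested engines
      seLinuxOptions:
        type: container_engine_t   # new container-selinux label (experimental)
```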
And this definitely mitigates the security risk, because even if the container breaks out, it is still confined within that user namespace and cannot tamper with host-specific information.

Now I want to talk about the examples. As you can see in the first two examples on the left, they are about running a privileged pod with rootful and rootless containers. In enterprise Kubernetes environments, those use cases are very important, but getting privileges is very difficult. That is where we thought of using an unprivileged pod with rootless containers. As you can see in the pod spec, we are using an annotation, which is supported in recent CRI-O releases, and you need to add SELinux options with the type container_engine_t. This is a new label we have added to container-selinux, and you can use it in place of a privileged SCC. Here SCC stands for SecurityContextConstraint, which administrators use for setting the permissions for pods. The future work here is to add a custom SCC for running with the container_engine_t label.

You can also use a user namespace pod. As you can see on the left side, in the pod spec you need the annotation, then you need to set runAsUser, and in order to use user namespaces, you need to set hostUsers: false. On the right-hand side is the CRI-O config that would be used to run an unprivileged pod with rootless containers. This setup is still in flux and we might see some changes, but if you want to run it as-is, you need a container-selinux package with the container_engine_t label, and you should also have a kernel with id-mapped mount support. Since this is an experimental feature, it is supported in Kubernetes and CRI-O releases starting with 1.29.

As for the future, we are excited about CRI-O's roadmap. There are a couple of initiatives we are taking; you can scan that QR code and see the future roadmap for CRI-O. One of the highlights I'm going to talk about is automatic reloading of the registry mirror configuration.
There's a project called Spegel that wants to add CRI-O support, and from the CRI-O side, we have already started working toward this particular feature. We also want to implement address space and our framework. We talked about Wasm; Wasm is a big thing now, and we want to add Wasm plugins directly loaded into CRI-O. And last but not least, we want to add support for FreeBSD. With this, for those who are new to CRI-O: try it out. We have GitHub, we have Slack; you can scan the QR codes. We want to hear from you, how you like it, and any feedback you have; we are here to address it. And with this, that ends the presentation. Thank you so much for your attention. We are open for questions.

Hello. I heard your talk about the confidential computing part, and I have a few open questions about that. The first one: where exactly are the VMs running?

So the VM can run basically anywhere. The original reason for confidential containers was to be able to run your workloads in public clouds. The idea is to be able to run your VM on any hypervisor provider and still be confident that nobody can access it. That was the root of the project, but you can imagine doing it on bare metal too: if you have the supported hardware, have your cluster run with TDX or SEV-SNP and run your VM inside it. It works both ways.

Okay, so it means it doesn't have to run on the worker nodes; it can run on a completely different machine?

Yeah, totally. You don't have to run the VM on the worker node; it can be elsewhere.

Okay. And how do you ensure high availability?

That's a good question. I don't have the answer for that, to be honest. I am working on the virtualization side, so that's a question I think we should ask the other members of the Confidential Containers team. They are around, actually, and we also have a talk at 2:30 this afternoon where you can find them.
So if you want to ask questions more related to Confidential Containers, I think that is where you should go.

Okay, thank you very much.

No more questions? Then I think we are out of time. Thank you very much.