Hi everybody, thanks so much for joining today, and welcome to OpenInfra Live. OpenInfra Live is an interactive show sharing production case studies, open source demos, industry conversations, and the latest updates from the global open source infrastructure community. This show is made possible by the support of our valued members, so thank you so much to them. My name is Treva, joined by my co-host Dogo, and I will be your host today. Well, we both will. We're streaming live on YouTube and LinkedIn, and we'll be answering your questions during a Q&A period at the end of the show, so feel free to drop any questions in the comment section and we'll answer as many as we can.

Today we're here to talk about Confidential Containers and Kata Containers, the hot container runtimes that offer the security of a virtual machine with the speed of a container. In today's episode, Zvonko Kaiser from NVIDIA will be sharing insights on Confidential Containers specifically, the security challenges in containerization, and how sandboxed environments have evolved with the use of Kata Containers. Welcome, Zvonko. Do you want to tell us a little bit about yourself?

Sure. Hello, everyone. My name is Zvonko Kaiser. I'm with NVIDIA, on the Cloud Native team. Our team is mostly responsible for all things related to cloud-native technologies here at NVIDIA, and we are responsible for enabling the GPU in containers and Kubernetes. Lately, I've been more focused on Kata and confidential computing, which also led to working on Confidential Containers.

Awesome. So for the folks in the audience who aren't familiar with Kata or CoCo, say they come from a strictly OpenStack background, how would you explain either or both of those technologies to them?

Well, we would first need to start with what a container is. Simply speaking, a container is nothing more than a process with specific views on resources, separated by namespaces. So every container has its own view on, let's say, PIDs, mounts, and other resources like CPUs. The container is, simply speaking, a packaging format for user-space applications, and this container can then be run, say, on a Kubernetes platform for scheduling and so on. The other technology we need to explain here is the virtual machine, a virtualization technology where we're essentially running a separate system inside another system.

So what Kata does is take a container, package it into a VM, and run that container in the VM as a second line of isolation, because many containers have the problem that they share the very same kernel, and a container breakout can take over the whole node, or even a whole cluster. That's what makes Kata special: it runs a container inside a VM as a second line of defense. If you're worried about, let's say, container breakouts, you can use Kata to isolate them in a VM. And if you're worried about VM breakouts, you can extend this to a confidential container, which uses encryption technology to protect the VM from the host infrastructure, from the administrator, and from other VMs.

Neat. So that would be CoCo's trusted execution environments?

Yes, trusted execution environments, which are essentially a hardware technology. It's really not new, but it has evolved into more accessible solutions in recent years with AMD SEV-SNP and Intel TDX.
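For anyone who wants to check their own hardware for one of these trusted execution environments, here is a minimal, hedged sketch; the device paths and kernel messages are assumptions that vary by kernel version and distribution:

```bash
# Hedged sketch: probing a Linux host for TEE support.
# Device node names and log messages vary by kernel version and
# distribution -- treat these paths as assumptions.

# AMD SEV / SEV-SNP: the platform security processor device node
ls -l /dev/sev 2>/dev/null && echo "AMD SEV device present"

# The boot log usually mentions SEV-SNP or TDX when enabled
sudo dmesg | grep -iE "sev|tdx" | head

# On AMD hosts, CPU flags also hint at what the silicon supports
grep -o -m1 -E "sev_snp|sev_es|sev" /proc/cpuinfo
```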
On ARM, we had TrustZone. And if you look at mobile devices, every mobile device has had trusted execution environments for payment systems, or vaults where confidential data is secured. Along those lines, sandboxed environments have also been evolving over time, especially because, if you're looking at containers, you want to fortify the isolation between the container runtime and the kernel. So there were things like unikernels, or IBM Nabla, which is essentially a unikernel run as a process to reduce the syscall surface, which is the attack surface between a container and the host. And then we have things like gVisor, KubeVirt, Firecracker, and of course Kata. All of those technologies fortify the isolation between the container runtime and the host operating system, but they do it in different ways. gVisor and IBM Nabla filter on syscalls, whereas solutions like Firecracker, KubeVirt, and Kata rely on hardware virtualization, accelerated hardware virtualization.

Awesome. Neat. So I heard you mention gVisor, and that was a question that came up during KubeCon. Sorry to throw this one out randomly, but I'm just curious: can you run Kata Containers on Windows with gVisor? And if not, is that on the roadmap?

Kata Containers in gVisor on Windows? No. gVisor is essentially a user-space kernel with process-based isolation, and you cannot run a VM inside gVisor. It's just process isolation, so those technologies don't fit together.

Okay. Thank you for answering that. I'm sorry for throwing that out randomly, but it just popped into my head. No worries. Awesome. But thank you for playing along.

Yeah. So I kind of did this myself at the beginning of the intro: there's a lot of crossover between the Kata Containers and Confidential Containers communities. Are they basically the same project? Are there some key differences between Kata and CoCo that you could point out? Or are they like peanut butter and jelly, partner projects?

You can say they are partner projects, but for me, Confidential Containers is more of an umbrella term, because the backbone of Confidential Containers is really Kata. In confidential environments, there are several other steps you need after spinning up your VM and your container, and one of those is attestation. Confidential Containers provides tools and guest components for Kata to do attestation. Attestation is simply the verification that a system is running with the proper software, in the proper versions, that you are expecting. So essentially it's a simple pipeline: I'm expecting specific versions of the software; is my trusted compute base, which is Kata plus the VM image, kernel, and so on, there in the right versions? Then give me back a report on which I can decide whether I want to deploy my workload or not. So, in short, Confidential Containers is Kata plus the attestation components we need to run confidential containers. And the nice thing about Kata and Confidential Containers is that we have broad support across many hypervisors, and we are the only software that supports all major trusted execution environments, be it AMD SEV-SNP or Intel TDX; ARM CCA support is coming up, and RISC-V is also something we are working on.

Oh, that's amazing. That's great.
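To make that broad support concrete, here is a hedged sketch of what it typically looks like on a cluster where Kata has been installed, for example via kata-deploy. The handler names such as kata-qemu-snp and kata-qemu-tdx are assumptions that depend on the deployment:

```bash
# Hedged sketch: on a cluster where Kata was installed (e.g. via
# kata-deploy), each hypervisor/TEE combination is typically
# exposed as its own Kubernetes RuntimeClass. The names below are
# assumptions that depend on the deployment.
kubectl get runtimeclass
# Illustrative output:
# NAME            HANDLER         AGE
# kata            kata            5d
# kata-qemu-snp   kata-qemu-snp   5d   <- AMD SEV-SNP variant
# kata-qemu-tdx   kata-qemu-tdx   5d   <- Intel TDX variant
```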
Because part of the draw to Kata Containers, for myself at least, was how platform-agnostic it is, so it's exciting to hear that that agnosticity is being expanded. I'm not even sure that's a word, but I used it. So, all right, go ahead. No, nothing. Okay.

So another question that comes up fairly often, or very often, is: what's the difference between Kata Containers and Docker? If you look at it at a high level, there's no difference, because each is essentially a container runtime that runs a container on a specific host. But looking deeper, you see that Kata uses virtualization in the middle, whereas Docker runs a container on bare metal without any virtualization. Well, you could say that the user-space packaging is a kind of operating-system virtualization, because you can take a container and run it on any operating system, be it Ubuntu or Red Hat or whatever else. But with a Kata container, you still have the virtualization middleware in between, where the VM is plugged in. The nice thing about Kata, and what drew us to it, is that Kata is CNI, CRI, and CSI compatible, meaning that running a Kata container versus a Docker container with any container image is completely transparent to the user. All the commands you're used to, like nerdctl or docker, just work with Kata as with any other container runtime. And the same applies to Kubernetes: anyone using kubectl, or "kube cuddle," I don't know which pronunciation is the right one, will not see any difference between a pod running with the Kata runtime and a pod running with containerd.

Fascinating. Okay. Thank you so much for the in-depth answer. I don't even want to go into how I've been explaining the differences, but that clears it up significantly. So thank you for that.

All right. So first off, my question, or my first question, is: what is your favorite operating system? Because it's very important.

That would be Linux, and that's really broad, so I'll also tell you my favorite distribution, which is Gentoo. I've been using Gentoo on and off since the beginning, when it was first developed many, many years ago. But yeah, I'm doing everything on Linux with Gentoo, and it works pleasantly well compared to 15 years ago.

That is hardcore. Gentoo is serious. Wow. Okay. And I phrased that question wrong; I should have assumed it was going to be Linux. Yeah, distro. That's cool. A second favorite, because Gentoo, just in case that's a little bit too... Okay, another question: if someone wanted to get into Linux, using it as their primary operating system, where would you recommend they start? Most folks start with Ubuntu, but they may not be comfortable with Debian-based systems. So what would you recommend?

Something that is easy to install and maintain. Then, as they gain more and more experience, they will automatically get to the point of "I just want to know everything about Linux," and then they'll look at Arch Linux or Gentoo or Manjaro Linux or the like. I think they're going to hop between distributions anyway, because maybe they get fed up with how Ubuntu does things, then they switch to Fedora, and then they're fed up with RPMs. There are always reasons to switch from distribution to distribution. But as a starting point, use something that is easily installable.
When I started, I wiped out almost six gigabytes of MP3s back in 2003 or 2004, because I didn't know the difference between a logical partition and a primary partition. So yeah, it really depends. Use a blank hard drive to experiment; do not use it on your production data.

Yes, very smart. Very true. Awesome. Okay, so let's get back on topic: we're here to talk about Kata, not about distros. So what's the difference between a GPU container and a regular container, a vanilla container, I'm trying to phrase that? And are there use cases for sticking with a GPU container?

There's really no difference between a GPU container and a regular container. You could also call a regular container a CPU container, because you're doing the number crunching on the CPU. But in the end, a container is the user-space part, and it relies on some hardware that you're using. In one case, you're using just the CPU to do some number crunching, and in the other case, you're using the GPU. So from the technology point of view, there's really no difference between a GPU and a CPU container. The only thing, maybe, is that you're using different software to access different accelerators: in the case of the GPU, the NVIDIA GPU, you would use CUDA, and for the CPU you can use almost any language. You don't have to use any middleware or libraries, though you may use libraries if you want to do accelerated computing with SIMD instructions; in the HPC use case, you often have optimized libraries. But in the end, there really is no difference between a CPU container and a GPU container.

Fascinating. Okay. So when I think about a CPU, I think about the little thinking rock that you put on your motherboard, and a GPU, that's like a graphics card, right? Is that an accurate way to think of each of those things?

If you look at the GPU, you can imagine it having thousands of those little CPUs. And you often run algorithms that are embarrassingly parallel, where each piece of data is completely independent of the neighboring data. So imagine you have a picture with a thousand pixels, and you want to shift colors, do anti-aliasing, or do some number crunching on a specific pixel to move it from a red shift to a green shift in color. On the CPU, you would do the calculation for each pixel sequentially; well, it depends on whether the CPU has hyperthreading, but essentially you'd go pixel by pixel. On the GPU, you could do this in parallel for every pixel at the same time, which is called SPMD: single program, multiple data. So depending on the size of your GPU, you're very fast at those calculations, because they can be massively parallel across thousands of little, let's call them CPU cores. The architecture is different, because the memory hierarchy is different, but essentially it's thousands of small CPUs working simultaneously, one per data point.

Okay. So I was completely wrong. But that's fascinating; I hadn't really given a whole lot of thought to how complex a GPU would be. Thank you for the extra explanation. Cool. So yeah, I heard you mention something about sandboxed environments a little earlier. I'm not even going to go into my explanation of what I think about when I hear the word sandbox, which is usually a test environment.
Could you explain to us a little of the history of sandboxed environments in Kata and CoCo, and their importance? I'm familiar with Dragonball, but I guess I, and certainly the audience, would like to hear a little more about it.

Right. As I mentioned before, people were always looking into fortifying the isolation between a container runtime and the host operating system. Way back, people were thinking about using unikernels, which essentially run a workload merged with a kernel in the same memory space, stripped down to the bare minimum without any additional libraries, so you essentially cannot change the workload. The unikernel only runs the one workload that was compiled together with the kernel libraries and the kernel. What IBM did was use these unikernels as the scheduling unit and build the project IBM Nabla, which is essentially a small unikernel running as a process. On top of this unikernel, which does only one thing at a time, they introduced a layer between the container runtime and the host to filter on syscalls. The kernel then had a limited attack surface, because syscalls are essentially the things that provide the container functionality, and if you limit your unikernel to a very, very small set of syscalls, your attack vector shrinks a lot. The goal is always to run as little code, and as few interfaces, as possible between a container runtime and the host operating system. This IBM Nabla OCI-compatible runtime was written in C, and they even had some Kubernetes adoption there, so you could run a unikernel as a process, as a container, inside Kubernetes and also on bare metal.

Then there was Google, which for its own workloads and services implemented gVisor, a small kernel running in user space. It's not tied to one unikernel or one workload; they want to run any workload. gVisor essentially re-implements the system calls in user space, so that they can do filtering, or more advanced functions and security, on those syscalls as well. So what they're doing is, again, reducing the attack surface between the container runtime and the host operating system with their own implementation of system calls. Not all system calls are implemented, and a re-implementation of syscalls can also introduce some overhead if you have syscall-intensive applications, like I/O, storage, or networking.

So people were also looking into other ways to fortify this isolation with hardware support, and the logical next step was to use a VM as an intermediate layer, protecting the isolation with the help of hardware-accelerated instructions. There are two major projects that I know of here: one is KubeVirt, and the other is Kata Containers. The difference between KubeVirt and Kata is that KubeVirt runs a VM in a pod, and Kata runs a pod in a VM. If you're looking at KubeVirt, it doesn't always fit nicely into the cloud-native space, because you cannot easily run a container inside KubeVirt on a Kubernetes cluster. You always have to bake your own VM image, or install a Docker runtime inside your KubeVirt VM, which is running in a pod.
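A hedged sketch of that difference: with KubeVirt you describe the VM itself, and you still have to get your workload into its image, while with Kata the workload stays an ordinary pod and the VM underneath is invisible (a Kata pod example appears a little later). The manifest below is illustrative, and the containerDisk image is an assumption:

```bash
# Hedged sketch: KubeVirt runs a VM in a pod, so you describe the
# VM itself. The containerDisk image below is an assumption.
cat <<'EOF' | kubectl apply -f -
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: demo-vm
spec:
  domain:
    resources:
      requests:
        memory: 1Gi
    devices:
      disks:
        - name: rootdisk
          disk:
            bus: virtio
  volumes:
    - name: rootdisk
      containerDisk:
        image: quay.io/containerdisks/fedora:latest
EOF
# With Kata, by contrast, the workload remains an ordinary Pod that
# simply sets runtimeClassName; the VM is created transparently.
```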
But that still does not give you the nice feeling of really using kubectl to exec into that very container inside KubeVirt. Whereas with Kata, with the VM and all the compatible CNI, CRI, and CSI interfaces that I mentioned before, it's a completely transparent experience on Kubernetes or on any other container runtime. AWS, for example, also went down the virtualization path: they built Firecracker, which is a micro virtual machine, a minimal operating system with a minimal set of emulated devices. Again, an intermediate layer of virtualization to protect the hardware from containers and other attack vectors. And you mentioned Dragonball before: this is also a new micro-VM that the upstream community is working on, implemented completely in Rust, and the Rust runtime that goes hand in hand with it is also something the upstream community is working on. So they are reducing the attack surface even further, and using a language that is by default more secure than, let's say, C++ or other languages.

Awesome. I was just about to ask you a little more about the Rustification that's happening with Kata Containers. It started with the 3.0.0 release, correct? Yes, something like that. Yeah.

Awesome. So, aside from the obvious, what was it that inspired the transition from Go to Rust for Kata Containers? And is that going to carry over into CoCo?

So Rust is by default a more secure language than Go. I'm not going too deep into the details of Rust versus Go, because I'd start a flame war here, but people can read up on it. From the Rust perspective, it has more functional programming features that people use to implement things, which also makes it easier to implement, let's say, threaded programs if you're using functional programming constructs. If people are familiar with Haskell and know what a monad is, they're going to see a lot of similarities between the functional programming language Haskell and Rust. And there are a lot of papers describing how functional programming can help with security, because you have no side effects. Well, you can introduce side effects in Rust, too, but if you remove side effects, you can write software that is more performant and more secure.

Fascinating. Okay. So during the mentorship program, some of the pushback, some of the feedback we received, was that Go was the more user-approachable language, versus Rust, which was a lot more complex but also a lot more powerful. So I guess that kind of reinforces that. And please don't yell at me, Rustaceans or Gophers; I'm just the messenger.

Awesome. Okay. So another question that comes up fairly frequently, and that I'm also fascinated by, is Kata and CoCo and their use in artificial intelligence and machine learning. Back at the OpenInfra Summit, in, I think it was June, back in Vancouver, a couple of Kata herders gave a demo where they showed a GPT bot running on Kata Containers, which was super fascinating. I was so thrilled by it. But anyway, could you tell us a little more about current, and possibly future, implementations or integrations of AI and ML in Kata Containers and/or Confidential Containers?

Sure. If you're looking at AI/ML, there are three main pipelines that are executed. One is data exploration. The second one is training.
And the third is inferencing. In each of those pipelines, you have personas that may have confidential data that they want to protect. My go-to example: imagine you have a model trained on patient data that can detect, let's say, lung cancer. If you're doing that training, you first want to protect the patient data, because you don't want any infrastructure administrator or employee to have access to it. And secondly, once you have the model trained, it is IP, intellectual property that you want to make money from, and even with just the model, people could be able to deduce confidential data by exercising it with, let's say, wrong or malicious inputs. So you also want to protect the model when it's in use. When you're doing inferencing, you want to protect the model, and again, you're exercising inferencing with confidential data against this model, so you also want to protect all the data that's coming in and going out. And this is where confidential containers can help: protecting the model in use, and protecting the data in use, at any pipeline stage.

Currently, looking at large language models, the inferencing pipeline is, simplified, a web front end talking to an API server, which then talks to an inferencing server and some backing database. All of those components can introduce entry points for an attacker to leak data or to run denial-of-service attacks. So, let's say we're worried about container breakouts: we can use Kata. And one nice thing about virtualization is that we can put stricter resource limits on each of those containers. If we assign one CPU and only half a gig of memory to a VM, then even if the container breaks out, it will not get more resources than the VM has. So this is a nice way to cap resources per container.

For the components where confidential data could be extracted, like the inferencing server, there are sophisticated attacks; there are a lot of papers out there on how you can craft your inputs to extract confidential data. There need to be things like input and output validation, but you also want to protect the model against the infrastructure, your cloud provider, the administrator, and so on. Then you have the inferencing server, and oftentimes also a vector database, which is an abstract representation of the data sources you're feeding in, be it PDFs or any other input source that may be confidential, and from which attackers could also deduce confidential data. So you want to run this vector database in a confidential environment as well. And this is where confidential containers come into play: if it's really confidential, you can run, let's say, the vector database and your inferencing server inside a confidential container, and the web front end and the API server maybe as Kata containers, because you want to constrain the resources those containers may have, and even if one breaks out, we still have virtualization as a second line of defense. So, depending really on the threat model, you can cascade the technologies that you're using.

One thing I want to mention as well: if we're looking at KVM and QEMU, they have the very same, I would not say problem, but challenge, that they share the very same kernel.
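Before going on, here is a hedged sketch of the resource-capping idea just described: an ordinary front-end pod isolated with Kata and capped at one CPU and half a gigabyte, so that even a breakout is confined to a small VM. The kata RuntimeClass name is an assumption about how Kata was installed on the cluster:

```bash
# Hedged sketch: a front-end pod isolated with Kata and tightly
# capped. Kata typically sizes the VM from the pod's resources, so
# even a container breakout stays inside a one-CPU, half-gig guest.
# "kata" is an assumed RuntimeClass name.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: web-frontend
spec:
  runtimeClassName: kata
  containers:
    - name: frontend
      image: nginx:alpine
      resources:
        limits:
          cpu: "1"
          memory: 512Mi
EOF
```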
So a rogue QEMU VM can still do some harm to the other VMs, because they're sharing the very same kernel, the same case as with containers. That's why we introduced virtualization inside of containers: we can run our own kernel inside the VM, which makes it kernel- and user-space-independent, and it also means that updates to the kernel on the host will not really break the VM and the container running inside it. KVM is most of the time described as both a type-1 and a type-2 hypervisor, but it's not really a true type-1 hypervisor like Xen, or the hypervisor that runs in VMware, which runs only on bare metal and has different privilege levels, isolating the VMs more, or better, than, let's say, KVM and QEMU do. So upstream we're also looking into enabling true type-1 hypervisors, to have more control between the VMs, where we are not sharing resources.

Fascinating. Okay. So I guess that explains, or I was having flashbacks to, when I used to work in public cloud, and how much of a lifesaver Kata and CoCo would have been for dealing with noisy neighbors. It sounds like a perfect solution for that type of environment. Cool.

All right. So this is probably something I should have asked earlier in the conversation, but I was just wondering: you mentioned that you are a super-duper principal engineer over at NVIDIA, and I'm assuming that's at least part of the reason why you're involved in Kata and CoCo, because you're helping them develop some type of awesome new implementation of Kata Containers internally. Can you tell us a little about what you're doing? If not, tell me to shut up and I'll move on to something else.

No worries. I'm leading the Kata and Confidential Containers effort here at NVIDIA, and we are introducing, we're not trying, we're doing it, Kata and Confidential Containers into products where we see that we need more isolation. And maybe a little history of how I came to Kata. Rewind at least three years, to when we were mostly responsible for executing the cloud-native enablement of the GPU. You may have heard about the GPU Operator, the Network Operator, and NFD, Node Feature Discovery, where we work very closely with Intel. So we have all those components to enable the GPU in a cloud-native way in containers and Kubernetes. At that time, we were also looking into isolation techniques, and how we could bring more security and stability with the GPU to workloads that are confidential or sensitive to denial of service, or theft, or whatever else. That's how we came to look at virtualization technologies like KubeVirt and Kata. But it fairly soon crystallized that if you want a cloud-native way of enabling a container with virtualized isolation, the way to go was Kata. As I mentioned before, it just fits in transparently with any container runtime, be it Docker, Podman, CRI-O, Kubernetes with CRI-O, or Kubernetes with containerd. And we essentially do not have to care about which CSI driver we're using or what the CRI is, because in the end, Kata really is CNI, CSI, and CRI compatible, and anything that we use in Kubernetes just plugs in neatly with Kata Containers. So I've been working very actively in Kata for the last two years, where we added proper GPU support, so Kata got GPU support. We also enabled vGPU, virtualized GPU support. And there are two ways of looking at that.
vGPU can be time-sliced, so each of those chunks of the GPU is divided by time, or you can have a more sophisticated split of the GPU with MIG, which is essentially a hardware virtualization technology that splits the GPU into distinct compute units, where the compute capability and memory capacity are fixed. You can think of each slice as a small virtual machine. So you can use the virtualized GPU with MIG or with time-sliced virtualization; we added support for both. And then there are more sophisticated use cases like GPUDirect RDMA and GPUDirect Storage (GDS). For people who don't know what this is: it's essentially peer-to-peer communication over the PCI Express topology without going through the CPU, meaning a NIC sending and receiving data can talk directly to the GPU without ever touching CPU memory, because there's a big performance drop if you have to go through the CPU. That's GPUDirect RDMA; and GDS is GPUDirect Storage, where the GPU can be fed by storage attached directly to the NIC, so the GPU talks directly to the storage without ever touching the CPU. For those more advanced use cases, what we added to Kata is the virtualization reference architecture, which essentially describes how you need to set up your PCI Express topology inside Kata to make these sophisticated use cases work. I'm not going into the details, but anyone who's interested can ping me; we can share the documentation on it, and that's what we're currently implementing. The end goal is really to have such use cases running inside Kata. And the nice effect of this was, and is, that I'm doing the implementation in the Go runtime, but with the pattern we established here in Kata, anyone can implement this virtualization reference architecture and enable CUDA and GPUDirect support. And that's exactly what's currently happening: the runtime folks doing the Rust implementation are following the steps we outlined in the virtualization reference architecture and are now implementing GPU support in Dragonball. So without me doing a lot of the work, just by setting this pattern, it can be implemented on any hypervisor, be it Dragonball, Firecracker, or whatever else; they just need to apply the lessons we learned. So we are currently very actively working on getting GPU support into Dragonball.

Awesome. Cool. Thank you so much for the in-depth answer. So we're doing pretty well on time, but I want to make sure you have enough time for the super exciting demo that's coming up. I have a couple of quick questions first, though. First off, I'm wondering whether Kata and CoCo were the first open source projects you got involved with, and if not, which project did you start out with, and what led you over to the Kata and CoCo world, aside from the obvious fact that we're awesome? And the second question would be: for somebody out in the audience who's listening in and wants to get involved with open source, something, anything, how would you recommend they get started? And also, what are some of the benefits you've gotten from being involved in open source? Because a lot of people think, "Oh, I'm not doing all that work for free. What's in it for me?" But there's actually a ton in it for you, including helping out the entire global community.
But anyway, that was a lot of questions and a lot of words. But yeah: how did you get started with open source, how has it benefited you, and how would you recommend somebody get started?

I need to think about that. What was my first open source project? Honestly, I don't know. There have been a lot. I've been doing this job for the last 12 years, always involving open source, but the very first project I honestly cannot remember.

The benefits, though, from the Confidential Containers point of view, would be that we are not providing security by obfuscation or by hiding anything. It's completely open: everybody can take a look at the source code and know exactly what's going on when Kata executes, and what's going to happen if you do, let's say, a GPU passthrough inside of Kata. Compared to the cloud providers, you don't really know what's happening there, because it's closed source. They can do audits on it, but then again, who is auditing the auditor, when do you stop, and how are you going to make the claim that everything is secure? "We've investigated ourselves and determined there's been no wrongdoing. We've done a good job," right? So the great thing about open source is really that you can see and verify what's going on.

On the other side, it's a give and take, because you implement things that other people are going to need, let's say GPU support in Kata, while a ton of other people implement, let's say, the attestation agent or the key broker service that we need in Confidential Containers, which I don't have the knowledge to implement the right way, but other people do. So everybody is working toward a common goal, and we are giving, let's say, GPU support, but getting a ton of other things that we need in Kata or in Confidential Containers.

And the other benefit, well, some say it's not a benefit, is that there are so many eyes looking over the merge requests or PRs you submit. But in this process, you learn a lot about how to write your code in a way that's acceptable. You learn a lot on the way from your first PR to your hundredth. I don't know how many PRs I've made in my career, but there's still always something to nitpick, because I've done something wrong, because I'm not seeing the forest for the trees, right? And you get in touch with people you would never meet in the office or in person. This also opens up career opportunities: working with a diverse set of engineers, who may refer you the next time there's an opening, and things like that. I remember, like 15 years ago, no company was talking to any other. It was all closed source; everybody was working with their elbows out. But in the open source community, everybody is trying to achieve a common goal.

Awesome. Awesome. I'm not sure where I was going to go with that, but yes, that is a fantastic answer. There are career benefits, but also the differences in points of view. When you think about it from a tech-nerd perspective, you wouldn't think that having those other perspectives would be important, but it's absolutely crucial, especially during that learning period, because we all learn how to do these things in different ways. Some of us are self-taught.
Some of us have PhDs, and then there are the ones in between. So we can all learn something from one another, and open source is a great place to do that. Cool. Awesome.

So, last question: how would you recommend that someone get involved in an open source project? Say they aren't confident in their tech skills, or they don't know how to code; where would you recommend they get started? There are lots of other options, right?

Right. Today, every project has at least a Slack channel, and there are general channels where you can introduce yourself and ask where you can help out. There are a lot of maintainers watching and listening for where they can assign a simple PR to a person, something that is not time-critical, not in the critical path of execution, where they could, let's say, update documentation, update some CI scripts, whatever else. There's always a need for simple PRs like that. And then, over time, people get more and more confident, and they can take on more work. Or just look into the issues or PRs that are open: if you think you can contribute anything, tag yourself on the issue and ask questions about it. There's always someone listening, or reach out via Slack. Start small and then iterate. It's a learning process; many people need some time to get into open source and how it works, and into the fact that PRs are looked at by a lot of eyes over a lot of iterations, but you'll get used to it. So start simple, with a simple PR or issue, ask in the Slack channel if you can help out, and introduce yourself. And let's say someone has a PhD in formal methods and wants to do formal verification of our runtime to see whether we're doing anything wrong: I mean, that would be really nice, and people would be eager to see something like that. Maintainers can then create issues and work with this person to create PRs and explain how we could do that.

Awesome. I really wish for somebody with a PhD, which, I mean, that would be awesome. Cool. Thanks so much for the answer. And I'm just going to jump in and say that documentation is also a great place to start if you're not a fan of code. Everybody needs people to write docs and try things out.

Okay, so we are at 12 minutes, so if it's all right with you, would you like to jump into your demo?

Yeah, let's do a quick demo. It's not really sophisticated, but I just wanted to show the audience the lift-and-shift character of GPUs in particular. Let me just see if this is coming up. There you go. Okay. Awesome. This is a node which has the drivers installed, and what I want to show here is that with the containerd runtime, we can use any of the container front ends, be it nerdctl or Docker, and it's the very same output. So we don't care if it's Docker, we don't care if it's nerdctl, we don't care if it's Podman: we can take the very same container and run it on any container runtime. And the very same container, or other containers, can also be run on Confidential Containers. This was also one of the nice premises of Kata Containers and Confidential Containers: that we're able to run any workload, unmodified, on Kata and CoCo. And what I have here is: I'm just switching the runtime from containerd, or any other runtime, to Kata, and passing through a VFIO device, which is attached to the GPU, into the VM.
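For readers following along, a hedged sketch of the kind of invocation being demonstrated: the same CUDA image run under the default runtime, and then under Kata with a VFIO device passed through. The runtime name, the VFIO group number, and the image tag are all assumptions:

```bash
# Hedged sketch of the lift-and-shift invocation. The runtime name,
# VFIO group number, and image tag below are assumptions.

# Same image under the default runtime (runc), GPU via host driver:
sudo nerdctl run --rm --gpus all \
  nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Same image under Kata: the runtime boots a VM, and the GPU is
# passed through to the guest as a VFIO device instead.
sudo nerdctl run --rm \
  --runtime io.containerd.kata.v2 \
  --device /dev/vfio/42 \
  nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```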
And yeah, don't be surprised that this takes a little bit longer. There's a lot of debugging enabled here, and we are cold-plugging the GPU. The GPU has a lot of memory, and we are resetting the GPU at every invocation, because with each container startup, we want to make sure that the GPU is in a deterministic state, meaning that memory is wiped, the settings are reset, and the GPU, as I said, always has a deterministic state to start from. And in a confidential environment, all that memory needs to be properly initialized, and a function-level reset is done. Once this container is up, we're going to see some commands where we can get some information about where the GPU is running, whether confidential mode is enabled, which CPU it's running with in confidential mode, and whether the GPU is set to the ready state.

What do we do at startup inside the VM? Our complete NVIDIA stack that I showed here, which is essentially the NVIDIA Container Toolkit that enables the GPU in a container: we took it, packaged it inside the VM, and the VM does what the bare-metal system does, so that we're reusing our stack inside the VM and have the same functionality in Kata. We can see that we're running our GPU here again. We can ask what compute mode is enabled; it's on. We can query the CPU, which is an AMD SEV-SNP system. We can see the GPU, which is also CC-capable, and we can ask about its status, whether it's ready. The reason for this is that we want to know whether the configuration we're running, the VBIOS, the driver, and all the other things, is in the state we're expecting, and the VM will do attestation internally. It will check the measured values that we're getting from the driver against a set of golden measurements we expect, and then we'll get a report that we can take a look at here, the NVIDIA verifier... shoot. I think I took the wrong container; something's going wrong. But anyway, I just wanted to show a file here, showing that the attestation result was successful. And because the attestation was successful, we can see that the GPU is set to the ready state. If attestation fails, if the VBIOS or the driver is in the wrong version, the VM will internally set the GPU to not ready, and the container will come up, but all the CUDA applications you want to start on the GPU will just fail, because the GPU is in the not-ready state. And from the container, there's no way to manipulate the GPU and set it into a working state. It's all done at the VM level, completely isolated from the user; attestation is done in the VM, and nothing is really running in the container. It should be completely transparent to the user, without them knowing the internal details of whether the GPU is set to ready or not. This was a small demo just to showcase the lift-and-shift character that we implemented with the GPUs: take your container, run it on Docker, run it with nerdctl, run it with Podman, it doesn't matter; take the same container and run it on Kata, or run it as a confidential container.

Awesome. Cool. Thanks for the demo. And that was very brave, I will say, so don't worry about anything that went wrong. It went like 99% smoothly, and with live demos there's always something that explodes. So you did a great job. Thank you so much. Thank you.

All right. Awesome. So now we've got five minutes for Q&A. All right, first question, and I'm curious about this one myself, as a Pi-head.
So if somebody didn't want to spend a whole lot of money, but they wanted to get their hands on Kata and CoCo, would it be possible to run either, or both, on a small device like a Raspberry Pi?

Yeah, you can run Kata on any ARM system right now. We added a lot of support there, and especially all the GPU support on ARM platforms. So you can run Kata on, is it a G5g metal instance in AWS, which is an ARM system with a GPU. And we've also enabled Kata on the NVIDIA ARM hardware that we provide. You may have heard about the Grace Hopper superchip for HPC; we're already running Kata on the Grace Hopper chip, and also on the BF3, the BlueField-3 DPU, the data processing unit, which is mainly responsible for providing infrastructure as a service: you run firewalls and things like that on it, and you may also want to isolate those infrastructure services from each other. So yeah, we've also enabled Kata on the BF3, and we should have proper, good support for all things related to ARM and GPUs. I am not aware of any system where Kata would not run today.

That's great news to hear. On ARM and GPU, yeah. Okay, awesome. Cool. Thanks for that.

And there's a related question coming in. So, on using GPUs: can the GPU be split so multiple users can use parts of it, or is it one-to-one? That came from Daniel. Thanks, Daniel.

If you're doing a single GPU passthrough, it's one-to-one, right? One user, because one container. If you're doing vGPU, you can split the GPU into multiple chunks, and each chunk can be allocated to one Kata container or VM, so each chunk is still one-to-one. And depending on the resource capability of the GPU, you can have a lot of profiles, or a lot of chunks, running on one single GPU. But it's a one-to-one relation.

Awesome. Thank you so much, Daniel. Another question came in. Sweet. This one is: is there support for live migration of Kata containers and confidential containers?

Technically, yes, because AMD SEV-SNP supports live migration for confidential VMs, and Intel TDX has live migration capabilities coming up as well. So it should work, yes. The only point I would make is that GPUs currently do not have checkpointing support; well, they have some checkpointing support, but it's not integrated into Kata and Confidential Containers, so we cannot yet do checkpointing of GPUs, meaning create a snapshot of the GPU and restore the state inside another VM. With every accelerator that we add to the VM, we need to make sure that every piece of hardware supports checkpointing. Otherwise, you can restore the CPU and memory of a VM, but you may lose, let's say, the GPU and its state, or the NIC state. So all the components in the VM need to support it, as well as the trusted execution environments.

Awesome. Thanks for that. Sweet. Thank you, LinkedIn user, for sending in such a great question. So I've got a question. I'm just curious to know, and I do not want to try it out myself, but: can Kata containers and confidential containers be run in an air-gapped environment?

Yes. There shouldn't be any problem with that, because you can apply all the air-gapped mechanisms that you would use for your regular containers, like local image caching, or creating a local registry to store your images in the air-gapped environment.
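A hedged sketch of that image-mirroring step: pull and save on a connected machine, then load and push into a local registry inside the air gap. The registry address and image are assumptions:

```bash
# Hedged sketch: mirroring an image into a local registry for an
# air-gapped cluster. Registry address and image are assumptions.

# On a connected machine: pull the image and save it to a tarball.
docker pull nginx:alpine
docker save -o nginx-alpine.tar nginx:alpine

# Inside the air gap: load the tarball and push to the local registry.
docker load -i nginx-alpine.tar
docker tag nginx:alpine registry.local:5000/nginx:alpine
docker push registry.local:5000/nginx:alpine

# Pods (Kata or confidential) then reference the local registry:
#   image: registry.local:5000/nginx:alpine
```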
So you can use all of those techniques to create an air-gapped environment for Kata or for Confidential Containers. Since it's Kubernetes-compatible, all of those technologies just work.

Nice. Awesome. Thanks for that. And we are at the hour. So, this has been an amazing conversation, but I don't want to eat up your whole evening, because I know it's probably late over there for you, and it's early over here for me. But thank you so much for joining us. This has been so much fun. Thank you. Thank you, too. I appreciate it.

Absolutely. And we appreciate your time, and the audience: we appreciate you for tuning in. I want to give a special, special thank-you to Zvonko for joining us today. And thanks to our audience; I'm just going to gush over you all again for tuning in. Please join us on December 7th for the next OpenInfra Live, where we'll get a recap of the virtual Project Teams Gathering, the PTG, that was held back in October. And don't forget: if you have an idea for a future episode, we want to hear from you, so send your ideas over to ideas.openinfra-live.live and maybe we'll see you on a future show. Who knows? Thank you so much again, Zvonko, for coming, and thank you, audience, once again for tuning in. Hope we see you on December 7th, and I hope you have a happy holiday. Thank you all so much. Bye, everybody. Bye-bye. Take care. Bye.