Welcome. So, my name is Dario Tranquitella. Forget the surname, I know it's really hard. I'm here to talk in a lightning talk, five minutes, I'll give it a try. I'm the CTO at Clastix. Clastix is a startup based in Italy. We are a multi-tenancy company, and we are the leaders of multi-tenancy, according to us, of course; we can talk about that. We are the company that developed Capsule and Kamaji, two tools you may have heard of. They address multi-tenancy both at the namespace level and at the multi-cluster management level.

I don't want to talk about what multi-tenancy is, because you already know. Rather, I would like to talk about multi-tenancy for GPU instances. In this slide you can see that, essentially, multi-tenancy is about resource control, better isolation, security and governance, and cost control. For GPUs, we are going to focus on resource control and cost control, of course.

So, what is a GPU workload? It's pretty straightforward, because in the end it's just a regular pod running in your cluster. Of course, we are not only consuming CPU; well, we are consuming CPU, but we are also attaching to the GPU, the graphics processing unit. Essentially, you just need to assign a runtime class and specify some extra resource requirements. Keep in mind, as you can notice in the resource limits, there is nvidia.com/gpu: this is a special key you can put in the resource requirements of your pods. These resources are advertised and orchestrated by an operator, so if you're using NVIDIA GPUs, you can take advantage of it to make sure your pod consumes exactly one, two, or more GPUs (see the example manifest after this section).

So it's pretty straightforward, essentially. There are plenty of companies and users deploying GPU workloads on Kubernetes; in the end, you just need to do that. But of course, we are talking about multi-tenancy. Multi-tenancy means we have plenty of resources, you can imagine: a lot of virtual servers, a lot of virtual machines, a lot of memory, and workloads consuming this fleet of resources. And we have our GPUs. So the question is: can we have multi-tenancy at the GPU level? Long story short: nope, I'm sorry to say. It's more complicated; I'll try to elaborate a bit more. And I'm sorry, because this is still something new for me: I was doing site reliability stuff, and now I'm developing multi-tenancy solutions, so the GPU world is new to me. I hope I'm not saying something wrong. But essentially, with GPUs we cannot use cgroups. We can create namespaces, we can limit resources, but this is not possible with GPU instances.

So, how can we do that? And trust me, this is not a sales pitch: I'm not working for NVIDIA, I'm just focusing on NVIDIA GPUs. Essentially, you can assign the resources of a whole GPU to a single pod, so you have your pod consuming all the resources from that GPU. Or you can use Multi-Instance GPU, also known as MIG, M-I-G. Keep in mind there is also another option, time slicing: essentially, it's like having processes consuming the same GPU while the NVIDIA driver creates time slices, pretty similar to what you have with CPUs.
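To make the pod part concrete, here is a minimal sketch of what such a manifest could look like. The pod name and image tag are illustrative assumptions; the runtimeClassName value "nvidia" is the one typically installed by the NVIDIA GPU Operator, and nvidia.com/gpu is the resource key advertised by the NVIDIA device plugin:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test            # hypothetical name, for illustration only
spec:
  runtimeClassName: nvidia         # runtime class typically created by the NVIDIA GPU Operator
  containers:
    - name: cuda
      image: nvidia/cuda:12.2.0-base-ubuntu22.04  # example image tag, adjust to what's available
      command: ["nvidia-smi"]      # just prints the GPU(s) visible to the container
      resources:
        limits:
          nvidia.com/gpu: 1        # ask the device plugin for exactly one GPU device
```

The scheduler will only place this pod on a node where the device plugin advertises a free nvidia.com/gpu, and the container will see exactly one device.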
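As a rough sketch of the difference between the two sharing modes: time slicing is configured through the NVIDIA device plugin configuration, while MIG exposes hardware partitions as their own resource names. The snippet below is illustrative and follows the documented device plugin config format as I understand it; exact keys can vary by version:

```yaml
# Time slicing: the device plugin advertises N "replicas" of each physical GPU.
# Processes share the same GPU memory, which is why it is risky across tenants.
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 4    # one physical GPU shows up as four schedulable units

# MIG (mixed strategy), by contrast, turns each hardware slice into its own
# resource name with dedicated memory and compute, so a pod can request e.g.:
#   resources:
#     limits:
#       nvidia.com/mig-1g.5gb: 1
```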
But time slicing is really bad if you think about multi-tenancy, because multi-tenancy can be addressed at the internal level: you have a company with several developers, your friends. But if you're doing multi-tenancy in a hostile environment, such as, I don't know, the hyperscalers, you don't want to share your memory with other tenants. So if we understand that, then with MIG it's just a matter of deciding how many resources we would like to assign to specific tenants and which resources we would like to allocate. Of course, these requirements can be addressed in several ways, and that's the reason why I'm presenting Capsule. I'm a bit biased, because it was my first open-source project. It has now been donated to the CNCF; we have a kiosk on Thursday, if I recall correctly, so we can meet there and talk about Capsule. Essentially, it's a framework to create a multi-tenant environment. We have a lot of adopters, and they're doing a lot of stuff; they discovered they can also do multi-tenancy at the GPU level.

This is a brief overview of the architecture, but I'm not going to talk about the internals of Capsule. There are plenty of tutorials and videos, and of course you can reach out to me on the Kubernetes Slack workspace for more questions and answers. So, what I'm suggesting here, since we are working with some providers that would like to offer GPU as a service in a multi-tenant way, is to take full advantage of Capsule's capabilities. As you can see here, with Capsule you give tenant owners the ability to create namespaces in a self-service way, while at the same time enforcing some policies. Of course, Capsule supports a lot of stuff, but since we are just talking about multi-tenancy for GPUs, what you have to enforce is the runtime classes. We support exact match and regex match, and in this case it's just a matter of selecting the GPUs according to the product. And then, of course, it's not just a matter of runtime classes: you can also say how many GPU instances a tenant can consume, so one, two, three, it's up to you. This is all orchestrated by the NVIDIA operator, so it's just a matter of applying these keys (not annotations, but keys) to the resource quotas that Capsule is creating on your behalf (there is a sketch of such a tenant below).

As I said before, Capsule is totally open source. We are looking for maintainers, adopters, people who are doing something with Capsule. These are the GitHub repositories. As I said, this is a CNCF Sandbox project, and we are also aiming for incubation. And since I'm talking here, this is not a sales pitch, of course; it's always open source. Since this customer, this provider, is looking for a solution to offer GPU as a service, we are creating a new project named k8sGPU. With k8sGPU, essentially, we provide a way to say: I want to install a Virtual Kubelet in my cluster, and this Virtual Kubelet is going to deploy my GPU workloads on a shared cluster that is multi-tenant aware. You can imagine, maybe I have everything on AWS EKS or AKS or any other cloud provider, and I don't have GPUs, because they are pretty expensive, I would say. So what I can do is achieve GPU pooling thanks to multi-tenancy, because of course I can buy a GPU, but the problem is: how can I allocate that GPU efficiently?
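Putting the two enforcement points together, a Capsule Tenant can pin the allowed runtime classes and cap the number of GPU instances through the resource quotas Capsule creates on your behalf. This is a minimal sketch assuming the v1beta2 API; field names may differ slightly between Capsule versions, and the tenant name, owner, and runtime class values are made up for illustration:

```yaml
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: gpu-tenant               # hypothetical tenant name
spec:
  owners:
    - name: alice                # made-up tenant owner
      kind: User
  runtimeClasses:
    allowed: ["nvidia"]          # exact match
    allowedRegex: "^nvidia-.*$"  # regex match, e.g. selecting by GPU product
  resourceQuotas:
    scope: Tenant
    items:
      - hard:
          requests.nvidia.com/gpu: "2"  # at most two GPU instances across the tenant's namespaces
```

Tenant owners can then create namespaces in a self-service way, and every pod they deploy is constrained to the allowed runtime classes and the tenant-wide GPU budget.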
So, if we think about the pods, what we can achieve with multi-tenancy, thanks to Capsule, is resource pooling, and we can abstract away all the complexity of managing GPU instances in a Kubernetes cluster. These are the Slack workspace and the email if you want to find out more information, and of course you can recognize me by the t-shirt. Thanks, that's all. For the Q&A, yeah.

Audience: Thanks for your presentation. Does this work together with the new Kubernetes DRA, Dynamic Resource Allocation, which enables having different GPU types on the same Kubernetes nodes and all those advanced capabilities? Does it work together with that, or is it an alternative? Can you comment?

Speaker: Well, actually, I'm just working with the NVIDIA operator, because I'm working with this provider that is an NVIDIA partner, so I don't have any idea. I'm just working at the multi-tenancy level, so my duty is to say: these are the runtime classes you can deploy, these are the limits you have to enforce, and the developers don't need to know anything about the underlying requirements, so it's totally abstracted, I would say. So I'm sorry, I don't know how to answer your question. Thanks again.