So my name is Kevin Klues, I'm from NVIDIA, and my colleague Alexey Fomenko here is from Intel. We're going to be giving a joint talk on device plugins 2.0: how to build a driver for dynamic resource allocation, otherwise known as DRA. So what exactly is dynamic resource allocation? Well, it's a new way of requesting access to resources, available in Kubernetes 1.26 and beyond. It provides an alternative to the count-based interface of, for example, asking for nvidia.com/gpu: 2. And this alternative interface puts full control of the API for requesting resources in the hands of third-party developers. So if you have simple devices, you can continue to use the existing device plugin interface. But for more complex devices, this new mechanism exists to give you a much more powerful interface. One can think of dynamic resource allocation as a generalization of the persistent volume API for all types of resources, not just volumes. The key concepts to have in mind when thinking about dynamic resource allocation are the resource class and its associated class parameters, which help you define the API for resource classes, as well as resource claims and resource claim templates and their associated claim parameters, which I'll go into in a little more detail as we go through the talk. Before I talk about those, I want to go through a really quick example and demonstrate, from a user's perspective, what requesting access to a device via the traditional device plugin API looks like, and what that same request would look like under DRA. With the traditional device plugin, if you wanted a single GPU, you could ask for something like nvidia.com/gpu: 1 under the limits section of the resources spec in your pod spec, and you would get access to that GPU at runtime. Under dynamic resource allocation, a similar allocation would be done using what you see here on the right. 
The things to note here are that I have a separate object from my pod called a resource claim template. Inside that resource claim template, I reference a specific resource class. In this case, the resource class name is gpu.nvidia.com, and this is something that's associated with the driver that I'm going to talk through today and show you how to develop. This name gets associated with your driver, is installed by the cluster admin, and is analogous to the nvidia.com/gpu resource type that you have from the device plugin on the left. Once you have this resource claim template in place, there's a new section in your pod spec called resource claims where you can refer back to the name of that resource claim template, create a local name for the various containers within your pod to reference, and then put that underneath the new claims section in the resources spec of your container. Once you've done that, the driver under the hood will kick in to allocate a GPU for your container, and when your container comes up, you'll have access to that GPU. If you then expanded this to where, on the existing device plugin API, you were to request two GPUs, on the right that would basically be referring to this claim template multiple times, having different local names for the GPUs you want to access, and then plugging those into the claims section of your container to get access to the GPUs they represent. With that simple example in place, this is obviously much more verbose than the existing device plugin API, so the obvious question is: why would I want to do things this way? I hope by the end of this talk you'll be convinced that this dynamic resource allocation way of getting access to these types of resources is much more powerful. The first concept I mentioned is this notion of a resource class. 
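To make that concrete, here is a sketch of what the DRA side of the example might look like, roughly following the v1alpha2 resource.k8s.io API from Kubernetes 1.27; the object names (gpu-template, gpu-pod) are purely illustrative:

```yaml
# ResourceClaimTemplate pointing at the gpu.nvidia.com resource class.
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaimTemplate
metadata:
  name: gpu-template
spec:
  spec:
    resourceClassName: gpu.nvidia.com
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  resourceClaims:
  - name: gpu                          # local name referenced by containers
    source:
      resourceClaimTemplateName: gpu-template
  containers:
  - name: ctr
    image: ubuntu:22.04
    command: ["bash", "-c", "nvidia-smi; sleep 9999"]
    resources:
      claims:
      - name: gpu                      # the claimed GPU gets injected at runtime
```

Requesting two GPUs would mean adding a second entry under resourceClaims (e.g. a second local name referencing the same template) and listing both names under the container's claims section.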
A resource class basically associates a named resource with its corresponding resource driver. If you remember from the last slide, the resource class I defined was called gpu.nvidia.com, and the driver I would install on my cluster to be able to allocate resources of type gpu.nvidia.com would, in this case, have the name gpu.resource.nvidia.com. Along with these resource classes, and this is something different from what you can do with the persistent volumes API, you can associate an optional class parameters object, which has a completely custom API that's up to you as the resource driver developer to define. In the case of NVIDIA GPUs, this is actually a real class parameters object that we have for the driver we built for GPUs. You can set up something like saying that any GPUs allocated through this resource class cannot be shared. If you want to make sure you have exclusive access to the GPU, rather than multiple people referencing this resource type and sharing access to it, the admin can install this class parameters object along with the resource class to limit that sharing capability. Moving on to resource claims: this is the user-side analogue of the resource class, where the resource claim represents the actual resource allocation to be made by a resource driver, as defined by the end user. Users create these objects, the objects refer to the resource classes they want to allocate resources from, and then when they're referenced inside a pod, those resources get injected into it at runtime. The main difference between a resource claim template and a resource claim is that resource claim templates create a new resource claim on the fly each time they are referenced. From the example I had before, the end result in the case of GPUs is that you get a unique GPU for each reference to one of these resource claim templates. 
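A minimal sketch of such a resource class and its optional class parameters object; the ResourceClass shape follows the v1alpha2 API, while the GpuClassParameters schema below is a hypothetical stand-in for the vendor-defined API:

```yaml
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClass
metadata:
  name: gpu.nvidia.com
driverName: gpu.resource.nvidia.com    # the driver that services this class
parametersRef:                         # optional, vendor-defined parameters
  apiGroup: gpu.resource.nvidia.com
  kind: GpuClassParameters
  name: no-sharing
---
# Hypothetical custom parameters object; the schema is entirely up to
# the driver author.
apiVersion: gpu.resource.nvidia.com/v1alpha1
kind: GpuClassParameters
metadata:
  name: no-sharing
spec:
  sharable: false                      # claims via this class get exclusive GPUs
```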
On the flip side, if you have a resource claim that's not from a template, it always refers to the exact same object any time you refer to it, which basically enables shared access to a GPU across references. Just as with resource classes, resource claims can include an optional set of claim parameters with whatever custom API you've decided to define for your resource type. For NVIDIA GPUs, one of the claim parameters objects we've created is called GPU claim parameters. In the example I'm showing here, when I create a claim that wants access to a GPU, I can say that the GPU has to either be of the product family T4, or be a V100 with less than or equal to 16 gigabytes of memory. So it lets you selectively dive in and more precisely ask for the type of GPU that you want access to. Additionally, you can specify extended parameters such as what sharing strategy you want to enable for the resource once it's been granted to you. NVIDIA GPUs have a couple of different sharing strategies you can use: one is time-slicing, another is MPS, which allows you to further subdivide the memory you have amongst the different clients sharing access to the GPU. You can specify all of this as part of this custom claim parameters object that we've defined as part of our API for accessing GPUs with DRA. So assuming you had one of these claims created with the name shared-gpu, I can reference it in my pod spec, and then multiple containers within that pod can reference that exact same claim and get shared access to the underlying GPU. And it's not limited to within a pod; you can do the same thing across pods. As long as you've created this resource claim as a standalone object, different containers from different pods can reference it and access that same GPU. 
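A sketch of what such a shared claim with claim parameters might look like; the ResourceClaim shape follows the v1alpha2 API, while the GpuClaimParameters schema is a hypothetical rendering of the selectors and sharing strategy described above:

```yaml
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaim
metadata:
  name: shared-gpu
spec:
  resourceClassName: gpu.nvidia.com
  parametersRef:                       # optional, vendor-defined parameters
    apiGroup: gpu.resource.nvidia.com
    kind: GpuClaimParameters
    name: t4-or-small-v100
---
# Hypothetical schema: match a T4, or a V100 with <= 16Gi of memory,
# and time-slice it between whoever shares the claim.
apiVersion: gpu.resource.nvidia.com/v1alpha1
kind: GpuClaimParameters
metadata:
  name: t4-or-small-v100
spec:
  selector:
  - productFamily: T4
  - productFamily: V100
    memoryLE: 16Gi
  sharing:
    strategy: TimeSlicing
```

Pods then reference the standalone claim by name (resourceClaimName: shared-gpu under spec.resourceClaims), and every container, in the same pod or in other pods in the namespace, that lists that local name under resources.claims shares the same underlying GPU.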
Now, one thing I don't have on the slides, but it's probably worth mentioning, is that resource claims are isolated to a specific namespace, so you can't share these GPUs across different namespaces. Once a resource claim exists in one namespace, pods have to be in that same namespace in order to access it, as a security measure. Okay, so having walked through those simple examples, the goal of this talk is really to teach you how to write your own DRA resource driver in order to enable similar features on your own custom resources. It's great that we've taken the initiative and built this initial driver for NVIDIA GPUs, but I really want to enable third-party developers, for whatever devices you might want to make available, to be able to do something similar. So with that, the outline of the rest of the talk: I'm basically going to walk through the anatomy of what one of these DRA resource drivers looks like and how they work under the hood. I'm then going to walk through the process of what it takes to actually allocate a resource using DRA: what happens behind the scenes between the moment you create a resource claim and the moment the resource you're asking for finally gets injected into your container. Then I'm going to walk through some helper libraries we have, to teach you how to build one of these resource drivers and what functions and methods you need to implement in order to make your resources available in a similar way. Then I'm going to hand it over to Alexey, who's going to talk about some of the new and upcoming features we're building in relation to DRA, and end with a demo on both NVIDIA GPUs and Intel GPUs to show the flexibility of this across different resource types. Okay, so what does one of these DRA resource drivers look like? 
Well, at its core, it basically consists of two separate but coordinating components. You have a centralized controller that's running somewhere in your cluster with high availability, and then you have a node-local kubelet plugin that's running as a DaemonSet on the nodes where the resources themselves actually need to be advertised and eventually prepared for use. The centralized controller's job is basically to coordinate with the Kubernetes scheduler to decide which nodes an incoming resource claim can actually be serviced on. Once that decision is made, it performs the actual resource claim allocation after the scheduler picks the node where the pod should land to get access to that resource. And then it will perform the deallocation of the resource claim once it gets deleted at some point in the future. On the other side, the node-local kubelet plugin's job is to advertise any node-local state that the centralized controller will need in order to make allocation decisions at runtime. Also, once those allocations have been made, it will perform any node-local operations that are required as part of preparing or unpreparing the resource claim. We'll go through a couple of examples later on of what this might entail, since a device might not just be ready to go as-is; you might have to set up some parameters on it, depending on what the actual incoming resource claim looks like. And once it's done that, its job is to pass the devices associated with that prepared resource claim to the kubelet so that it can eventually forward them on to the underlying container runtime and make those resources available to the running container. So these two pieces obviously need to be able to communicate in some manner: the centralized controller makes allocation decisions, and the kubelet plugin tells it what resources are available so it can make those decisions. 
There are a couple of different ways this communication could happen. For the purposes of this talk, I'm focusing mostly on the first one, which is a single all-purpose per-node CRD, where the kubelet plugin advertises all available resources it has, the centralized controller tracks any resources it has allocated in the same CRD, and the kubelet plugin also tracks any resources it prepares inside that CRD. So from one CRD you can get the full view of everything associated with this driver: what's available, how things have been allocated, and what's actually been set up on the node for someone to get access to. Alternatively, you don't have to do it this way. You could have some sort of split-purpose communication where the kubelet plugin might still advertise its available resources via a CRD so that the centralized controller has an easy way to access them, but the centralized controller might track its allocated resources through a field we have in the resource claim itself called a resource handle; that way the allocation state doesn't need to be stored in a CRD accessible via the API server, which can make things a bit more efficient. And at the kubelet level, instead of using a CRD, where you might have constant conflicts as the different components write back and forth to it, you can just checkpoint the state in a file on the node-local filesystem, as long as the kubelet plugin has a way to track this over time. But like I said, we're going to focus for the purposes of this talk on the single all-purpose one, just because it's easier to talk about if it's all in one place. What this might actually look like under the hood is that the kubelet plugin, when it first comes online, advertises some set of allocatable devices that the controller could allocate at some point in the future. 
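A rough sketch of what such an all-purpose per-node CRD could look like; the group, kind, and field names here are all hypothetical and chosen only to show the three levels of state (allocatable, allocated, prepared) living side by side:

```yaml
apiVersion: dra.example.com/v1alpha1   # hypothetical API group
kind: NodeAllocationState
metadata:
  name: node-a                         # one object per node
  namespace: dra-driver
spec:
  allocatableDevices:                  # written by the kubelet plugin at startup
  - uuid: GPU-7f4c6a11
    productName: Tesla T4
    memoryBytes: 16106127360
  allocatedClaims:                     # written by the centralized controller
    claim-uid-1234:
      devices:
      - uuid: GPU-7f4c6a11
  preparedClaims:                      # written back by the kubelet plugin
    claim-uid-1234:
      cdiDevices:
      - example.com/gpu=GPU-7f4c6a11
```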
When the centralized controller is triggered to do an allocation, it can pick one of those devices and write some information back to the CRD about which GPU has been allocated to a specific claim. And then once the kubelet plugin kicks in and prepares this claim for use, it can also write that back to the CRD, saying: okay, you allocated that, and now I have prepared it for use, before it gets passed off to the container that gets started later on. So what does this process actually look like? How do we actually allocate a resource using one of these drivers? Well, there are two modes of operation. One is called immediate and one is called delayed, or wait for first consumer, and this is something you can specify in your resource claim as you create it as an object in the cluster. With immediate allocation, as soon as you create the resource claim, your resource driver is triggered to allocate it on some node somewhere in the cluster, independent of what pods might come along later to access it, and any pods that do reference that claim will end up being restricted to the nodes where those allocations have been made. It's a bit more restrictive in the sense that you don't yet know what resource constraints those pods might have; doing an immediate allocation is really a way of saying you want the resource allocated up front, accepting that any pod referencing it will be tied to that specific node. Delayed allocation, on the other hand, delays the allocation of the resource claim until the first pod that references it is being scheduled. There are analogous concepts in the persistent volumes API, so for those of you familiar with that, this might not seem too foreign. And when you do the delayed style of allocation, resource availability will be considered as part of the overall pod scheduling decision for the first pod that accesses it. 
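In the v1alpha2 API, the mode is a single field on the claim spec, which might look like this (claim names illustrative):

```yaml
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaim
metadata:
  name: eager-gpu
spec:
  resourceClassName: gpu.nvidia.com
  allocationMode: Immediate             # allocate as soon as the claim is created
---
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaim
metadata:
  name: lazy-gpu
spec:
  resourceClassName: gpu.nvidia.com
  allocationMode: WaitForFirstConsumer  # the default: allocate at pod scheduling time
```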
We're going to focus on the delayed one, because it's the more complicated one, as I walk through the process of allocating a resource using a DRA driver. So assume an admin has come along and deployed a DRA resource driver: the centralized controller is set up, and DaemonSets are running with the kubelet plugins. The admin has also installed some resource class that enables someone to come along and create a resource claim referencing a specific resource type that the driver knows how to service. The user can then create one of these claims and create a pod that references it, which triggers the Kubernetes scheduler to see that pod and start the scheduling process. When it does that, it generates a list of potential nodes where it knows it could potentially put that pod, independent of the resource driver giving it any information at this point about where the resource itself can actually be allocated. It then creates a new API server object called a pod scheduling context, which the centralized controller is able to pick up in order to see this list of potential nodes and help trim it down to the point where an actual scheduling decision can be made. Once the centralized controller picks up this list of potential nodes, it narrows it down to the ones where it knows it can allocate the resource associated with this resource claim. It writes that narrowed-down list back to the pod scheduling context, which will then be picked up by the Kubernetes scheduler to help figure out which node we should actually schedule this pod on. 
This process could end up repeating over and over again until a node has been found where this resource is able to be allocated. But once that happens, the Kubernetes scheduler will select a node and write it back to the pod scheduling context, which will be picked up by the centralized controller, which then goes through the process of actually allocating the claim, knowing that this is the node where the resource needs to be made available. It writes that allocation back to the resource claim object, which then gets picked up by the Kubernetes scheduler, which does the pod scheduling and writes the node name back to the pod. Through the normal processes that already exist, this will then be picked up by the kubelet, which will call into the kubelet plugin associated with your driver, passing it all the information associated with that resource claim. The plugin will then generate a list of what we call CDI devices, which it passes on to the container runtime, which will then start your container with access to the resources your driver has allocated. So there's a lot of back and forth going on, but most of it is hidden from you, and as I mentioned, we have helper libraries that abstract a lot of this out, so you don't have to worry about all this coordination across all these components. Now, I said we're going to focus on delayed allocation, but it's worth looking really quickly at what immediate allocation would look like. 
So assume a user comes along and creates a resource claim referencing a resource class. As I mentioned before, the minute you create this claim, the centralized controller is triggered to make the allocation on some node in the cluster, so that at some point in the future, when the user comes along and creates a pod, the Kubernetes scheduler will use the information about where that resource claim has been allocated to make its scheduling decision, triggering that whole process on the right again. Okay, so that, in a nutshell, is a long and complicated process in many ways. But the rest of this talk is dedicated to this: knowing all that background information about how this happens, how do you actually build one of these resource drivers yourself? And this slide here, if you take nothing else away from this talk and you want to build a resource driver, is the slide to remember, because we've created an example resource driver in a repo underneath the kubernetes-sigs organization on GitHub that demonstrates all of this. It provides a fully functional DRA resource driver for a set of mock GPUs. It wraps everything in a Helm chart, so you can easily deploy it. It provides scripts for bringing up a kind cluster to test all of this in a multi-node setup, and it runs on Mac and Linux without requiring any specialized hardware. So you can clone this repo, run a couple of scripts, see the demo running, and then dig into the details yourself to tinker around with it and figure out how everything works. The README itself includes a demo with four example deployments, which you can see here, that mostly revolve around how you enable some of these sharing capabilities, given that you now have resource claims at your disposal. And like I said, I encourage you to fork this project and play around with it yourself. 
In a nutshell, what it takes to actually build your own DRA resource driver is this. You want to decide on a name for your driver; in the example so far I've been talking about gpu.nvidia.com, but in your case you would come up with some name for the driver for the resource type you're trying to advertise. You then need to decide on a communication strategy: whether you're going to use a single all-purpose CRD like I've been showing up until now, or some combination of the split-purpose communication presented earlier. Then you need to define some types to represent your allocatable resources, your allocated resources, and your prepared resources, so they can be tracked at these three different levels. You also need to define types to represent the class parameters you want for your resources and any claim parameters that define the API for accessing them. You should prepare at least one default resource class for distribution with your resource driver; in the case of GPUs, we have that one resource class called gpu.nvidia.com that you then access through a claim parameters object, which helps you narrow down to the GPU type you actually want. There's then boilerplate code you can pull into your project to register a controller with the scheduler, and boilerplate code to register your plugin with the kubelet. Then the fun part: you sit down and write the business logic for your controller and your kubelet plugin. That's what I'm going to focus on today, because this is where the particulars of your specific driver come in, where you need to implement the logic for how you make your resources available. We provide a couple of helper libraries to make this a little bit easier. The first one is the controller helper library. 
Its job is to abstract the whole communication with the scheduler, and the coordination back and forth to figure out which node you want your allocation to land on. That all happens behind the scenes. We provide a driver interface with five functions that you need to implement. As long as you implement these five functions, the rest of the library code will make sure that all of that talking to the scheduler, and making sure you eventually do the allocation at the proper point in the life cycle of the claim, is abstracted behind these function calls. I'm going to walk through each of them one by one. The first set are the GetClassParameters and GetClaimParameters functions. These are the easy ones to implement in many ways. Once you've defined your class parameters and claim parameters object types, the only thing these functions do is give you a hook at the point where, for a specific class or claim, there's a parameters reference specified inside it. This gives you the opportunity to pull the actual object associated with that reference down from the API server, and return it through the interface object that's the return type of these functions. What that does is make these objects available in all of the other calls, without you having to re-pull them from the API server every time. The second one is the UnsuitableNodes function. This is the function that sits behind the scenes during the whole back-and-forth loop with the scheduler, deciding which node a pod should actually be scheduled on to satisfy the allocation you're trying to make. 
What you basically do in the body of this function is loop through the potential nodes that you're passed, in search of available resources on those nodes, and then write the list of nodes where the resources are unavailable back into the claim allocation struct that gets passed to you. You can imagine there's a whole bunch of nodes and a whole bunch of claims associated with the pod you're trying to launch, and you just need to write the logic to figure out where these resources can actually be made available for that pod. The next call is Allocate. This is the call that gets invoked as part of the process you see on the bottom here, with delayed on the left and immediate on the right. Once the scheduler has actually selected a node and the centralized controller is triggered to make an allocation, this is the function you implement to define the logic for how that allocation actually happens. You end up writing that information back to the CRD, if you're using the centralized CRD approach for communication. Associated with that is the return type here of an allocation result, which has a field inside it called a resource handle: some set of opaque data attached to the claim that can be passed back to the kubelet for arbitrary interpretation. You can think of it as basically just a long string that Kubernetes doesn't know anything about; it's a way for your controller to communicate with your kubelet plugin, which can interpret this data however it wants. Then the last one is the Deallocate call. Once the claim itself gets deleted, Deallocate will be triggered and you can clean up anything you've done in your Allocate call. That was all on the controller side. Now on the kubelet plugin side, there are basically two components that you need to be aware of. 
One is a helper library which lets you set up registration and actually get your kubelet plugin talking to the kubelet to begin with. Then there is the kubelet plugin API, which defines two calls: NodePrepareResource and NodeUnprepareResource. Under the hood, they're fairly straightforward. When a NodePrepareResource request comes in, you have all of the information associated with the claim that you're supposed to prepare the resource for, and at the end of the day you need to pass back a set of CDI devices. I keep throwing this term CDI devices around without really defining it. It's a bit out of scope to talk through what CDI devices are, but CDI, the Container Device Interface, is a CNCF-sponsored project for standardizing how devices can be made available to containers. We leverage it in this DRA project, and it becomes the foundational piece for how you actually make these resources available at the end of the day. I encourage you to look up CDI to learn more, but it's also built into this example driver, so if you just go into the code and look at it, it should be pretty self-explanatory how it works from there. Then in the last call, the NodeUnprepareResource request, you get a similar struct passed to you with the claim information, which you can use to undo any preparation that you've made via the prepare call. With that, I'm going to hand it over to Alexey, who's going to talk about some of the new and upcoming features of DRA resource drivers in general. Thanks, Kevin. Hi all. What you have just heard has been available since the last release, 1.26, and in the release that we got last week, 1.27, we have a new improvement to dynamic resource allocation. In particular, it is now possible to create custom resource drivers and controllers that allocate several kinds of devices, or maybe similar kinds of devices, at the same time. 
In this case the communication, or the bookkeeping of what resources and how many resources have already been allocated, has to be done somehow publicly, so that both of the controllers that allocate the resources can access this data. Consider that the native or original resource driver allocates part of the resources: then the custom resource controller has to know about that decision. And the other way around: if the custom controller that is supplementing the original resource driver's controller makes the decision and allocates a bunch of hardware, then the original controller has to know about that decision, so that there is no overbooking or false decision. There are at least two use cases for these scenarios. One is when you have different kinds of hardware that somehow have to be aligned, for example with NUMA locality or otherwise. Then the controller can consider resources of different types, for example a network interface and a GPU adapter, make the decision that these are the pieces of hardware that have to be allocated together, and this information will later on be passed to the different kubelet plugins. Another use case is when you have similar kinds of devices, and the workload doesn't care what kind or what vendor of hardware it gets: it's vendor agnostic, it can deal with any kind of hardware. Then there can be a custom controller that manages all of the vendors' devices in the cluster in this way. 
So in this case the allocation would begin with a different resource class. Naturally, since we have a custom resource driver, we have a resource class that directs the resource claim to be forwarded to this new custom controller for allocation. When doing the allocation, the custom resource driver controller will put several resource handle entries in the allocation result that is written to the resource claim, and each of these resource handle entries will contain a driver name that will be used by the kubelet to ask the right kubelet plugin to prepare the actual device when the pod lands on the node. Now, another feature that will hopefully be coming in 1.28 is allowing several resource claims that are used by the same pod and targeted at the same resource driver to all be allocated at the same time. At the moment, the UnsuitableNodes call, the one that narrows down the set of nodes the main scheduler can use to pick which node the pod should be scheduled on, receives all of the resource claims for a particular resource driver at once, so the resource driver that you are building can consider the whole amount of resources that will be used by the pod. But when the Allocate call comes to the resource driver, the resource claims are currently passed one by one. So that's a different kind of consideration of resources by the resource driver, and it can lead to small but nasty problems. For example, if you have two pods that were deployed simultaneously and they have multiple resource claims, UnsuitableNodes will report to the main scheduler that node A is suitable for both of these pods, but when the actual Allocate call sequence starts for every single resource claim, at some point there might be resource depletion, and the scheduling will have to start all over again. So the natural solution for this is to make the Allocate call look 
the same as the UnsuitableNodes call: we plan to pass the whole set of resource claims for a particular pod, so that the allocation call has a chance to see the whole amount of resources that need to be allocated for that pod.

The same problem applies to the kubelet plugin, especially if the hardware can only handle one configuration entry at a time, for example virtual functions on some SR-IOV capable hardware: once several virtual functions have been provisioned, subsequent VFs can only be provisioned after the previously created ones are dismantled. So if your pod has requested two standalone virtual functions, the UnsuitableNodes call will report that this GPU on this node seems suitable, and then allocation and preparation will actually face the situation that the first VF can be created on the SR-IOV capable device, but the second one cannot anymore because of the hardware limitations. In the same way, if we pass all of the resource claims at the same time to NodePrepareResource and NodeUnprepareResource, it's much easier for the resource driver to bookkeep and handle the resources. So this is what it is now, and this is how we think it's going to be in 1.28; this work is ongoing. Thank you. I think we have three minutes for questions.

Yeah, so, sorry, the question was: is the centralized controller a scheduler plugin? No, it's not; it's its own entity. I mean, you could think of it as a scheduler plugin, but it's not part of the scheduler plugin framework that exists in Kubernetes. It exists as part of the DRA framework, which knows how to communicate with the scheduler via this PodSchedulingContext object that I talked about in the slides earlier. There is a scheduler plugin, but it's associated with the DRA framework in general; it's not something that you as the resource driver developer need to worry about implementing. Yeah, and we had a couple of demos
that we can show people who are interested after this talk, but we ran out of time, so we're not going to be able to show them. If you're interested, please come talk to us after this and we can go through those with you. Are there any other questions?

So the question: if I have AI ASIC chips, is it possible to do topology-aware scheduling now with the DRA support, or is that something you're looking at for the future? Yeah, so currently we haven't thought in detail about exactly how we're going to solve the topology-aware problem. Our default answer at the moment is that if you want to do some sort of alignment, you build a custom controller that knows how to do the alignment for all the different resources that need to be aligned. And because we have this level of indirection, where you can keep the default drivers for all of your device types and just write your custom controller without having to rewrite the kubelet plugin logic, you can do this for your custom environments where you know exactly which resources you want to align: you just write that allocation logic, rather than all of the prepare and unprepare logic as well. Okay, thank you.

In addition, there is another technology that was just released, called the Node Resource Interface (NRI), that can help you align what you want, but it's outside of the DRA framework and outside of the resource driver, so it's something the resource driver can leverage. That's another couple of API calls, and strictly speaking it's best effort. If you're interested in this, the main author and maintainer of the DRA framework, Patrick, is going to be at the Intel booth, P13, and we also have people there who are involved with node resource management and NRI; come talk to us.

One question: do you have a recipe for how this can work together with legacy device plugins? What do you mean by work, or what
do you mean by work together? So at least our plan for NVIDIA GPUs with our device plugin is that we're going to let you decide, on a node-by-node basis, whether you want to deploy the existing device plugin or the DRA kubelet plugin. So there won't be a mix and match of being able to allocate via both methods on each node, but across your cluster you can have some DRA-enabled nodes and some nodes that are enabled by the existing device plugin, so that end users can slowly migrate their applications to the DRA style of asking for resources. Yeah, and there are technical reasons why we can't really run them together on the same node, mostly because the device manager within the kubelet is the one that does device allocation for the existing device plugin framework; the plugins themselves don't do allocation, they just advertise resources. So we can't coordinate that allocation between our DRA resource driver and the existing device plugin allocation, because we'd basically have to change the kubelet to do that. But otherwise it's not exclusive: it's not like if you're using dynamic resource allocation, devices of a different kind cannot be used in the cluster. They can live inside the same cluster, with some of the hardware managed by device plugins and some by DRA, as long as they don't conflict.

So do we need to patch our kubelet to at least 1.28 to use dynamic resource allocation with NVIDIA GPUs, or can we use the new mechanism before that? Yeah, I mean, officially you shouldn't be using it until DRA reaches beta, which is going to be at least another couple of releases. But if you want to start prototyping and using it now, 1.27 is enough. There are some feature flags you'll need to enable when you start up your 1.27 cluster, because DRA isn't on by default as an alpha feature, but then you should be able to follow
the instructions in our NVIDIA DRA resource driver for GPUs, at the link that's here, in order to start playing around with it. Thank you.

I think this will be the last question: can this be used in conjunction with CNI to, for example, allocate a VF to a pod from a DPU? Yeah, so we have a separate project going on inside NVIDIA doing exactly what you just said. It's a separate resource driver from our GPU resource driver, but it's allocating VFs for RDMA on DPUs. Thank you. Yep, thank you, everyone.