Hope I got that right. So, welcome to our session about enabling GPU virtualization in OpenStack. I'm Howard Huang, from Huawei. I'm currently a standards engineer and open-source community manager. And this is Lei Zhang.

Okay. Yeah, my name is Lei Zhang. I'm a cloud software engineer at Intel, and I'm also a core reviewer on the Searchlight project. Below are Feng Shaohe and Chen Yingxin. They are my colleagues at Intel and they contributed a lot to this work, but unfortunately they cannot come, so I'll represent them in giving the talk.

Okay, so the two of us will give this joint presentation on this topic. First of all, motivation: why are we talking about GPUs and OpenStack? Well, I think we are using GPUs today in ways that were unfathomable back in the days when we considered the CPU the only main compute resource. Take autonomous driving, for example: we have Tesla, Google, Uber, and many other companies working on that, and also George Hotz working on his own version at comma.ai, which I think is an open-source self-driving project. To enable that, many solutions use the GPU to accelerate both the computer vision for the car and the deep learning algorithms used in the self-driving system.

Another example is video streaming. We are living in an age where video streaming is everywhere, right? Beyond six-second Vines and other short videos, we are now used to streaming a movie or Netflix TV shows online, and the GPU is vital for watching high-quality online video in, how should I put it, a tolerable way.

The third example is cloud gaming. Many young people now enjoy cloud gaming, on Steam I think, or on other platforms. It is becoming more and more popular for online gaming to be hosted on cloud infrastructure, and the GPU is also the key for that to happen.

Okay, so now we all know the GPU is important and we all know we use it more and more across various industries. But what about GPU virtualization? Do we really need that? We believe the virtualization of GPUs will also become more and more important. For example, if you offer VDI on the cloud, or media transcoding, or the cloud gaming I just mentioned, you need multiple tenants actually sharing the GPU resources. We think there are three aspects to take into consideration when talking about GPU virtualization. First, performance: we need direct acceleration to make sure the performance is acceptable. Second, capability: for example, if we are streaming videos, we need a consistent visual experience. And the last one is sharing: the multiple virtual machines on the cloud should be able to share the GPU resources in a multi-tenant way.

Okay, and last but not least, why OpenStack, right? Since we are at the OpenStack Summit.
I think this is the easiest one to understand. OpenStack just celebrated its 14th release, Newton, and OpenStack has become the de facto IaaS standard, if you will. The community is booming, and if any of you works within the community, you know it is really a great experience to be an OpenStack developer, to discuss and share ideas within a project or across projects. And now with Newton we have more and more features enabled, and a growing set of official projects, or core projects, in OpenStack. So OpenStack will be the ideal platform on which to enable GPU virtualization to support cloud computing. Okay, so my colleague from Intel will take over this part.

Yeah, I will give an introduction to Intel GPU virtualization. Here is our overview. There are currently three ways to do GPU virtualization.

The first one is called API forwarding. The VM issues DirectX and OpenGL API calls, the hypervisor forwards these API calls to a virtual graphics driver, and the graphics driver then forwards them to the physical GPU. This is not a very efficient way to do things, because most of the work relies on the software layer, and it cannot expose the full feature set of the APIs to the VM. But from the sharing point of view, it can share a GPU among many VMs.

The second way is called direct pass-through. Here the physical GPU is attached to a single VM's graphics driver. This gives very good performance, because the VM is using the whole physical GPU and can use all the features the GPU provides. But this is not sharing, because, as you can see, the physical GPU is mapped to a single VM.

The third way, which is what we are working on, is called full GPU virtualization. A physical GPU can be split into multiple virtual GPUs, and each vGPU is assigned to a VM. This way you get near-native performance, you can use almost the full feature set of the GPU, and you can share it among several VMs. One physical GPU can support up to 15 virtual GPUs.

Here are the benefits of our technology. Performance-wise, we get more than 80 percent of native 3D performance, more than 70 percent of native 2D performance, more than 90 percent of native media decode performance, and more than 80 percent of native media encode performance. We also support many features: you can run native drivers inside a VM, and the supported graphics and media APIs include DirectX, OpenGL, OpenCL, Media SDK, and DirectX 12. Regarding sharing, as I said, you can attach one physical GPU to a maximum of 15 virtual GPUs, and each virtual GPU can be attached to a VM. We also support the main guest OSes, like Ubuntu, Windows 7, Windows 8, and Windows 10.

This is the status of our implementation, GVT-g. To use this technology you need at least a Xeon E5 v4 platform. We have done the work on KVM, i.e. KVMGT; the features are already finished, and we are going to upstream them soon.
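To make the "split into multiple virtual GPUs" step concrete, below is a minimal sketch of how a vGPU instance can be created on the host through the Linux mediated-device (mdev) sysfs interface that upstream KVMGT adopted; the exact interface was still being upstreamed at the time of this talk, so treat this as illustrative. The PCI address and vGPU type name are placeholders that depend on your platform, and writing to sysfs requires root.

```python
# Sketch: carving a vGPU out of the integrated GPU via the Linux
# mediated-device (mdev) sysfs interface used by upstream KVMGT.
# The PCI address and type name below are placeholders; check your host.
import uuid
from pathlib import Path

GPU = Path("/sys/bus/pci/devices/0000:00:02.0")   # integrated Intel GPU (example)
VGPU_TYPE = "i915-GVTg_V5_4"                      # example entry under mdev_supported_types/

def list_vgpu_types(gpu=GPU):
    """Return the vGPU profiles this physical GPU can be split into."""
    return [p.name for p in (gpu / "mdev_supported_types").iterdir()]

def create_vgpu(vgpu_type=VGPU_TYPE, gpu=GPU):
    """Instantiate one vGPU; the UUID is then handed to QEMU as a vfio-mdev device."""
    vgpu_uuid = str(uuid.uuid4())
    (gpu / "mdev_supported_types" / vgpu_type / "create").write_text(vgpu_uuid)
    return vgpu_uuid

if __name__ == "__main__":
    print(list_vgpu_types())
    print(create_vgpu())
```

Each UUID created this way corresponds to one of the up-to-15 vGPUs mentioned above, and the hypervisor mediates the guests' access to the shared hardware behind them.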
Next, here is what we are going to do in the future. We want to make these features more cloud-friendly and move them into OpenStack. First, we want to support live migration for virtual GPU devices, so that if a VM does a live migration, it is guaranteed the same GPU performance after the migration. Second, QoS support in cloud environments: we want to make sure that a virtual GPU is not affected by other VMs. Okay, thank you.

So, as teams from Intel and Huawei, we are also looking into how to make OpenStack support these GPU virtualization features. Naturally, the first option we considered is to have Nova support the KVMGT feature. I think this is the natural method that comes to mind: basically we have libvirt call into the vGPU and treat it as a special PCI device. But there are some shortcomings. If you treat the vGPU as a PCI device, you could have Nova manage it, but vGPUs are not really PCI devices, and many of the properties we usually get from PCI devices might not satisfy what we need from a vGPU. Also, before Mitaka it was very complicated to add a new type of resource to Nova: you had to consider a lot of things, modify a lot of things, and get the project team okay with that. But starting with Mitaka, Nova has been developing a new feature called resource providers, which basically makes life easier for new resource types like vGPUs to be added to Nova. The resource provider work started in Mitaka, and Newton has the follow-on blueprints implemented, but we believe it is still at an early stage. Still, we believe this is a very good direction for Nova.

Another option we are considering is to have a dedicated OpenStack service for managing these accelerators, whether GPUs, FPGAs, IPSec cards, and so forth. The proposal is that we have an individual, standalone OpenStack acceleration service that the Nova scheduler can talk to. The acceleration service actually makes the scheduling decisions for the GPU resources, feeds them back to the Nova scheduler, and the Nova scheduler then tells Nova compute how to provision the resources on the vGPU. So this is another option we are looking at; we temporarily call this new acceleration service Nomad, and I will discuss it a little further later. One of the advantages of this option is that the virtual GPU, or later the FPGA, can be treated as a dynamic resource. Since it has dedicated management, you can have different flavors for different scenarios, say web serving, VDI, or media transcoding; the configuration is very flexible, and you can configure the most suitable scheduling algorithms for your GPU resources.
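As a rough illustration of this second option, here is a sketch of how the Nova scheduler might delegate accelerator placement to the standalone service. Everything here is hypothetical: the endpoint, URL path, and JSON shape are invented for illustration, since no such API had been defined at the time of the talk.

```python
# Sketch of the "option 2" flow: the Nova scheduler consults a standalone
# acceleration service (here called Nomad) before placing a VM.
# The endpoint, path, and JSON fields are hypothetical placeholders.
import json
import urllib.request

NOMAD_URL = "http://nomad.example:9977/v1"  # hypothetical service endpoint

def filter_hosts(candidate_hosts, accel_request):
    """Ask the acceleration service which candidate hosts can satisfy the request.

    accel_request might look like {"type": "vgpu", "profile": "media", "count": 1}.
    The service applies its own (possibly vendor-specific) scheduling algorithm
    and returns the subset of hosts with a free vGPU, so Nova only has to pick
    among hosts the acceleration service has already vetted.
    """
    body = json.dumps({"hosts": candidate_hosts,
                       "request": accel_request}).encode()
    req = urllib.request.Request(NOMAD_URL + "/schedule", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["hosts"]
```

The design point is the division of labour: Nova keeps the final placement decision and the provisioning workflow, while the accelerator-specific knowledge lives in the dedicated service.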
Regarding the Nomad project: we actually had a BoF session back at the Austin Summit about it. The Nomad project was born out of requirements coming from the telco industry. When we were doing standards work in ETSI, the telco standards organization, regarding accelerators, we found that many customers demand that different kinds of accelerators be able to form a standalone resource pool and be treated as equal to the compute, storage, and network resources. That is where the requirements came from, and so we established this project from the ground up, from zero, this year. We temporarily codenamed it Nomad; before the HashiCorp guys beat me up, we will change it later.

Okay, so the goal of this new Nomad module is to provide a unified management framework for different kinds of accelerators. For example, as you can see on the slide: crypto/IPSec, which has important usages in NFV; FPGAs, where Intel, IBM, and most of the important vendors have solutions; GPUs, which we just covered; NVMe SSDs, which many customers use as dedicated acceleration for, for example, latency-sensitive services; and also intelligent NICs, maybe from Broadcom or other vendors.

I know that when we start talking about offering a unified interface for such a set of resources, it will definitely be difficult, because those accelerators behave very differently from each other. So what we want to achieve with Nomad is a higher level of abstraction that provides a common entrance, if you like, for end users to schedule the acceleration resources for their workloads, while vendors can define their own private solutions, with their own specific algorithms, for how to best schedule the resources. Nomad just provides the common framework to enable that.

We have had some scattered discussions within the project. For now, we believe we will generally be dealing with two types of acceleration resources. One is embedded, or sits very close to the CPU; in that case I believe we are still counting on Nova to provide the core scheduling decisions. The second scenario we are trying to deal with is the standalone one: for example, a remotely connected FPGA pool, or an array of GPU cards. That scenario would be a typical, or ideal, scenario for Nomad to provide this service.

Okay, so, future work, as Lei has already illustrated. On the GPU side there will be further support for the Xen option, and Citrix already has a patch for that. We are also looking forward to a Nomad implementation that covers GPU resources, and to generic solutions for graphics virtualization. We have some resource links: for example, you can get the source code of KVMGT from here, and the other enhancements are coming up as well. We also have a design summit work session for the Nomad project on Friday morning, starting at 10:50 in room 130. You are more than welcome to participate in the session; you can search for it in the schedule. We have an Etherpad too, so feel free to provide your input there.

Okay, thank you very much. Questions?

Could you use, sorry, could you use the mic? They're waving at me.

Hi. So when you're sharing a GPU among multiple virtual machines, obviously you're time-slicing the compute part of it, but the other resource that's scarce on a GPU is memory. So do you time-slice availability of memory, or do you partition the memory between the virtual machines?

You mean how we partition the memory, right?

Yeah, how do you make the memory on the GPU available to the client VMs?

The physical GPU is turned into virtualized GPUs, and each vGPU is attached to a VM, so it is left to the hypervisor layer to handle this.

So it's like conventional memory allocation: it's split into sections, and each VM gets a subsection of the memory. Yeah, okay. Thanks.
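(To illustrate the answer: unlike execution time, which is multiplexed across vGPUs, the graphics memory is carved up once and each vGPU keeps a fixed slice. The toy sketch below shows that static partitioning; the sizes are made up and nothing here reflects the actual GVT-g allocator.)

```python
# Toy illustration of static partitioning: graphics memory is split up
# front and each vGPU owns a fixed, non-overlapping slice of it.
def partition(total_mb, num_vgpus):
    """Split total graphics memory evenly into per-vGPU apertures."""
    slice_mb = total_mb // num_vgpus
    return [{"vgpu": i, "offset_mb": i * slice_mb, "size_mb": slice_mb}
            for i in range(num_vgpus)]

print(partition(4096, 4))
# [{'vgpu': 0, 'offset_mb': 0, 'size_mb': 1024}, ...]
```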
While he's walking over, may I ask my question? Thank you. Thanks a lot for the talk. I saw that the number of guest OSes, or the types of guest OSes, that you support is limited to Ubuntu and some Windows versions. What is actually limiting you there? For instance, I'm specifically thinking about CentOS. Is it the kernel version that they come with, or...?

No. What we listed is what we have already supported, but in the future we will support more guest OSes.

Okay. Do you plan in the future to support other vGPU vendors, like Nvidia or AMD?

Sure. From Nomad's perspective, we want to support all kinds of GPUs, whether from Nvidia or AMD. Yeah.

Thank you.

Okay, if there are no further questions, then thank you very much for attending this session.