Welcome everyone to the NVIDIA and Red Hat Partners in AI, ML, and Other GPU-Enabled Deployments session. My name is Andre Beausole, and I'm a senior principal partner manager with Red Hat. Co-presenting with me today is Duncan Poole. I'll let Duncan introduce himself; he will host the first few slides and then I will take over. Thank you.

Thank you, Andre. My name is Duncan Poole. I work for NVIDIA. In dog years I'd be dead by now, but I'm the director of partner alliances for NVIDIA, and one of my principal partners is Red Hat. The purpose of today's presentation is really to showcase all the different places in which NVIDIA and Red Hat are collaborating. NVIDIA is probably not so well known for activity in open source, and yet if you think about it, it's a core requirement for us to go to market together in many areas. So this is opening the kimono a little bit and letting you have a feeling for what that's all about. I'm going to try to go through the presentation at a relatively quick clip to leave time for questions near the end, and if we do run out of time, we can adjourn and find more time afterwards.

Okay, so with that said, let's get on with the presentation. The agenda is basically for me to outline the goals of the partnership, talk about what NVIDIA brings to bear in its tools and ecosystem, talk a little bit about our container efforts and container strategy and the open source projects that we support and work on with Red Hat, and then Andre will pick up and talk about some very specific components that Red Hat has been putting together. That's how we divide up our time.

The vision is really to treat RHEL as a first-class citizen for GPU deployments. GPUs are getting used in hybrid clusters, in HPC and AI, and in many commercial environments now, and it's important for us to go to market with RHEL because it's the classic distribution for the commercial marketplace. For us that means, among other things, simplifying the install process, because if you went back about five years and tried to install our stack on RHEL, you would probably chew through a few different versions before you figured out the magic incantation that let it install. Back in the day you did this using a .run file, and now NVIDIA has built RPM repos for our offerings to make it a little easier. Together, for each card we release, each RHEL release, and each CUDA release, we do a synchronized OS qualification. So you can imagine this is quite the cadence of meetings, and there's still a lot of room for improvement.

Beyond that, in the container space, where you really want to wrap up an application and deploy it in a commercial environment, it's been important to get more aligned with Red Hat on how we do that, so we'll talk about that. We also have a specific open source collaboration going on around heterogeneous memory management, and I'd like to tell you a little bit about that, because it's changing the way malloc works so that it plays well with devices that manage their own memory, like NVIDIA's GPUs. And there are a few other open source projects as well.

So what's new? From NVIDIA, what's new is that if you go to download our driver and CUDA, you can now use the RPM repo. We also have several new host compiler options within our free CUDA downloads; LLVM is directly supported. If you've never used CUDA — and I'm going to guess: has anyone used CUDA here? Okay — CUDA is the programming model and compiler that let you express a program in a syntax that blows the parallel work out onto a GPU. To give you a feeling for what I mean by parallel: keeping a single Volta card busy probably takes 150,000 threads of computational activity, which are scheduled by the hardware when you launch a kernel. And kernels can launch kernels. In the newer systems, where a fabric called NVLink joins the GPUs together, you can have 16 of these launching parallel activity out. The programming model for all of that is CUDA, and we provide a bunch of tools, debuggers, and profilers to make it work. There are host compiler options because, obviously, not everything runs on the GPU; some people might want the Intel compilers, some might want GCC, or you might want LLVM, which is a more recent addition. So you get all those options.
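To make that programming model concrete, here is a minimal sketch of a CUDA kernel launch — not from the talk, just an illustrative example with made-up names and sizes:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread handles one element; the hardware schedules all of them
// when the kernel is launched.
__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main() {
    const int n = 1 << 20;  // ~1M elements, launched as ~1M threads
    float *x;
    cudaMallocManaged(&x, n * sizeof(float));  // memory visible to CPU and GPU
    for (int i = 0; i < n; ++i) x[i] = 1.0f;

    // Launch enough 256-thread blocks to cover all n elements.
    scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);
    cudaDeviceSynchronize();

    printf("x[0] = %f\n", x[0]);  // prints 2.0 if the kernel ran
    cudaFree(x);
    return 0;
}
```

When a file like this is built with nvcc, the host compiler option mentioned above (GCC, Intel, or LLVM) is what compiles the CPU-side portion.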
To go on, we're going to talk about Kubernetes on GPUs. The trick here is that in a containerized environment you want to teach Kubernetes how many GPUs you have. Are they busy or not? Are they near a CPU? CPU affinity becomes important. All of that is a contribution that NVIDIA made, the NVIDIA device plugin for Kubernetes. Then we're going to have Andre talk about the KVM improvements that we've been doing together in GRID, and a little bit more about open source collaboration.

If you want to look at the breadth of what NVIDIA offers in this market space, we have examples here of some very high-performance computational chemistry: very long-running apps that are well tuned for GPUs. These are all now containerized, so you can go out and launch those apps in a container without having to think so hard about compiling them up, what the library dependencies are, and so on to make them run. We also have close relationships with all the various framework developers, and if you're interested in the details, we have an AI learning environment online, so you can pick up a framework and run your own self-paced learning. The frameworks are great, and you can quickly become something of an AI expert with them.

But if you want to go down the path of traditional programming, we provide libraries that are already ported to GPUs. If you're familiar with LAPACK or any of the standard math-heavy libraries, they're directly available; you just call them from your CPU code, and you don't have to write any GPU code at all. Or you can dig down and use one of a series of standard interfaces like Python or Thrust, which are also all GPU-aware. And finally, we have our own compilers, a free Fortran and a free C compiler, that work on the CPU but also on the GPU, so you can blow your code out to it directly.
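As a sketch of the "just call the library from your CPU code" point, here is a hedged example using cuBLAS, NVIDIA's GPU BLAS library; the vector and scaling factor are made up, and error checking is omitted for brevity:

```cuda
#include <cstdio>
#include <cublas_v2.h>
#include <cuda_runtime.h>

int main() {
    const int n = 4;
    float hx[n] = {1.0f, 2.0f, 3.0f, 4.0f};

    // Stage the vector on the GPU.
    float *dx;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    // The BLAS-style scal routine runs on the GPU; we never write a kernel.
    const float alpha = 2.0f;
    cublasSscal(handle, n, &alpha, dx, 1);

    cudaMemcpy(hx, dx, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("hx[3] = %f\n", hx[3]);  // prints 8.0

    cublasDestroy(handle);
    cudaFree(dx);
    return 0;
}
```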
So with Red Hat, on RHEL, we now provide a faster release cadence for CUDA, in part because we have the RPM distribution model working now. We're going to put out four releases of CUDA per year, and as I was saying, the math libraries, especially around AI, are moving even faster, so you can pick up the improvements for each new NVIDIA GPU directly, every month. And because that sounds like a nightmare for a developer, this is where the whole containerization framework comes in handy: if you need to freeze on one version, you can, and use that; or you can follow us going forward as fast as we're going. And it's simple. You can see that basically you enable the repo, then do a yum clean, and finally a yum install cuda, and away you go.

More broadly, this slide is designed to show you a picture of the ecosystem of tools that NVIDIA provides as freeware. You see memory checkers, visual debuggers, visual profilers, a series of libraries of various useful kinds for AI and for math, and finally your standard languages.

On the Kubernetes alignment front, we recently had our own developer conference where Jensen Huang, our CEO, demonstrated a containerized app running on the show floor and then failing over to run on the Amazon cloud. So the fault-tolerant aspect of our containerization strategy is starting to come into play; it shows the robustness feature here. And if you register as a developer with NVIDIA, you can download our pre-built containers that include all these various frameworks, already tuned up to run on our devices. That's the whole point of it. The benefits of containers are obviously a stable environment for install dependencies and not having to resolve between the developer and the user what they're actually running on, so it just simplifies the whole process. And as any one of us probably knows, getting install dependencies right on Linux is sometimes a bit of a chase-your-tail game.

OK, so for open source development — and this is one more area where I don't think NVIDIA is normally thought of — we work directly with the Red Hat maintainer of Nouveau, Ben Skeggs. For every single graphics card that we put out, we give him one and help him make sure that when we release, he's already qualified it with Nouveau. We try to do that when we come out with new devices, but also when Red Hat ships new versions of their OS.

In the parallel language space, there's a library called libgomp; GOMP stands for GNU OpenMP. That library is actually a common runtime for people who want to use directives, which are basically annotations around a block of code saying, run this in parallel. OpenMP is the classic example of this — I've been doing it for years — and OpenACC is another one. And they're actually implementing OpenACC in GCC, in GOMP. So there's this fun little cooperation going on between the OpenMP maintainers, the OpenACC maintainers, and Jakub Jelinek at Red Hat helping orchestrate all of it.
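To show what a directive looks like, here is a minimal sketch of an OpenACC-annotated loop; the loop itself is illustrative, and it assumes a compiler with OpenACC support (for example, GCC built with -fopenacc, which is where libgomp comes in as the runtime):

```cuda
#include <cstdio>

#define N (1 << 20)
static float x[N], y[N];

int main() {
    for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // The directive is just an annotation on the loop; the compiler and the
    // GOMP runtime decide how to run the iterations in parallel, possibly
    // offloading them to a GPU.
    #pragma acc parallel loop
    for (int i = 0; i < N; ++i)
        y[i] += 2.0f * x[i];

    printf("y[0] = %f\n", y[0]);  // prints 4.0
    return 0;
}
```

An OpenMP version would differ only in the pragma (#pragma omp parallel for), which is why a shared runtime like libgomp makes sense for both.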
For HMM, this memory manager I was talking about: the key developer is actually at Red Hat, and he's been working with us since about 2012. So it's a very long-term project, just getting upstream now, and I'll talk about it more specifically in a slide or so. And then finally, vGPU and the support for multiple virtual GPUs for graphics and commercial workloads is another big effort, and Andre will talk about that.

So what is heterogeneous memory management? It's the ability for malloc to work with memory that sits on a card, and for you to reference or dereference a pointer whether the data is on the card or on the host machine. Think of the underlying paging subsystem in Linux: if the data is on the card and you're running as a process on the host, it'll just page-fault across, and you'll run against it on the host, or the other way around. If you think about it, half the problem with the GPU is resolving the data dependencies and where they live, and when you have this feature available in malloc, it's a huge developer simplification.

Now, you might sit there and say, yeah, but wouldn't you get thrashing as you reference these things from either side? That's true. But the truth of the matter is that developers understand this, and in general people build their working sets so they live in one place or the other. All of the tests we've been doing suggest that, by and large, it's a big win and only a small loss in most circumstances. Another feature that lives here on the GPU side, not shown on the slide, is that those GPUs can be connected via NVLink, that very fast network of up to 16 GPUs, and all of this capability runs across all of them. Each GPU can have 32 gigabytes of stacked memory. Memory consistency is supported, locking is supported, and it just works. So this is a new feature that we're putting together right now.
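As a hedged sketch of what that buys a developer: with HMM in place, ordinary malloc'd memory becomes directly dereferenceable from a GPU kernel, with the paging subsystem migrating pages on fault. This assumes a kernel, driver, and device with HMM-style pageable memory support; without it you would fall back to cudaMallocManaged:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void increment(int *p, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] += 1;  // dereferences ordinary malloc'd host memory
}

int main() {
    const int n = 1024;
    // Plain malloc; with HMM, the GPU page-faults these pages across on demand.
    int *p = (int *)malloc(n * sizeof(int));
    for (int i = 0; i < n; ++i) p[i] = i;

    increment<<<(n + 255) / 256, 256>>>(p, n);  // no cudaMalloc, no cudaMemcpy
    cudaDeviceSynchronize();

    printf("p[1] = %d\n", p[1]);  // prints 2 if the fault-driven migration worked
    free(p);
    return 0;
}
```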
And with that, I'll pass it over to Andre.

Thank you, Duncan. OK, we have just a couple more slides to go through, and I'd like to talk a little bit more about the partnership we have with NVIDIA around GRID — their vGPU offering — and Red Hat Virtualization. A couple of things are going on here. Back in, I think it was 2015, we started working with NVIDIA on providing the enablement for vGPUs, and that was quite a project. It required us promoting an upstream framework called mediated devices. Mediated devices are the key enabler that lets you set up vGPUs: in other words, take an NVIDIA Tesla Maxwell GPU and divide it up, similar to what you do when virtualizing CPUs. We were able to get that upstream and accepted in 2016, and afterwards we worked on getting that support into RHV, Red Hat Virtualization, as well as into Red Hat Enterprise Linux. Those were the upstream components. NVIDIA actually did most of the open source development for mediated devices; we worked with them on testing it and then helped promote it and get it accepted upstream.

OK, just to give you an idea of some of the open and closed source aspects of vGPU — again, the GRID support — if you notice, we actually install the GRID software on the KVM host, and then, with mediated devices, we need to install the drivers in each of the VM guests. That's critical. Again, with vGPU, each guest views the GPU as if it were a dedicated GPU, so there's no need to worry about additional management; it's all transparently handled by the mediated device (mdev) enablement in Linux.

Another aspect of our partnership is that last year we were able to collaborate with NVIDIA and with HPE on a benchmark, the STAC-A2 benchmark. STAC, for those who are not familiar, is a securities-industry consortium that tends to run very CPU- and memory-intensive benchmarks, which lend themselves well to the HPC market. With our collaboration, we had a configuration of NVIDIA V100s — their Volta GPUs, at the time the fastest GPUs available — eight of them in an HPE ProLiant server running RHEL, Red Hat Enterprise Linux. We were able to break a number of benchmark records, both for throughput and for energy efficiency. For those interested in the specifics of this benchmark, we have a couple of blogs available where you can get more details on the configuration. What this speaks to, essentially, is that the partnership leverages two things: our ability to provide enterprise-resilient support through RHEL, and how well we have integrated with NVIDIA's GPUs as well as with CUDA.

OK, something new that we announced around the time of the NVIDIA GTC conference, just over a month ago, is the availability of the GPU device plugin. This is something we worked on with the Kubernetes resource management team to provide support for managing GPU workloads in a containerized environment. The feature is supported in OpenShift 3.9 as a technology preview, which means it's not production supported, but if you want to deploy the device plugin in your lab environment and run a POC, you're able to do it, and we have specific information and guidelines on how to go about it. You can go to this blog; it'll tell you how to configure the GPU plugin with OpenShift 3.9 and show you how to manage it. So a lot of detail is available in that blog.

OK, the other thing that is important to our mutual customers is ensuring that we stay ahead of security vulnerabilities. To this point, everyone here is probably familiar with Spectre and Meltdown, which we all had to deal with at the beginning of the year. With NVIDIA, we were able to provide some of the patches to them so that they could test and validate that we had mitigated any exposure resulting from Spectre and Meltdown. So that's a value of the partnership: if you, as a customer, are running both Red Hat and NVIDIA technologies, you have some level of assurance that we are working to keep up with the most current CVEs.

All right, last but not least, I'd like to mention our demo. We're going to have NVIDIA in our booth, booth number 725 — that's the IoT and AI partner ecosystem booth, just on the other side. They'll be running demos, and we'll also have representatives from our product management team, as well as technical staff, to address any questions; or if you want to have any side meetings, we'll be available there as well. I'd like to thank everyone for taking time out to hear about the NVIDIA and Red Hat partnership.