Live from San Diego, California, it's theCUBE. Covering KubeCon and CloudNativeCon, brought to you by Red Hat, the Cloud Native Computing Foundation and its ecosystem partners.

Welcome back to theCUBE here at KubeCon, CloudNativeCon 2019 in San Diego, California. I'm Stu Miniman. My co-host is John Troyer. And first of all, happy to welcome back to the program Diane Mueller, who is the tech lead of Cloud Native Technology. I'm sorry, I'm getting that wrong, she's director of community development at Red Hat, because Renaud Gaubert is the technical lead of Cloud Native Technologies at NVIDIA. We're getting to the end of day one, I've got three days, so I've got to keep all these things straight.

You've got to keep a little more Red Bull in the conversation.

All right, well, there's definitely a lot of energy. Most people here don't even need Red Bull, because we are at day one, but Diane, we're going to start at day zero.

Oh my, yeah.

So, you know, you've got a good community of geeks when they're like, oh yeah, let me fly in a day early and do a half day or full day of deep dives. So the Red Hat team decided to bring everybody on a boat, I guess.

Yeah, so the OpenShift Commons gathering for this KubeCon, we hosted on the Inspiration Hornblower. We had about 560 people on a boat. I promised them that it wouldn't leave the dock, but we still had a little bit of that wake going on every time one of the big military boats came by, so people were feeling it a little by the end of the day. But from eight a.m. till eight p.m., we just gathered and had some amazing deep dives. There were unbelievable conversations onstage and offstage, and we had a wonderful conversation with some of the new DevOps folks that have just come on board, which is a fitting metaphor for KubeCon and for our event. Andrew Clay Shafer, John Willis, the inimitable Chris Pinella who runs Open Innovation Labs, and Jabe Bloom have all just formed the Global Transformation Office. I love that title, and they're going to be helping to preach the gospel of cultural DevOps and agile transformation from a Red Hat office from now on. It was a wonderful conversation. I felt privileged to actually get to moderate it, and then just amazing people coming forward and sharing their stories. There was a great session from Steve Dake, who's with IBM doing all the Istio stuff. I've never seen an Istio deployment explained so well. All of the content was recorded and will be going up, and we streamed it live on Facebook, but I'm still reeling from the amount of information overload. I think that's the nice thing about doing a day zero event: it's a smaller group of people. We had about 600 people register and, I think, 560-something people show up, and we got that facial recognition going, so now when they're traveling through the hallways here with 12,000 other people, they go, oh, you were in the room, I met you there. And that's really the whole purpose of Commons events.

Yeah, I tell you, this is definitely one of those shows where it doesn't take long before I say, hey, my brain is full, can I go home now? Renaud, I'd love your first impressions of KubeCon. Did you get to go to the day zero event, and what sort of things have you been seeing so far?

So I've mostly, I went to the Lightning Talks, which were amazing, definitely.
There were a number of shout-outs to the GPU one, of course, coming from NVIDIA, but I definitely enjoyed, for example, the amazing DNS one, and the one about operators. Generally, all of them were of very high quality.

Is this your first KubeCon?

I've been at KubeCon before, this is my third KubeCon. I've been at KubeCon Europe in the past.

You're an old hand at this, then. So before we get into the operator framework, and I'd love to dig into this, I just wanted to ask one more thing about OpenShift Commons, the Commons in general: the relationship between OpenShift, the offering, and then the Commons and OKD, and then maybe the announcement about OKD.io and so on.

Yeah, so a couple of things happened yesterday. Yesterday we dropped OKD 4, the alpha release, so anyone who wants to test it out and try it out can. It's an all-operators-based deployment of OpenShift, which is what OpenShift 4 is. It's a slightly new architectural deployment methodology based on the operator framework, and we've been working very diligently to populate OperatorHub.io, which is where all of the upstream projects that have operators, like the one that Renaud has created for NVIDIA's GPUs, are being hosted so that anyone can deploy them, whether on OpenShift or any Kubernetes. So that dropped, and yesterday we also announced open sourcing Quay as projectquay.io. There are a lot of .ios going on here, but projectquay.io is really a fulfillment of a commitment by Red Hat that whenever we do an acquisition, we open source the code. The poor Quay folks were acquired by CoreOS, then CoreOS was acquired by Red Hat, and then Red Hat by IBM, and so in the interim they've been diligently working away to make the code available as open source, and that hit last week. There are some really interesting end users coming forward, and we're now looking forward to having them contribute to that project as well. But I think the operator framework has really been the big thing we've been hearing about and getting a lot of uptake on. It's been the new pattern for deploying applications or services, and getting things beyond just a basic install of a service on OpenShift or any Kubernetes. And that's where one of the exciting things yesterday, and Renaud and I were talking about this earlier, was that ExxonMobil sent a data scientist to the OpenShift Commons, Audrey Reznik, who gave this amazing presentation about JupyterHub and Jupyter Notebooks, deploying them, and how OpenShift and the advent of operators for things like GPUs is really helping them enable data scientists to do their work, because a lot of the stuff that data scientists do is almost disposable. They'll run an experiment, maybe they don't get the result they want, and then it just goes away, which is perfect for a Kubernetes workload. But there are other things you need, like GPUs, and the work that NVIDIA has been doing to enable that on OpenShift has been really very helpful. It was a great talk, but we were talking about it from the perspective that data scientists don't want to know anything about what's under the hood. They just want to run their experiments.

Yeah, so Renaud, let's understand how you got involved in the creation of the operator.
So generally, if we take a step back and look a bit at what we're trying to do: with AI, ML, and generally Edge infrastructure and 5G, we're seeing a lot of people trying to build and run applications, whether it's in the data center or at the Edge. What we're trying to do with this operator is to bring GPUs to enterprise Kubernetes, and this is what we're working on with Red Hat, and this is where things like the Operator SDK help us a lot. So what we've built is this NVIDIA GPU operator, based on the Operator SDK, which works in multiple phases. In the first phase, for example, it installs all the components that a data scientist, or generally a GPU cluster, might want or need, whether it's the NVIDIA driver, the container runtime, the Kubernetes device plugin, or the monitoring components. Phase two is, as you go on and build an infrastructure, you want to be able to have that automation, and more importantly the update part, so being able to update your different components. Phase three is generally being able to have a lifecycle, because as you manage multiple machines, these are going to get into different states. Some of them are going to fail. Being able to get from these bad states to good states, knowing how to recover from them, is super helpful. And then the last one is monitoring, which is being able to actually give insights to our users. So the Operator SDK has helped us a lot here just laying out these different steps, and in a way, it's done the same thing for us as what we're trying to do for our customers, the different data scientists, which is basically get out of our way and allow us to focus on core business value. The Operator SDK basically takes care of things that are pretty cool to build as an engineer, like leader election, but that don't really help me focus on my core business value, like how do I do the updates.

Renaud, can I step back one second, maybe go up a level? The problem here is that each physical machine has only a limited number of NVIDIA GPUs, and you've got a bunch of containers that may be spawning on different machines, and so they have to figure out: do I have a GPU? Can I grab one? And if I'm using it, I assume I have to reserve it so other people can't use it, and then I have to give it up. Is that the problem we're solving here?

So this is a problem that we've worked on with the Kubernetes community, so that the whole resource management piece is integrated almost as a first-class citizen in Kubernetes: being able to advertise the number of GPUs that are in your cluster and in use, and then being able to actually run or schedule these containers. The interesting components that we've also recently added are, for example, the monitoring, being able to see that a specific Jupyter notebook is using this much GPU utilization. So these are super cool features that have been coming in the past two years in Kubernetes, and Red Hat has been super helpful in these discussions, pushing these different features forward so that we see better enterprise support.

Yeah, I think the thing with operators, and the operator lifecycle management part of it, is really trying to get to day two. There are lots of different methodologies, whether it's Ansible or Python or Java or Helm or anything else, that can get you an install of a service or an application and instantiate it. And we support all of that with SDKs to help people.
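To make the resource-management flow Renaud describes concrete, here is a minimal sketch, in Go with client-go, of a pod that requests a single GPU through the nvidia.com/gpu extended resource. It assumes the NVIDIA device plugin (or the GPU operator) is already advertising that resource on the nodes; the image name, pod name, namespace, and kubeconfig path are illustrative, not from the interview.

```go
// Minimal sketch: request one NVIDIA GPU as a Kubernetes extended resource.
// Assumes the NVIDIA device plugin (or GPU operator) already advertises
// "nvidia.com/gpu" on the nodes. Image, namespace, and kubeconfig path are
// illustrative only.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "cuda-smoke-test"},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:    "cuda",
				Image:   "nvidia/cuda:10.1-base", // illustrative image
				Command: []string{"nvidia-smi"},
				Resources: corev1.ResourceRequirements{
					// GPUs are requested under limits; the scheduler will
					// only bind this pod to a node advertising a free GPU.
					Limits: corev1.ResourceList{
						"nvidia.com/gpu": resource.MustParse("1"),
					},
				},
			}},
		},
	}

	created, err := clientset.CoreV1().Pods("default").Create(
		context.TODO(), pod, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created pod:", created.Name)
}
```

The scheduler places the pod only on a node with an unallocated GPU, and the kubelet holds that GPU for the pod's lifetime, which is essentially the reserve-and-release behavior asked about above.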
But what we're trying to do is bridge to the day two stuff, to get people to autopilot. And there's a whole capability maturity model; if you go to OperatorHub.io, you can see different operators are at different stages of the game. So it's been interesting to work with people and see the aha moment when they realize, oh, I can do this and then I can walk away, and if that pod or that cluster dies, it'll just, well, I love the word automatically. But really the goal is to help alleviate the hands-on part of day two and get more automation into the services and applications we deploy.

Right, and when this is created, of course it works well with OpenShift, but it also works with any Kubernetes, correct?

OperatorHub.io, everything in there runs on any Kubernetes. And that's really the goal, to be able to take stuff in a hybrid cloud model. You want to be able to run it anywhere you want. So we want people to be able to do it anywhere.

Yeah, so this really should be an enabler for everything that NVIDIA has been doing to be fully cloud native, yes?

I think so, completely. Our goal here is, this is new tech, and of course there's a lot of complexity, and what we're working towards is reducing that complexity and making sure that people who are data scientists or AI and machine learning engineers are able to focus on their core business value.

You know, you watch all of the different services and the different things that the data scientists are using. They don't really want to know what's under the hood. They would like to just open up a JupyterHub notebook, have everything there they need, train their models, have them run, and then after they're done, they're done. It goes away, and hopefully they remember to turn off the GPUs, or whatever it is, so they don't keep getting billed. But that's the beauty of it: they don't have to worry so much anymore about that. We've got a whole nice lifecycle with source-to-image, or S2I, so they can just quickly build and deploy. The machine learning and AI side of things is near and dear to my heart. It's the catchy thing, but people are really doing it today. Two or three weeks ago in San Francisco, we had a whole OpenShift Commons gathering just on AI and ML, and it was amazing to hear. I think that's the most rewarding thing for people who are working on Kubernetes, to have the folks who are doing workloads come and say, wow, this is what we're doing now, because we don't get to see that all the time. It was pretty amazing and, you know, it makes it all worthwhile.

Diane, Renaud, thank you so much for the updates. Congratulations on the launch of the operators, and we look forward to hearing more in the future.

All right, glad to be here.

For John Troyer, I'm Stu Miniman. More coverage here from KubeCon, CloudNativeCon 2019. Thanks for watching theCUBE.
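For readers curious what the "walk away and it heals automatically" day-two behavior discussed above looks like in practice, here is a minimal sketch of a reconcile loop using a recent controller-runtime API. The GPUClusterReconciler type and desiredDriverDaemonSet helper are hypothetical stand-ins for illustration, not NVIDIA's actual operator code.

```go
// Minimal sketch of the "day two" reconcile pattern. GPUClusterReconciler
// and desiredDriverDaemonSet are hypothetical illustrations, not NVIDIA's
// actual operator code. Uses a recent controller-runtime API.
package controllers

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// GPUClusterReconciler keeps the GPU stack components in their desired state.
type GPUClusterReconciler struct {
	client.Client
}

// Reconcile runs whenever watched resources change (or periodically). It
// compares desired state with what actually exists and repairs the gap,
// which is what lets an operator run "on autopilot" after day one.
func (r *GPUClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var ds appsv1.DaemonSet
	err := r.Get(ctx, types.NamespacedName{Name: "nvidia-driver", Namespace: req.Namespace}, &ds)
	if apierrors.IsNotFound(err) {
		// Day-two healing: the driver DaemonSet vanished, so re-create it.
		return ctrl.Result{}, r.Create(ctx, desiredDriverDaemonSet(req.Namespace))
	}
	// Otherwise there is nothing to do, or a transient error to retry.
	return ctrl.Result{}, err
}

// desiredDriverDaemonSet renders the desired driver workload. The full spec
// (driver image, node selectors, volumes) is elided in this sketch.
func desiredDriverDaemonSet(ns string) *appsv1.DaemonSet {
	ds := &appsv1.DaemonSet{}
	ds.Name = "nvidia-driver"
	ds.Namespace = ns
	return ds
}
```

Because the loop re-checks desired versus actual state on every pass, a component that dies gets recreated without a human stepping in, which is the "automatically" of the operator lifecycle discussion.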