Welcome back to theCUBE's coverage of NVIDIA GTC24. I'm John Furrier with theCUBE. Dave Vellante and I are here on the ground with our analyst team, theCUBE, and of course the SiliconANGLE team covering NVIDIA GTC. This is an amazing event, as AI goes mainstream across enterprise, cloud infrastructure, and distributed computing. It's a systems revolution, a computer science tailwind of epic proportions. We've got a great guest here, CUBE alumni Priyank Patel, Vice President of Product, ML and AI, at Cloudera. No stranger to big data. Great to have you on. Thanks for coming on.

Thank you for having me, John.

So we were talking before we came on camera about big data. theCUBE started 14 years ago, in the Cloudera office, right when they got their Series B funding. I'll never forget Amr Awadallah and the founders at that time. They didn't know Hadoop was going to be big, and they thought, maybe we should get some people in the industry to create a little ecosystem. So they built an ecosystem, and then boom, big data went big, a lot of hype, and then everyone realized this was a game changer. Okay, fast forward to today. Now we have cloud scale, with the next generation of cloud happening. You had cloud 1.0 with Amazon Web Services. Now you've got cloud 2.0 happening with next-gen cloud companies like Snowflake and Cloudera building on top of clouds. Snowflake and Databricks are a great example, because they don't even have a cloud; they're built on AWS. So they have an ecosystem. MongoDB, you guys, have been successful. So cloud's enabling a lot of value, and in comes AI, generative AI, not the pre-programmed world we used to live in: databases, go fetch a response, run a query, build a report, have a dashboard. It seems so boring now looking back. What do you think? What's your reaction to this?

It's interesting you mention AI. With AI, one of the key things that governs it is the data you're going to feed it. Your AI is only as smart as the data you can give it. And if you think about it, going back 10 years, Cloudera started with data management, and we rode the wave of the cloud, but we always had our roots in the data center, which is really where large amounts of data are still managed, even despite there being a large amount of public cloud data management. That's really what Cloudera does today. We are a hybrid data platform that manages about 25 exabytes of data across both public and private clouds. Why does AI matter? Because bringing AI to where the data is is much, much easier than taking all that data and moving it to wherever the AI models are running. So we are excited, because with what NVIDIA is doing, and with the rest of the ecosystem, we have the opportunity to bring that AI compute and innovation directly to where the data is sitting, which is in our systems.

It's interesting. I'm not going to compare NVIDIA to Cloudera; obviously NVIDIA's stock price puts it at a trillion-dollar-plus valuation. But if you think about the legacy of NVIDIA, they were in ray tracing, doing graphics, then you had Tensor Cores built into all the machines, and then AI comes along and that trend becomes their friend. I won't say they pivoted; they just rode on top of that. Cloudera has similar DNA, big data with Hadoop, and as that's changed with the data platform, what is the bet that Cloudera is making right now?
Because, like NVIDIA, you have a trajectory of experience. What's the Cloudera bet from a product standpoint right now, as you look at this next wave? Because it's a great opportunity to learn from those experiences and not pivot, but step up to the new opportunity.

Absolutely, we are definitely seeing a big opportunity in bringing the AI compute, and that goes from tuning models, to running RAG applications, to inferencing models, to building the final applications, to where the data is managed. We have a strong heritage, as I said, in managing large-scale data for thousands of enterprises. The largest customers in any industry are customers of Cloudera, managing petabyte-scale, exabyte-scale data. Our nearest competitors are a couple of the names you said earlier. We have a thousand times more data under management than one of those competitors, the one that is public cloud only, right? So that's where the opportunity is: if the explosion of AI, and of analytics and applications driven by AI, can be brought closer to the data, that's a win for our customers and a win for us.

Priyank, open source has been a big discussion. Not much mention of it here at NVIDIA, because they're kind of proprietary, as someone might say of them. But I'm cool with proprietary, starting out constrained in the market; NVIDIA's got a great opportunity. I won't say they're proprietary, but I guess I just did. Let's go to the LLMs. If you look at the growth of LLMs right now, the proprietary ones, and it's funny how they call them proprietary, OpenAI, Anthropic, those proprietary or foundation models, whatever they're called today, the growth of those things is great. But if you look at Llama and Mistral, for instance, they're catching up in scope, size, and adoption. I think it hasn't crossed over yet, but it's pretty close, to the point where developers are using the open source models almost as much as the proprietary models. What does that mean? Does that mean it's a developer frenzy, that they're just playing around? Because our premise is that developers are the canary in the coal mine and set the new de facto standard, because they vote with their code. What does that mean? Is it just a robust environment for developers, or is it something more meaningful?

No, 100%, the trend behind having models that are openly accessible, starting with the Hugging Face community and all the models you mentioned, whether it's Llama, whether it's Google's Gemini models, or Mistral and others, all of those are the right entry points for developers to start playing around with these models. Because it turns out that taking those models and tuning them to very specific data, to very specific tasks, actually performs better than a general-purpose model that sits behind OpenAI. The OpenAI model knows the world, but it doesn't know anything about my business, and my business is in my data. If I can take one of the open models you mentioned and combine it with my data, now I have my AI, which is a real differentiator I would care about if I were an enterprise trying to build something durable for the future, instead of mortgaging my future on a closed model, which I can move fast on but can't own in the future.
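To make that "open model plus my data" idea concrete, here is a minimal sketch of the retrieve-then-generate pattern being described. Everything in it is illustrative: the documents are made up, and the word-overlap scoring is a stand-in for the embedding model and vector store a real system would use.

```python
# Toy sketch of the "open model + my data" (retrieval-augmented generation)
# pattern. Word-overlap scoring stands in for a real embedding model so the
# example runs with no dependencies; the documents are hypothetical.

def score(query: str, doc: str) -> float:
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)  # fraction of query words found in doc

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(retrieve(query, docs))
    # An open model (Llama, Mistral, ...) only sees *your* data here;
    # that retrieved context is the differentiator being described.
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

company_docs = [
    "Q3 churn was 4.1 percent, driven by the EU region.",
    "The on-prem cluster holds 12 PB of customer telemetry.",
    "Support tickets live in the lakehouse table support.tickets.",
]
print(build_prompt("What was our churn in Q3?", company_docs))
```

The point of the pattern is that the model weights stay generic; the differentiation lives entirely in the private context that gets retrieved and fed in.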
So one of the things Jensen pointed out, something we've been seeing on theCUBE and CUBE Research for about a year now, is that he simplified it into two areas in the enterprise. Two big opportunities. One, app developers, and they have this thing called NIM, NVIDIA inference microservices, basically an API. Okay, they call it the AI Foundry, whatever; just put that there as app developers. AI-enabling my applications, that's cool. Okay, check, I can see that happening, and data will be a big part of that. The other area is enterprise IT platforms, quote, sitting on a goldmine of data exhaust. Remember that phrase? Data exhaust spins into data gold. So he identified those two areas: one, app developers modernizing and AI-enabling applications, and two, taking advantage of either the tools or the proprietary data in the enterprise that now becomes intellectual property. Do you agree? What's your reaction to that?

I 100% agree, and that's really why we are partnering and integrating NIMs directly inside Cloudera. Because if you look at it from the app developer's perspective, and we often do, the framework of all the software that's available around the accelerated compute and GPUs NVIDIA puts out is a lot like assembly code. If you were a developer writing in assembly, you would make progress, but you would not be very productive. What NIMs do is pull that up one layer, so that I as a developer can think about my business logic, think about the data I have, and combine it with the techniques and the optimized models that come in NIMs, and then I don't have to think about the batch size, and the data parallelism, and the tensor parallelism of the models. So it's really the distinction between programming in assembly versus programming in a higher-level language, and that's what NIMs provide as a building block to combine with the rest of the ecosystem.
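For a sense of what "one layer up from assembly" means in practice, here is a hedged sketch of calling a NIM-style, OpenAI-compatible chat endpoint. The URL and model name are placeholders for whatever a given deployment actually exposes; the point is what the developer does not have to specify.

```python
# Hedged sketch: calling a NIM-style, OpenAI-compatible chat endpoint.
# The host, port, and model id below are placeholders for a real
# deployment; only the request shape is the point here.
import json
import urllib.request

NIM_URL = "http://localhost:8000/v1/chat/completions"  # placeholder endpoint

payload = {
    "model": "meta/llama3-8b-instruct",  # placeholder model id
    "messages": [{"role": "user", "content": "Summarize the Q3 churn report."}],
    "max_tokens": 256,
    # Note what is *absent*: batch size, tensor/data parallelism, KV-cache
    # tuning. That is the assembly-versus-higher-level-language point;
    # the microservice owns those knobs.
}

req = urllib.request.Request(
    NIM_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])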
It's interesting you bring up assembler, because just for the audience: in computer science, assembly is low-level machine language, registers, dealing with memory directly. That's a systems programming concept, bare bones. I mean, hexadecimal is more like rebooting the machine and looking at core dumps; if you've been in computer science, you don't want to do that. But the big advancement in computer science came when we went from assembler to structured languages, higher-level languages. That was a huge productivity gain for programming. Okay, I got my teeth into that. I never did punch cards, I'm not that old, but it brings back this mainframe concept: you have massive big iron available, and now it's going to be available as an AI system. So the question is two things. One, what's the operating system going to look like? You don't have to answer that right away; that's a rhetorical question, we'll come back to it. And two, how do you develop on that? Because you have to develop a couple of things. You've got to write code or interface with the APIs, whether it's NIM or an API management system or an LLM, and then you've got to figure out, okay, what's my data programming strategy? I need data in my language, because like the assembler-to-structured-language shift you were just mentioning, you've got to have data available at very low latency, horizontally scalable like the cloud. That's not how data has been set up in the past. The old generation, the current generation, is not set up for AI. So how do you get developers productive when you have to reset the data model and start coding against the new resource base called the AI system?

Yeah, there are a bunch of questions in there. I'll start with the first one, which is how do you get the data accessible to developers, right? What has happened, and this is where the cloud migration, the cloud wave of the last decade, really helps us: over the last five or seven years, what has really taken root at enterprises is the separation of compute and storage. Independent of the AI wave, the separation of compute and storage has happened. So whether it's in the public clouds, on Amazon, Azure, or Google, or on premise, the way people have architected their data platforms, Cloudera included, is that you have a separate storage layer which is directly accessible, and then different compute engines accessing it, right? To your point about getting the data to the AI: these AI frameworks are coming up with direct access to the underlying data, and they have ways to ingest it. There are newer ways of storing data, too, vector stores being one example. That is an evolving area ripe for innovation; that's number one. Then as you move forward into actually using this data and combining it, what really becomes important is how well you can serve it, how well you can run it, almost anywhere, and anywhere is the keyword there. Because if I can only run it on the mainframe, arguably the application is not that good, or not that accessible. But if I can run it on your device in the future, or if I can design it such that when the device becomes powerful I don't have to redesign my application, then I have power. Then I have built an application that is durable to the innovation that is undoubtedly going to happen, this year, as we found out yesterday, and in the years to come.
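Since vector stores came up as one of the newer ways of storing data, here is a minimal sketch of the core idea: embed documents once, then answer queries by nearest-neighbor similarity. The hash-seeded embedding is a placeholder for a trained embedding model, and the brute-force dot product stands in for a real index.

```python
# Minimal sketch of what a vector store does: embed documents once, then
# answer nearest-neighbor queries against the stored vectors.
import numpy as np

DIM = 8

def embed(text: str) -> np.ndarray:
    # Placeholder embedding: a unit vector seeded from the text's hash
    # (stable within one process, not semantically meaningful).
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(DIM)
    return v / np.linalg.norm(v)

docs = ["Q3 churn report", "cluster capacity plan", "support ticket schema"]
index = np.stack([embed(d) for d in docs])  # one row per document

def search(query: str, k: int = 2) -> list[str]:
    sims = index @ embed(query)             # cosine similarity (unit vectors)
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

print(search("how much churn did we have?"))
```

A production store adds persistence, filtering, and an approximate-nearest-neighbor index, but the contract is the same: vectors in, neighbors out.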
That's a great concept. One of the things we've been talking about on theCUBE, well, I've been talking about, is clustered systems. The mainframe was a monolithic machine, but we're living in a cloud era where it's distributed computing. Cloudera, cloud era, same thing, play on words there. I can throw a workload at a resource on a distributed computing network; it's not one single machine sitting there serving everybody, like the mainframe was. Okay, so you've got mainframe power in a distributed computing paradigm. That's the magic here. And if you look at the NVLink switch, what they've done is create NVLink, which brings GPUs together as one. Okay, that's a system. That's not a mainframe. So now you've got mainframes everywhere, basically. So what does the workflow look like? Am I going to throw workloads at the appropriate resource, or can I just build a super mainframe-like environment on premise? What do you see there? Where do the dots connect?

Yeah, so the dots, in my mind, and how we are building our products: you will need extreme high capacity when you are trying to train models from scratch, what is called pre-training of models, right? But not everybody in the world needs to pre-train models. There's going to be a large number of applications and use cases that are served by starting from a model, tuning it, and adapting it to your use case. The amount of compute required for that secondary use case, the second half of what I described, is 100 to 1,000x lower than pre-training the models. And that's where the dichotomy exists. You will need DGX Cloud, which is what NVIDIA provides, a cloud of thousands and tens of thousands of GPU-powered machines, when you need to go to the supercomputer equivalent. But when you're done training the model, you can bring it onto these piecemeal, eight-card machines or other parts that you can run in your data center or rent from a regular cloud provider. Both of those are now the piecemeal options you can use to tune models and run your applications. So I think that's the operating model of where AI models are being built.
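One way to see that 100-to-1,000x gap between pre-training and tuning is in parameter counts. Below is a hedged sketch using the Hugging Face peft library's LoRA adapters; the model name and target modules are illustrative choices, not anything specific to Cloudera's or NVIDIA's stack.

```python
# Sketch: adapting an open model with LoRA instead of pre-training it.
# Only small adapter matrices are trained; the base weights stay frozen,
# which is where the orders-of-magnitude compute savings come from.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder model

config = LoraConfig(
    r=8,                                   # adapter rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],   # attention projections in OPT
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)

# Reports trainable params as a small fraction (well under 1%) of the total.
model.print_trainable_parameters()
```

From there, a standard training loop over your task data tunes just those adapters, which is the kind of job that fits the eight-card machines described above rather than a supercomputer.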
Well, take me through what the customer is going through right now. Obviously they're transforming in real time, like you guys are, and like the industry is. It's a whole other wave. I love this wave, I'm bullish. I've been waiting for this kind of revolution. I've been saying on theCUBE for almost eight years now that a tech, hippie-like revolution for techies was going to come, where new systems would emerge, kind of like the 60s. I called last summer the summer of AI love. Look at San Francisco; it's been a boom town in the San Francisco area, the Bay Area, and Silicon Valley for the past year. So you're talking about the smartest people in computer science, and specifically in systems. You mentioned operating systems and assembler, so obviously you must be a systems guy. Systems thinking is now the new cool thing. I've got to say it. If you'd asked me 10 years ago whether systems thinking would be the cool thing, I would have said no, design thinking, UX design, that's cooler. But systems thinking is the new hot thing.

That's right. It's funny you say that. Some of the concepts we find over here are very fundamental in nature, because this is redefining how you do application building. When you try to build an AI application, it's different from building the three-layer application stack we've been used to since the web era. And I think that kind of thinking is going to happen both at the level of systems and at the level of the developer community, where massive numbers of people are getting trained on it. The newer generation of developers will learn this firsthand; in 20 years there will be developers who learned AI in college, right?

And I think the nerd-tech geek IQ has certainly grown in the mainstream. The Jensen keynote, I was talking to Dave about it in our opening analyst segment analyzing the keynote, you would have thought Steve Jobs was introducing an iPod or something, but it was like an HPC keynote. They're talking about deep silicon tech and people are cheering, going crazy, like they just saw the iPhone. It was really amazing to see how mainstream it is. In fact, Jensen told me privately that he took the telecom slide out, about the 6G research center where they're using AI to manage energy on their virtual MIMOs in Omniverse. I'm like, that is so freaking cool. He goes, yeah, I had to kill it because I didn't want to bore the crowd. And he was actually going to make a telephone call in the Omniverse on stage. Now to him, that's total nerd tech. He had to cut it because he didn't want to scare the crowd, but I think it would have worked. I mean, that's the level of savvy we're getting in the educated tech audience right now. It's a whole other level.

It actually captures the attention and the fancy of the lay person even outside the tech world, right? There have been very few times when my family, and my family knows I'm in tech, but they never ask me what I'm doing. One of the few times they actually asked was with ChatGPT, when it came around. When applications like that become mainstream, it really captures the fancy, right? And that's definitely what excites everybody.

Yeah, and I've got to tell you, all of us inside the ropes in the industry have been loving machine learning, machine learning ops, IT ops, AI ops, but the generative AI movement has been a gift for educating everyone on AI, to your point. It was a moment like the web; it reminded me of when the World Wide Web showed the first browser. It was like, okay, I see the world is completely different. I can look past the embryonic stage of where it's at and see how people will be using this: self-service, discovering information, search engines, multimedia, all of that played out. Again, there was a dot-com bubble burst, and now we're in an AI bubble. What's your final takeaway here, just to share with the folks, on this revolution, this new way to do software, this new way to compute, with cloud not going away either? You've got cloud scale, public cloud, that sort of computing. How do you explain all this to the average person?

I think some of the basics are not going to change, and we started out with this. Your first question was why Cloudera is excited about this. We are excited because it is true that your AI is only as useful as your data. Ask organizations, and we work with the largest companies in the world: how many of them have their data assets managed in a way that they can pull up that data whenever they want, for anybody in the organization? Very few have actually reached that level of maturity, despite all the innovation that has happened in the last decade and more, right? Those fundamentals are not going to change. In fact, they will be extremely important if you are actually going to make the leaps and bounds that AI technology and AI innovation are promising, right? And that's really what gets Cloudera excited: we see an opportunity for us to power these applications for our customers.

You may have to call it the Cloud-AI-era. Okay, that's a new company name.

Yeah. Yeah.

Great, thanks for coming on theCUBE. Appreciate it. Great to see you again. And congratulations, head of product over there for AI and machine learning at Cloudera. I'm John Furrier, the host of theCUBE. We are here at GTC, the conference for the era of AI; that's the tagline of the show, NVIDIA GTC. We'll be back. Thanks for watching.