Welcome back everyone to SuperCloud Six, the AI innovators segment. We're going to be doing this again on the 19th this month as an addendum, and we're probably doing a lot more, because there's a lot of AI innovation out there. The world is changing. Obviously the infrastructure, the software layers, the abstractions, and of course the applications are all being refactored and reset. And our next guest is the CEO of Neo4j, Chandra Rangan, who's here. He's a former Googler, so he knows a little bit about what's going on in the cloud, and also with graph databases and a lot more. Neo4j is doing great. We just had you guys on theCUBE recently.

Thank you, John, for having me here.

So one of the things we're excited about is that we've chronicled the whole cloud revolution, where we went from "there's not one database that's going to rule the world," and Andy Jassy saying, hey, everyone's going to move their data center to the cloud, and then he walked that back. But now I think it's pretty well understood that cloud operations is going to be the norm. So you get public cloud, you get on-premise, edge and super edge, intelligent edge, whether it's a device or whatever. This is the new distributed computing paradigm, and this is what's going on. Now you add generative AI to the mix, and it makes the data more valuable. So databases become plentiful: you've got graph databases, you've got time series databases, you've got structured, you've got unstructured. All of this has to work together, and you guys at Neo4j are in the middle of it. How do you see the market evolving with generative AI? Because this is a key part, and you guys are part of that. Knowledge graphs are hot. What's your take?

Yeah, so generative AI has made a huge, huge difference, and one of the differences it has made is that I don't need to build my own models anymore. I can use a general purpose model and then build an application around it. And so the conversation then shifts from, hey, how do I build the best model?
And that's an important conversation. It's happening with a few vendors, like OpenAI and the Googles and the Amazons and so on. But enterprises are looking at it and saying, hey, maybe I don't need to build my own model. Why don't I use a general purpose model and do something with it to create an application? But the moment I, as an enterprise, try to do that, I get into trouble. The first type of trouble I get into is this: I've got a lot of data that I want as part of the model so that it can be queried, but I don't necessarily want to share that in the public domain with a general purpose model, where I can't get it back. So how do you access proprietary data, keep it proprietary, and have controls around it while still using a GenAI model? That's where databases and the RAG architecture become a super critical part of it.

It's interesting, because you said the word proprietary. We were originally calling the OpenAI models proprietary, and then open source started to emerge on the LLM side; you saw Llama. But now they've been called foundation models; they've kind of been renamed. Still, your point about proprietary data and those models is important, because that's the IP of the company.

Exactly.

The old expression during the big data days 10 years ago was data is the new oil, and now it's the new gold, or take that data exhaust and turn it into gold. That's kind of what's happening. So companies are realizing their data is valuable. I don't want to expose it to the LLMs, as you were just mentioning, but also it's like, okay, how can I use the data to reduce costs? And what's interesting about GenAI is there's another dimension of revenue generation, because data that wasn't being monetized is now being monetized for top-line revenue, as well as cost reduction. So the question is, okay, what does that equal? How much does it cost to get there?
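The pattern Chandra describes, keeping proprietary data on your side and passing only relevant snippets to a general purpose model, is the core of RAG. Here's a minimal sketch in Python; the documents, the naive keyword retriever, and the function names are all illustrative assumptions, not Neo4j's actual API, and in practice the retrieval step would be a vector or knowledge-graph lookup:

```python
# Minimal RAG sketch: proprietary documents stay local; only the
# retrieved snippets are sent to a general-purpose model as context.

PRIVATE_DOCS = {
    "contracts": "Acme renewal is due in Q3 with a 12% uplift clause.",
    "suppliers": "Battery cells are sourced from two lithium-dependent vendors.",
}

def retrieve(question: str) -> list[str]:
    """Naive keyword overlap, standing in for a vector or graph lookup."""
    words = set(question.lower().split())
    return [doc for doc in PRIVATE_DOCS.values()
            if words & set(doc.lower().split())]

def build_prompt(question: str) -> str:
    """Augment the user question with retrieved private context
    before sending it to an external LLM."""
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Which vendors are lithium dependent?"))
```

The key property is that `PRIVATE_DOCS` never leaves your environment; only the small retrieved slice rides along in the prompt.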
Take us through some of the challenges you see there, because it's a challenge and an opportunity; if you over-rotate on the cost side, you've spent too much.

Yeah, so data becomes important in two related but separate ways. The first is that I have my data that I want to hold as my own; it's my IP, and I want to monetize that IP. So should I, and can I, give that away for the public good? But it's not really public good, because somebody else is making money off of it, so that doesn't make sense. Holding on to your own data while still driving insights and value from it is one part of the equation. The second part of the equation is related but a little bit different: just because I have data doesn't mean it's useful. I have to structure it in a way that makes it useful. And how is data useful today? Data is useful because I can actually see patterns and I can understand the context of the questions I ask. And that's why knowledge graphs now become a big part of the GenAI stack.

So talk about the power of knowledge graphs, because this is something that's becoming more important. We're seeing it more from the value creation and value capture side, because knowledge graphs can provide content and context, and when you integrate that with other enriched data sets you can get behavioral data. So you get context and personalization. It seems to be a perfect fit for GenAI. Can you explain what's going on with knowledge graphs?

Oh, 100%, and we are seeing so much interest in Neo4j for knowledge graphs because of GenAI applications. The way that translates is, if I need to build a GenAI application, then I need a knowledge graph to structure my content and my data so that my GenAI application can actually work really well. Now, I'll give you an example. Most companies start with vector stores, and vector stores fundamentally support similarity search.
Hey, show me all the things that look like this. But the answer that I get does not include the context. That's different from saying, show me all the things that look like this that are related to John. Once you start adding these conditions and asking for context, then you need a knowledge graph. And so, just four or five months back, we released vector capabilities in Neo4j's graph database. Now you can have vector search, but you can also have the vector embeddings associated with the nodes in the graph, or the relationships in the graph, and suddenly you have a much richer context. That's kind of where we see knowledge graphs heading.

Well, in computer science theory, look at the timeline of compiler design; it's all graph-based. You've got nodes and arcs that you label. And as you get more efficient, what I like about the vector stuff is that it's not new technology, but it is a new approach to indexing. It's not keyword-based, or OCR for images; it's math. And that works well for retrieval-augmented generation, or RAG, as we've been calling it. Why is that important? Because now, and let me take a stretch here, if I have math that gives me similarities in data, I can look at nearest neighbors and all the theory involved and all that good stuff. I can adjust the graphs; the graphs can be dynamic. This is powerful, because now you can organize those neural connections in a graph format. Explain why that's important.

Here's a very simple example I can share with you. Suppose you want to look at companies that are sensitive to lithium prices, with everything that's happening with EVs. A simple similarity search will yield maybe a list of lithium mining companies, right? But that's not the answer.
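The "it's math" point can be made concrete: embedding-based retrieval is just nearest-neighbor search under a similarity measure, typically cosine similarity. This toy sketch uses made-up three-dimensional vectors; real embeddings come from an embedding model and have hundreds of dimensions, and the entity names are invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity: the dot product normalized by vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings"; a real system would get these from a model.
embeddings = {
    "lithium mining": [0.9, 0.1, 0.0],
    "battery maker":  [0.8, 0.3, 0.1],
    "ice cream shop": [0.0, 0.1, 0.9],
}

def nearest(query_vec, k=2):
    """Rank stored items by similarity to the query and keep the top k."""
    ranked = sorted(embeddings,
                    key=lambda name: cosine(query_vec, embeddings[name]),
                    reverse=True)
    return ranked[:k]

print(nearest([1.0, 0.2, 0.0]))  # → ['lithium mining', 'battery maker']
```

This is pure similarity: it finds things that "look like" the query, but it carries no notion of relationships, which is exactly the gap the knowledge graph fills.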
There are a lot of companies that are actually dependent on lithium, because you have the battery providers, and the battery providers are suppliers to the car companies, and the car companies have distributors associated with them. So you have this complex network of interdependencies. And if you want to really understand, hey, if I have a lithium shortage, which companies are going to be impacted? It's not just the lithium mining companies; it's this entire network. Getting that network is a knowledge graph problem, not a vector search problem.

I always say the knowledge graph is table stakes for a neural network, because you think about your brain, left side, right side, and you've got connections. It's essentially a graph of connections and nodes working together. Okay, now take that to the next level. The state of the art in GenAI today is prompt and answer, and then you prompt-engineer, maybe with some prompts under the covers. So you're prompting, you're responding. But reasoning is a whole other area. So talk about the difference in efficiency between graphs and neural nets for prompt-answer versus deeper reasoning, because if I want to think deeper, I want to tap a graph that's going to give me access to better data. That seems like a great use case for graph data.

No, 100%. You actually hit the nail on the head when you said the brain thinks like a graph. Intuitively, when we try to understand the world around us, we look for patterns. That's how memories are built, and that's how we see the world and the universe around us. In fact, in any action movie where somebody's hunting for something, you see this big board with all these pins and all these red threads strung between them, and that's effectively a graph. That's how the brain thinks; that's how we reason. So when you try to do reasoning, what helps with reasoning? Patterns, and patterns depend on context.
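The lithium example above is a graph traversal problem: follow supplier-to-customer edges outward from the shocked node. A minimal sketch, with entirely hypothetical company names and a plain breadth-first search standing in for what Neo4j would express as a variable-length path query:

```python
from collections import deque

# Hypothetical supply-chain edges: supplier -> list of customers.
supplies = {
    "LithiumCo": ["CellMaker"],
    "CellMaker": ["BatteryPack Inc"],
    "BatteryPack Inc": ["EV Motors"],
    "EV Motors": ["Dealer Network"],
}

def impacted_by(source):
    """Breadth-first traversal: everyone downstream of a supply shock,
    however many hops away."""
    seen, queue = set(), deque([source])
    while queue:
        node = queue.popleft()
        for customer in supplies.get(node, []):
            if customer not in seen:
                seen.add(customer)
                queue.append(customer)
    return seen

print(sorted(impacted_by("LithiumCo")))
# → ['BatteryPack Inc', 'CellMaker', 'Dealer Network', 'EV Motors']
```

A similarity search would have stopped at LithiumCo's nearest lookalikes; the traversal surfaces the whole dependency chain, which is the point of the knowledge graph.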
So the brain is looking for those patterns, and similarly, if you think about a GenAI application, if it needs to look for patterns in data, especially if you want to keep that data private, then you need the data to be in a graph format.

And they'll need pathways to do that, and they'll need access to data.

Yes, and they need access to data.

There's a blockage, though; you're not going to get that data in for reasoning.

And that's where RAG comes in, right? Because what RAG does is marry the power of general purpose LLM models with internal data. You can keep those two separate, but you can actually use the LLM's capability to do reasoning on private data.

You know, I remember when the web went from static to dynamic with databases, and that was a great changeover; things scaled beautifully. When you think about static and dynamic relative to data, you think about a mechanism versus an organism. So AGI is coming around the corner, and I don't think it's going to happen anytime soon, but I'm still pro-AGI, don't get me wrong; there's just a lot of road to get there. What we're looking for is something responsive like a human, like an organism, not a mechanism. So being adaptive is going to be a key criterion for GenAI, and then there's also managing what happens. If you ask a question and get a good answer, do you store it somewhere? Who stores it? Is it cached, or do we lose the answer? Is there any observability? Where's the management? Where's the application? A lot of instrumentation will need to come down the pike. What's your view on this?

You're 100% right. And again, there is a separation between having all that instrumentation for the LLM creators, the model builders who are doing general purpose models; they will be able to solve all of this at scale. But the problem is that most of the world's data is not necessarily available for everybody to see, right?
The bulk of the data is actually held in private repositories; it's people's IP. And so how do I take...

And/or governance, or regulated industries, or other things?

Exactly, right? So now the problem that you talk about, stitching all this together, actually has to be done multiple times in each of these companies and each of these organizations. And that is a whole different set of skills that I think enterprises will have to focus on over the next year or two.

So let me ask about Neo4j's product roadmap, because I can envision, as you mentioned earlier, LLMs working with each other. You mentioned proprietary models; we call them small language models, but that's the IP. How does a company get and protect its models? Can I build a private knowledge graph, and how do I measure that, how do I secure it? What are you seeing out there now relative to your customers and how they're evolving to protect their IP? Because again, they want to expose it, but they've got to govern it and protect it.

The idea of building private knowledge graphs has been around for a long time, and it's taken on a sense of urgency now with GenAI. The interesting thing is that, in a strange way, you can actually use LLMs to build your knowledge graphs, because LLMs are very, very good at extracting information. So instead of trying to build it manually and update your knowledge graph with complex pipelines, you can take an LLM, run it on a bunch of data, both structured and unstructured, and extract a knowledge graph out of it that is maybe 70 or 80% ready, and then you can tinker with it and make it better. That way you can still hold on to your private data in a RAG architecture while being able to create a knowledge graph.

I think RAG is going to be a very important topic going forward. It already is; it's one of the most talked about. And frankly, it's the easiest area to get into, innovate, and see value.
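The extraction pipeline Chandra describes, using an LLM to pull a rough knowledge graph out of raw text, can be sketched in two steps: extract (subject, relation, object) triples, then fold them into a graph. In this sketch a trivial string pattern stands in for the LLM call, and the sentences and function names are invented for illustration:

```python
def extract_triples(text):
    """Placeholder for an LLM extraction call. A real system would prompt
    a model to emit triples; here a trivial 'X supplies Y.' pattern
    stands in for it."""
    triples = []
    for sentence in text.split("."):
        parts = sentence.strip().split(" supplies ")
        if len(parts) == 2:
            triples.append((parts[0], "SUPPLIES", parts[1]))
    return triples

def build_graph(triples):
    """Fold (subject, relation, object) triples into an adjacency dict,
    the 70-80%-ready graph you then tinker with by hand."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(subj, []).append((rel, obj))
    return graph

text = "LithiumCo supplies CellMaker. CellMaker supplies EV Motors."
print(build_graph(extract_triples(text)))
```

The private text never has to be published; the LLM is used as an extraction tool, and the resulting graph stays inside the RAG architecture.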
Now, the question that everyone's talking about, and that we are facing with our CUBE language model, is whether smaller is better, because a good small language model does a good job, but it's not adequate relative to the other models on its own. So I see a vision where models will interact with each other and work together, passing parameters back and forth, saying, hey, I need your help; you're the best for B2B, I'm going to go to theCUBE for that, or I'm going to go to that data source. So I'd envision data sets becoming almost like a leaderboard: the best data wins. And then they'll broker that via API calls to other data, right?

Right. I think you've got these general, large language models, and I think you'll have vertical and domain-specific models that can actually be much smaller, with a smaller footprint, but can interact with these LLMs. And then, in some cases, can you have versions of these models for an organization itself? And can you chain all of these models to get the results you want, based on the application?

I remember when we were doing CUBE entity extraction years ago, Amazon had Alexa, and it couldn't figure out what Kubernetes was, so it mispronounced it. So I can envision a model coming to theCUBE and asking, is this right? And we say, no, that's not it; it's Kubernetes. That's a word that came out of nowhere, but it became part of the lingua franca. So as you get into these domain-specific spaces, OpenAI and these guys aren't going to have the fidelity and the accuracy of what something actually means in their knowledge graph. Each domain speaks its own language.

And the same word, the same acronym, can mean very different things in different contexts. I totally agree with you.

All right, so what's the vision for Neo4j at this point? You guys have been well-known in the graph space.
As the market grows massively, you guys are on a great growth curve with GenAI and multiple databases in the cloud contributing to computing. What's going on with the company? What's the vision? What's the hot traction for Neo4j?

So over the last 10 years, graph databases have been by far the fastest growing category of databases, and we see that continuing into the future. Think about our business itself. We had traction with our traditional graph database business. Then we introduced a product called Graph Data Science, which has had a lot of success over the last three years. Graph Data Science takes the same graph database and puts it in memory with built-in algorithms to support data scientists in model building; for example, you want to compute centrality, or do entity resolution, or fill in data that you probably didn't get the first time around. More recently, we have seen big traction with RAG-based applications. So we introduced vector support around four or five months back, and that will continue. And this coming year, we are looking at deeper integrations with the cloud service providers. So stay tuned at the Microsoft Fabric Conference, because we'll have some announcements then.

And you guys have some traction on a query language that's going to be approved. Can you share a little bit about that?

Thanks for asking about that, because this is a pretty big deal. The lingua franca of database queries has been SQL for a long, long time, and the ISO standards body has refused to look at other standards, whether it's JSON or anything else, because they're like, hey, it can all be solved with SQL. And then along come graph databases, which are so much more powerful: the queries can be vastly simplified, and they can do types of pattern matching that SQL can do but really struggles with.
So ISO is actually working, and we are working with them, to release a new standard, the first new language standard from ISO since SQL, called GQL. We should be seeing more news about that later this year. We are very excited.

And they don't do that very often. It's not like they just go around doing this.

This is the first time they're doing it since SQL.

Yeah, so it's a big deal.

It's a big deal. So we'll keep an eye on that. A final question for you, as people generally start to discover this. My favorite line from Andy Jassy over the years, from speaking with him in the early days of the cloud, was: when you're doing something that's compelling, you're going to be misunderstood for a while, so you've got to be okay with that. I think graph databases, outside of the insiders, have been misunderstood by folks in the industry. When people ask me what I think of them, I say, hey, remember how on Facebook everyone asks the hive mind? Hey, hive mind: the collective intelligence of your friend network. That's like a graph; people get intelligence out of it. Knowledge graphs are doing the same kind of thing. So I want to ask you, what is the key thing people see with graph databases when they go, oh, I finally get it? As you guys go mainstream with GenAI, as you get discovered, what's the big aha when people say, I get it now? What's the connection?

I'm glad you asked that, because I joined Neo4j around two years back, and I did it because I was just super excited about the potential of what graph databases can do. I saw it not as just one more type of database; I saw it as something game-changing for the database industry. And the best articulation of that comes when you talk to prospects versus customers. When you talk to prospects, people we want to sell graph to but who have not used us, the sale is always based on performance, right?
Like, understanding patterns, or your joins are taking too much compute; that's what the benefit is. But once somebody becomes a customer, suddenly the conversation changes. They say, wow, this is so much easier to understand; this is super intuitive. In fact, think about entity relationship diagrams. We all know what an ER diagram is; that's computer science, database 101. You build an ER diagram to map out your data model, and then what do you do? You're trained to take that and put it into tables: first normal form, second normal, third normal, fourth normal. And that's a restriction at that point; you're constraining everything. But that's how we're trained. So the aha moment is when you realize, hey, my data model, the entity relationship diagram, is actually a graph, and that should be my database to start with. The benefit, the aha moment, is cognitive.

And by the way, the scale, the growth you can add to it without retrofitting, is simple. You just plug more stuff in.

Just plug more in. And the way we have constructed our graph databases, it's actually schema-free. We call it schema-flexible, because you can impose a schema on it, or you can leave it flexible, which is huge during development, and it's huge when your data is changing. And to your point earlier about reuse, you don't have to rebuild things over and over again.

Great leverage and great ROI out of the gate.

Exactly, yeah.

Thanks for coming on. Knowledge graphs are going to be a big part of it. Okay, that's it for this AI innovators segment of SuperCloud Six. I'm John Furrier with theCUBE, with Dave Vellante. We'll be back with more live coverage after this short break.