From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante.

In March, we put forth a scenario we called "Uber for all," where a new breed of data apps was emerging. We used Uber as an example, where data about people, places and things is brought together in a coherent real-time system. The premise was that, increasingly, organizations are going to want a digital representation of their business, with incoming data informing the state of that business and then enabling actions in real time. This is Uber-like data apps for every company, and the evolution of data apps for the masses that don't have 2,000 Uber software engineers at their disposal. Welcome to this special Breaking Analysis with CUBE analyst George Gilbert and friend of theCUBE, entrepreneur and executive Bob Muglia, former Microsoft, former Snowflake CEO, investor and trend spotter. Welcome, Bob. George, good to see you guys.

Thank you. Thanks.

So, Bob, you just wrote a new book, The Datapreneurs. It's out, and it's a pretty detailed history from your point of view. I particularly loved, I think it was chapter seven, a new type of operational database. We're going to get into some of that stuff today, but why did you write this book?

Well, I thought there was something that I had to say and that it'd be helpful for people to hear it, frankly. I was very fortunate to have a perspective of watching the data industry mature over really a 40-year period. I've watched what's happened in this industry and how the people that have been involved have really made such a difference. We see the AI world we're in today and the incredible things you can do, and people think it just came about all at once. In some sense it certainly seemed like it did, but there's a long history that sits behind that.
And I thought it'd be useful for people to hear that story, understand how we got to where we are, and get some idea of where we're going in the future.

Yeah, again, it's called The Datapreneurs. If you're into data and you want to go deep from Bob's perspective, which is a unique, historical and intriguing one, definitely check that out. All right, let's hit the agenda today. Alex, if you bring it up here, we're going to look at what data apps look like today, where they're headed, and how the underlying data platform is going to evolve to accommodate the future of data apps. So let's take a look at the next slide, if you will, Alex. We want to talk about the scope of today's data apps and the so-called modern data stack. You've got ELT pipelines with things like dbt and Fivetran, and lakehouses. We're always talking about Databricks and Snowflake, they're two of our favorite examples, but you've got BigQuery. We just came off of Google Next and it was very data-centric, obviously AI-centric. Azure Fabric is now this new, interesting entrant into the market. Whatever we can make of AWS's bespoke set of data tools, they're a player. And then you've got the BI layer on top of that. Bob, how would you describe today's modern data platforms in terms of the stack and the scope of applications that they support?

You know, as you pointed out, David, there are really five platforms right now that different vendors are building that are all viable choices for people to build a modern data solution on top of. I think the characteristic of all of them is that they leverage the cloud for full scale. They can support data of multiple types, and now we're beginning to see, in addition to structured and semi-structured data, video, audio files, all the things that people tend to call unstructured, which I think of as complex data, because there's clearly structure to that data. It's just complex compared to other data structures.
And they're all working with these different types of data. I think a lot of problems have been solved. You can finally corral your data in one place and make it available to your organization. There are still quite a few issues that are unresolved. We see the need to build these data applications. People want to incorporate AI into their business. These are the solutions that will be built in the future on top of these platforms, and the platforms are all well situated to add those capabilities and augment what they can do. There are also data problems that are unsolved, and what I've been focusing on more than anything else is the unsolved problems in the data stack. Things like graph analysis are still somewhere between very difficult and, in some cases, maybe even impossible for customers to work with. That's a lack of a technology and a set of tools. And of course, one of the issues that I know we're gonna get into, and which is becoming very apparent, is that the modern data stack needs a semantic layer that sits on top of it to help do governance and understand what's in the data. This is gonna become progressively more important as these large language models play a bigger and bigger role, because there's a realization across the industry that in order for those models to properly work and give useful answers, they need to be augmented with knowledge of some kind. And we believe that knowledge will be created probably in the form of knowledge graphs that are built on top of the modern data stack as a new semantic layer.

Okay, let's talk about, I don't know, George, would you say the big four, the big five: Snowflake, Databricks, Google with BigQuery, et cetera. Microsoft is now in the game with Fabric and obviously OpenAI. George, you and I have talked about AWS. I don't know, George, where you would put the likes of Oracle. Is that just legacy in your mind?
Are there any others that we should be talking about, George?

Well, my take, especially listening to Bob, is that there are certain requirements. One is a data-centric foundation that multiple analytic compute engines can get to, and this data-centric foundation needs to eventually cover all data types, not just structured and semi-structured, but the unstructured too. That raises this question: how many data platforms out there can support a data-centric foundation? And then what's the gateway that manages that? Is it a full-function analytic DBMS that has a rich transaction model and high-end interactive performance features, or can you get away with a lighter-weight execution and storage engine, like the Spark execution engine, and then have multiple analytic engines work on that data? Bob, why don't you tell us about that trade-off?

Well, I think what's happening right now is that the data lakes are maturing. The next 12 to 18 months is the period of time where we're gonna start to see what I'm gonna call at least the maturation of the first generation of the data lakes. I think all the vendors are gonna move to that sort of an architecture where there are two views on data. To me, really what a data lake does is provide two different views. You have a file view of data and you have a table view of data. Some of the challenges we have right now are making sure that those things are coherent, that there is consistent governance across both of them, and that as data is added, the transaction model and consistency model is fully understood. That consistency model will increase in importance as this data is used for things beyond just business analytics and begins to be used to actually operationalize systems through data applications. That's where the consistency model becomes progressively more important.
And in those ways, I think we're still in very immature stages in the data lake period of the world. But that architecture of using the underlying file system as the foundation, with layers like Hadoop, excuse me, like Iceberg or Delta on top, provides a foundation to work with the multiple types of data. One of the big challenges today is those blasted two different formats, Delta and Iceberg, with Hudi potentially a third format that is a bit less in vogue. The fact that we have a potential Beta versus VHS situation is very concerning and problematic to me, because it means that when customers are choosing their data lake, they're locking themselves into a format, and in some senses they're locking themselves into at least a set of vendors that operate on that. That's just something to be very thoughtful about. I hope that the industry works in the next 18 months or so to reconcile those differences, because there's no reason for them. They exist only because of the lineage of the technology, not because there's a need for there to be different solutions.

When you guys think about this new type of operational database, maybe a question for you, George, and I don't know, Bob, if you have an opinion on it: why isn't MongoDB in this discussion?

That's a Bob question.

Well, I'm happy to answer it. First, let's separate this. What I was talking about before, when we were talking about the modern data stack, is the part of the data platform that is on the analytics side. That is where the data lakes or the data warehouse are the historical system of record for the business. There's another set of applications that are much more oriented towards working with people on an interactive basis. These are operational applications that work with data in its current state, the current state of applications and the things that are going on.
Now, in today's modern IT data centers, every company has many of these operational applications to run their business. And more and more, those applications are SaaS applications. They're provided by third parties, so you have very little control over that application and what it does. You are typically able to get the data out, put it into a pipeline, and leverage tools like Fivetran to move it into your data lake or data warehouse, your historical system of record, but those systems are totally different. Now, in talking about business applications that are operational applications that work with people, the standard database that people have used for 40 years really is a SQL database. There are a lot of really good reasons why people have used SQL as a choice for that. In particular, it provides the flexibility of the relational model together with a very well understood consistency model that works well for applications that require debit-credit style transactions. So there's that whole world that's existed. And the thing that's most interesting about that world is that you work with tables, and that's the data structure that SQL was designed to work with and is optimized for. It's been extended to work with semi-structured data, but it's somewhat awkward to do that and it's fairly rarely used inside most of these operational applications. It's more for analysis purposes. Now, what happens is that there are classes of applications that that model does not work well for. We ran head-on into it at Microsoft when I was running the server group and we were trying to move the Exchange product onto SQL Server. There were multiple attempts at Microsoft to move both Exchange and Outlook to SQL Server. And they all failed.

You wrote about that in your book. It's a square peg, round hole, I think you called it.

They all failed, and they failed because the data model inside SQL is a table, and it's very static in its nature and it has to be pre-declared.
And if you look at a chat session or a mail message, they're very dynamic in the properties they have, how many attachments they've got, et cetera. These things can change on an individual item-by-item basis. So the data model doesn't fit with the structured table of SQL. That's why the NoSQL databases emerged 12, 15 years ago to handle these sets of issues. They work quite well, in fact, for applications that are highly dynamic. In particular, most of them use an eventual consistency model, which is also appropriate for something like email, where you can operate offline and you want to have multi-master reconciliation and eventually have things come together. But that's very different from a debit-credit transaction, where you wanna make sure both parts of it are done or neither of them is done. The consistency model of today's document and NoSQL databases is by and large problematic for this style of application, because very, very few, in fact, I don't think any of the major document databases support a transaction between what they would call collections. In a document database, instead of storing things in a table, you store it as a semi-structured document where each level of the tree can have independent properties. Those are very dynamic in those databases, and the unit of consistency is typically a document, and those are held consistent. But if you have documents in multiple collections, which is similar to what you would have with tables in a relational join or relational transaction, you can't really support that. Well, what Fauna has done is try to take the best capabilities of every modern operational database: being a serverless database, providing it as an API, and supporting full global consistency, allowing you to have transactions that span the globe and are totally consistent, with data placed really anywhere you want it.
And yet it uses the data model of semi-structured documents to support the application. But it does so in a way that is fully relational, so it has the ability to do joins between documents and between collections. It also has the ability to do transactions between collections. So it brings together the best attributes of a relational SQL database, and its transactional consistency, with the data model that people tend to wanna work with in operational apps. And if you look, operational apps are written in languages like JavaScript, Java, Python, and others. They're all object-oriented, and the native memory model of those is embedded objects, which when serialized look like a document. In fact, JavaScript serializes into exactly a JSON document. That model is very coherent for applications, and it's very different from what people work with in SQL, which is typically third normal form, where they need object-relational mappers. I just wrote a blog on this. It appeared today on the Fauna site, if anyone's interested, and you can look under the resources and see a blog where I talk about this. It's called "Relational Is More Than SQL." I talk about this primarily from the operational database perspective, but I also mention it from the analytics perspective, because we're seeing the same issues in analytics, where effectively the table, the structured data that SQL wants to work with, is appropriate for many, many things. I always say if I wanted to build a new ERP system, I would use a SQL database underneath that for the general ledger. But I might not use it for the marketing system or for the billing system, where you want much more flexibility in the data model and the dynamism is very interesting.

Thank you for that, appreciate it. Go ahead, George. There's a lot there.
Let me key off that, because what you're describing is an evolution of the operational database foundation that feeds the modern data stack. But let me try and separate the concerns now. Before, we had these little databases that belonged to each microservice. So you really had silos, and the coherence, at least in the data model, had to come on the analytic data side, in the modern data stack, in the lakehouse. What you're describing is that you could at least have a coherent system of record across all the different microservices, because they could share one Spanner-like real-time system of truth, only it's got the more flexible data model and a better transaction model, so the application logic can be more stateless.

And it's not connection-based, which is really important when you talk about serverless applications.

Yes, and so then let's talk about the other limitations that still remain. You've got this brittle pipeline. Even if you have this coherent system of record that's real-time, you still have to move that into what you call, I think, the historical system of truth on the modern data platform side. Now, the pipeline that moves that is still brittle, because any evolution on one side kind of breaks the other side. We'll talk about potential solutions to that, but that's one known limitation.

Well, I'll just stop on that for a second, because I do think there's a point here. When your operational system is structured data and the result of what you're trying to move across the pipeline is a structured schema, it tends to be highly brittle. But if you're operating in a system that supports dynamic properties, it is actually straightforward to build the pipeline so that it can sense when a new property has been added, raise that, and indicate to the downstream applications that there's a new property. And it can replicate that property because the output is no longer a table.
It's a semi-structured document. With that semi-structured document, if you simply add a property, it does not break the existing things. What it does is say there's something here you're not making use of. So it's far better than breakage. It's dynamic because, in the case of Fauna, you can add this dynamic property, and on the downstream side it would be manifested in the SQL database as a JSON object, which typically is dynamic inside these SQL systems. And then they would typically flatten that. I mean, almost always the first thing you do with JSON for analytics is flatten it into some form of a table. That flattening would not see the new property, but it wouldn't break. Big difference.

Okay, that's helpful. And then we'll talk about how we might add technology that makes it so that the consuming side actually might understand some of the additions.

Well, and Fivetran has sort of already taken that step, because it will populate a catalog of properties. So if it sees a new property, it'll populate whatever catalog you want. It's actually not hard to do this. The mechanisms are kind of already in place. What's been missing is the flexibility, in particular, of the source databases.

Okay. Well, this is not going to fix the million applications that exist today that don't work this way, right? It's new applications that will benefit from this. Right, and it's going to take some time to develop that base. It's not going to be rewrites, or maybe some of that, but it's going to be the new stuff. Salesforce is not going to get rewritten tomorrow, honestly. They've been trying to rewrite that thing for 10 years. But this feeds into that second second time.

Yeah, so Alex, you want to bring up the next slide, if you could. We want to explore how things are changing. We touched a little bit on this with Fauna, quite a bit actually. And what that new modern data architecture is going to look like.
This idea of a digital representation of your business. Some people call it a digital twin. New types of databases, similar to what we've been talking about with Bob: RelationalAI, Fauna, et cetera. Interested in your thoughts on where AI fits. Bob, how do you see the data stack changing? What are some of the emerging trends that you're watching here?

Well, first of all, I think that we've matured. I think that the data stack has matured in a very fundamental way in working with structured and semi-structured data, and it's actually become pretty effective. I mean, it could be better, but it's quite good. The biggest area where I would still give us maybe a C grade is the simplicity of governance of this solution. Governance is to me still the outstanding issue that needs to be addressed more by the industry. And that problem is only exacerbated by the appearance of these data lakes, where you have a table view and a file view and you need to keep those consistent and properly governed, with only the right access to data. So those are some of the concerns I have today. But the modern data stack is very good at working with structured data. These modern SQL databases, whether it's Snowflake or BigQuery or the new Fabric, are good at slicing and dicing data of all sorts of sizes. They've become effectively a very good kitchen tool to use as you're preparing your data meal. And I don't think they're going to go away. They're foundational and they're going to be there for a decade or two decades, a long, long time. What's obvious to people, though, is that there's a lack of semantic information about the data. And even more fundamentally, there's no semantic information, or very little information, about the business and the business rules themselves. That is to me the big missing piece of the database. And it's also very directly related to the governance problem.
Governance suffers from the fact that we don't currently have information about our data stored within the modern data stack in a way that's usable. And here the real challenge is that, again, I keep coming back to the shape of data. We're so used to working with tables, and tables are very good things. They're very functional. You can use them for a lot of things, but they're not good for everything. As I pointed out earlier, there are attributes of operational databases where, by working with semi-structured document data models, you can get improvement. In the case of analytics, in particular when you try to work with the semantics associated with it, you just can't store these semantics in tables. It's just not an effective structure. You need something much more granular, and that's where this concept of a knowledge graph comes in. The idea is that you can create a graph of data and represent the data or the business or whatever you do. The truth is you could represent anything effectively in a knowledge graph, and you would represent that as nodes with properties and then relationships associated with them. What you discover is that they're very, very complex data structures, much too complex to handle in a SQL database. So I've been working with the team at RelationalAI for a little while, and they're building what I think will probably be the world's first relational knowledge graph. That leverages the underlying relational mathematics, which goes beyond what SQL can do, to be able to work with data of any shape. And in particular, there's the way they actually structure data. Where Fauna structures data as a document, because that's the natural data model for operational applications, RAI actually puts data in what's typically called sixth normal form, or what we now think of as graph normal form. It means it's the most highly normalized you can possibly make data, so that you can work with it and slice it, dice it, literally any which way.
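To make the idea of graph normal form a little more concrete, here is a minimal sketch in Python. It is an illustration only, not RelationalAI's implementation or API: the helper names (`to_graph_normal_form`, `relation`) and the sample driver, rider and trip records are hypothetical. Each wide record is split into one `(entity, attribute, value)` fact per property, so every attribute becomes its own narrow relation that can be joined with any other.

```python
from collections import defaultdict

def to_graph_normal_form(entity_id, record):
    """Split one wide record into one (entity, attribute, value) fact per property.

    Hypothetical helper for illustration; list values become several facts.
    """
    facts = set()
    for attribute, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            facts.add((entity_id, attribute, v))
    return facts

def relation(facts, attribute):
    """Return the binary relation for one attribute as {entity: set(values)}."""
    rel = defaultdict(set)
    for entity, attr, value in facts:
        if attr == attribute:
            rel[entity].add(value)
    return dict(rel)

# Made-up sample data in the spirit of the Uber example.
facts = set()
facts |= to_graph_normal_form("driver:1", {"name": "Ana", "rating": 4.9})
facts |= to_graph_normal_form("rider:7", {"name": "Raj"})
facts |= to_graph_normal_form(
    "trip:42", {"driver": "driver:1", "rider": "rider:7", "fare": 18.5}
)

# Join two narrow relations: the name of the driver on each trip.
driver_of = relation(facts, "driver")
name_of = relation(facts, "name")
trip_driver_names = {
    trip: {name for d in drivers for name in name_of.get(d, set())}
    for trip, drivers in driver_of.items()
}
```

In this shape, adding a new property to one entity only adds a fact. No table has to be altered, which is the flexibility being described.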
And relational mathematics provides incredible power to do that, including power to do recursive queries where you're going through levels of hierarchy and things like that. That's all possible to do, but it requires a whole new set of relational algorithms. They've been under development, the team's working hard, and we have expectations that we'll be out next year with a great product.

So Bob, let me follow up on that, because let's put that in the context of the limitations of today's data platform and what we need to support the intelligent data apps of tomorrow. You've talked about limitations on governance. Databricks made a big splash with Unity about heterogeneous data governance, and not just data, but all the analytic data artifacts. But one limitation is that you can't represent permissions policy that might require deeply nested recursive queries to support.

Every permission actually requires that. The problem is, if you look at the way roles are structured and groups are structured, it's a hierarchical structure, which means that it requires a recursive model. This is the challenge, right? I actually think it's great that Databricks is doing the Unity Catalog. I saw that announcement and thought that was a good thing, but it's just a step in the direction. They suffer from the same challenge that everyone else suffers from, which is that we don't have an effective way of representing this data model as we sit here in 2023.

And that's what I'm trying to find, the limitations of today's approaches. Also, you can't build the shared semantics that all new data apps would then build on, for building an Uber-like app where any object, like a fare or a rider or a driver, needs to be calling on any other app.
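As a small illustration of why hierarchical roles and groups call for a recursive model, the following Python sketch resolves a principal's effective permissions by walking a membership hierarchy of arbitrary depth. It is a hedged example under stated assumptions: the function name, the `member_of` and `grants` dictionaries, and the sample groups are all hypothetical, and this is not how Unity Catalog or any specific product evaluates permissions.

```python
def effective_permissions(principal, member_of, grants):
    """Collect permissions granted to a principal or to any group it
    belongs to, directly or through nested groups (transitive closure).

    member_of: {principal_or_group: iterable of parent groups}  (assumed shape)
    grants:    {principal_or_group: set of permission strings}  (assumed shape)
    """
    seen = set()
    stack = [principal]
    permissions = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guards against cycles in the group graph
        seen.add(node)
        permissions |= grants.get(node, set())
        stack.extend(member_of.get(node, ()))
    return permissions

# Made-up hierarchy: alice -> analysts -> finance -> all_staff
member_of = {
    "alice": ["analysts"],
    "analysts": ["finance"],
    "finance": ["all_staff"],
}
grants = {
    "analysts": {"read:sales"},
    "finance": {"read:ledger"},
    "all_staff": {"read:handbook"},
}

alice_permissions = effective_permissions("alice", member_of, grants)
```

A fixed number of joins can only reach a fixed depth of nesting, whereas this loop keeps following memberships until nothing new is found. That open-ended traversal is the recursive behavior referred to in the discussion.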
So maybe tie that into why today's products couldn't, well, I guess you sort of have, but explain how, once we have that foundation, it would change application development.

It changes everything. When we have an economical, usable, relational knowledge graph that takes the power of the relational mathematics and applies it to data of any shape and size, we'll be able to model business for the first time and actually create what you can think of as a digital twin of the business. Think about any business, the business that you're in and that you're a part of, and think about all of the attributes that it takes to run that business. What are the rules? How is the pricing done? All of those attributes. Where do those attributes exist? Where does the knowledge of that exist? It exists inside applications, many of which are SaaS applications, as we described earlier. So you don't even control it, and it's somewhat opaque what that logic does in most cases. It exists inside programs that maybe you wrote. It exists inside SQL queries and BI tools. It exists in Slack messages, whiteboards, people's heads, and documents. Documents become very interesting because a lot of the stuff is written down, it turns out, because there are a lot of instructions for people. I think one of the realizations that I have come to this year is that the root of the enterprise knowledge graph is the documentation that's written in human language. We'll just say English for the moment, but it could be any native human language. That is where most of the understanding of the business is. What we want to do is translate that informal written set of documents into formalized rules that can be put into a database in the form of these knowledge graphs that describe exactly what the business is supposed to be doing.
Now, eventually it is possible for those rules to be fully executable and actually run aspects of the business within the database and the knowledge graph, because relational knowledge graphs have the ability to execute business logic. But realistically, in today's world where we have dozens of existing systems, we're not gonna replace those things overnight. So I think what we'll wind up doing is defining what we want it to be, and that will become a management tool to ensure that all of the components in the organization are operating with the desired behavior. You define essentially the desired behavior inside the knowledge graph and the semantic model, and a governance system will ensure that every part of the organization is actually operating consistently with that. For example, that there aren't missing permissions or something like that.

Okay, so I was maybe thinking of it the other way, where those existing operational systems would feed the knowledge graph, and maybe this is what you're saying. Then I could interact with that knowledge graph, maybe with a natural language interface, ask it questions, and get answers to things that today I have to hunt and peck for because it's in Slack or it's in Salesforce or it's in somebody's head or it's in a Google Doc. How do you see these emerging approaches supporting that vision, and what exactly do these new apps look like?

Exactly what you just said, Dave. Once you have this knowledge in a centralized place, it becomes a repository that people can use to understand the business. And now that we really are seeing the breakthrough of the large language models, it's understandable how you could use human language, ask a question, and have a model respond.
But in the world of language models and incorporating those models into enterprises, the current in-vogue thing to do, which is a great thing, is called RAG, retrieval-augmented generation, where you take corpuses of data. One approach that's being widely used right now is to take a tool like Pinecone and vectorize that data in a vector database. Because you now have a semantic representation of it, when a question is asked, you can query that source of knowledge and then feed that into the language model for the answer. Vectorized data is just human language in written form that's turned into semantic vectors. In addition to that, the other major place where these models will be getting augmentation is from knowledge graphs. In a way it's the first place that happened. Google has been doing this to some extent for years in search. Google has a knowledge graph, and whenever you ask about a company or a place, what it shows you is the knowledge graph of what it knows about that place. Now what these search engines are doing is incorporating that knowledge into their language models, so that it can be used to answer questions in a more accurate way.

So, George, just as an aside, part of the reason I was asking about Mongo is that we talk about new operational databases and document databases, but we actually built, Bob and George, theCUBE AI. We took all 35,000 interviews and videos that we've done over the last 10 years. We've always transcribed them, and we built this retrieval-augmented generation, this RAG system, that uses Mongo. It uses Milvus, it doesn't use Pinecone. You can ask it a question, it'll give you an answer, and it'll give you clips of where that answer came from. It'll actually generate clips automatically.
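The retrieval step of RAG described above can be sketched in a few lines of Python. This is a toy under loudly stated assumptions, not theCUBE AI's implementation and not the Pinecone or Milvus API: `embed` is a stand-in bag-of-words counter where a real system would call an embedding model, the in-memory list stands in for a vector database, and the three sample chunks are made up.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(q, embed(chunk)), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Augment the question with retrieved passages before calling a model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Made-up corpus chunks for illustration.
chunks = [
    "a knowledge graph stores nodes with properties and relationships",
    "data lakes offer a file view and a table view of data",
    "eventual consistency suits email and offline clients",
]

question = "what is a knowledge graph"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

The resulting `prompt` would then be sent to a language model, a step deliberately left out here. Grounding the answer in retrieved passages is also what makes it possible to point back to the source clips an answer came from.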
But I would think that over time, I could-

Is that working pretty well?

It's working, it's really interesting. You go to theCUBEAI.com and it's in private beta. It does hallucinate. If you ask it, what does Bob Muglia think about the future of data apps? It'll give you an answer and it'll give you clips. It'll maybe say some things that you didn't say, but it's a really well-written answer. And so we're tuning that, but my thinking is that-

But you can tune that down.

Absolutely, yeah, and we're doing that. We're getting great feedback from the private beta. But it would seem to me that you could consolidate a lot of that around what today for us is Mongo. I would think we could do some, whether it's vector search or even what you're talking about with some of the future platforms, to really simplify. But I mean, we built this app, Bob, in literally weeks and had it into an MVP, and it was very inexpensive to do. So it just blows your mind what's possible. That's what's exciting.

That's what is really exciting about this new generative AI and what we've seen in the last year. I'm so excited about it because for the first time, there is a way to effectively bottle intelligence, to take knowledge about a subject and actually put the intelligence that's required to make that work inside one of these models. The tools are still rather crude for doing that, but they're improving at a very fast pace and the models are improving. It's a very, very exciting time. But what is very clear is that it's the combination of the intelligence the model provides together with knowledge in some form that is gonna make these things work really well. I personally have found that ChatGPT is great, but I like tools that are much more up to date.
And I've been using Perplexity to answer questions and it's pretty cool the answers you can get. I mean, it's current, these things are up to date and we're going to start to see more and more consumer and business oriented solutions that leverage this technology. Yeah, sorry, George, I took a little turn there. Well, let me follow up on that because Dave was talking about using, you know, the large language model on top of like an existing database. And my question is now let's extend that to where we have a relational knowledge graph that represents a model, a coherent model of the business. And then underneath, it organizes all the historical system of truth. Now, I have two questions. First, can you use an LLM, not a raw LLM but one as a development assistant, a copilot, a coding copilot, to help you build these digital twin applications on top of that system of truth? That's my first question. Would it accelerate the ability to, you know, build these apps? Yes is the short answer. And in fact, it's super important because one of the learnings we've had as we've built RelationalAI is it's really hard for people to build these digital twins and to think in terms of these semantic layers. We've not really been trained, most of us at least have not been trained to do it. The people that are the most trained are the business analysts that are really focusing on, you know, understanding the processes associated with a business. There are people that are, and they're often consultants that you hire, and they tend to think this way, but it's not something that most people learn, you know, when they were in college or when they're in their early formative days. So it's tough for people to kind of get that. And I think these language models will be very helpful. I also do think that we're gonna just start to train people differently. I think that, you know, we'll begin in the next few years to talk much more about declarative programming.
And, you know, we do, everybody does it. People learn imperative programming today where you take one step and you take another. One of the attributes of these knowledge graphs is that, you know, you're not declaring you do this first, then you do that. You declare rules, business rules. And the system determines the order that things happen based on the data that it has. And that's a very different way of working with things. But you said something really important in there, which is the people closest to that mindset of business rules are the business analysts. But there used to be a skills gap between the language they spoke and translating that into code for applications. And the LLM can help map the business rules that a business analyst understands down into a more domain-specific language like Rel or whatever your- I think the LLMs are helping today with imperative languages too. I mean, the copilots, the developer copilots, are one of the most effective first-generation AI solutions. And I think we'll continue to see that, but it is a shift again to move to a much more declarative way of describing things. So I got to take another tangent here, George, while you're thinking about the next sort of vector to go. I asked theCUBE AI, Bob and George, what does Bob Muglia think about the future of data apps? I'm going to read you the answer. I'll maybe try to paraphrase. Bob Muglia believes that the future of data apps lies in a shift from a code-first approach to a model-driven approach. He emphasizes the importance of companies like dbt Labs in incrementally building on the modern data stack and adding a semantic layer that describes the data. Muglia envisions a world where organizations define models first of their data and ultimately of their entire business.
He sees this as a major change in the way we think about writing applications. Rather than writing code to produce APIs that encapsulate data, Muglia believes that organizations will define models that encapsulate their business rules, logic and data. He acknowledges APIs will continue to be useful and play a significant role, but organizations will increasingly focus on defining models to drive their applications. He suggests in the next 10 years, organizations will move toward this model-driven world where models define both data and business process. This shift is supported by the developments in machine learning and the increased use of learned models in applications. He also highlights the need for infrastructure that can handle both these learned models and explicitly defined models, with relational knowledge graphs playing a key role. Overall, he believes that the integration of data and AI is inevitable and that data apps will become closer as AI technology advances. I don't know, Bob, how would you rate that? Good, I could have said that. I could have said that. It felt like I said that. It felt like it's repeating a bunch of things I did say in fact. But the interesting thing, the really cool thing is it gives me many clips where you're quoted and references with actual clips and snippets. That's the cool thing and that's what these more augmented models do, and we see it in the consumer search applications like Bing or Perplexity or Bard, things like that, where you have references to the underlying content, which is incredibly helpful if you want to dive deeper. I mean, it's a totally different way of working with information. I don't know about you, but my entire approach to internet search has changed completely this year. Totally different. I ask a model first. 100%, absolutely. And I'll use these tools to just make myself more productive.
I can do a lot more Breaking Analysis with things like AI. You find stuff out so much faster. It reminds me, it's at least as big of a transition as the internet was, where it used to be you had to go to a library and stuff to find this stuff out. I mean, I remember those days and now you... But Bob, let's talk a little bit about trying to extend today's data stack to support this new class of applications and what that pathway might look like. So companies might start marking up the semantics of their data with BI metrics. Like that's an easy first step. Can that support growing into full application semantics? And if there might be limitations in that growth path, how might customers sort of make that first step and then transition to something broader if necessary? Well, it is a good first step and people should do it. And people have been doing it frankly in BI layers for a very long time. This is nothing new. I mean, Cognos, I think, you know, this was one of their big things years ago, you know, Business Objects. I think they were talking about this ages and ages ago. But it is a good step. I think that getting to those full semantics is gonna require, I continue to say that we need the database to underlie it. This week I've talked to two or three different really small companies that are looking at collecting data semantics, you know, and I asked them, where are you storing it? And they tell me they're storing it in a YAML file. And I go, okay, that's not a bad first place to store it, but it's not a long-term repository. And obviously it's just not gonna scale. The problem you have is you can operate at really small scale just working with files, but you eventually have to be able to put these in something where you can work with the data through, you know, much more normal relational commands. Okay, following that up. I'm just focused on fixing the underlying infrastructure. That's my thing. I gotta get the infrastructure right.
You know, the whole modern data stack couldn't exist until Snowflake demonstrated that it was possible to build a system that could work with structured and semi-structured data at any scale. I mean, until we demonstrated that using SQL, you know, nobody believed it was possible. This is likewise, we need a new generation of technology and databases to be able to move the world forward. And we're just waiting for them to finish. They're very much underway. Okay, along those lines, where might someone get into trouble? Like a Databricks or a Snowflake or Microsoft, if they try to start modeling richer semantics than just BI metrics and their dimensions, where do they hit the wall? And then what would you layer on top? If you had to- They hit the wall on queries. They hit the wall on complex queries. Interestingly, there's two types of databases that people are tending to use to store this in today's world. They're using either a graph database like Neo4j or TigerGraph, or they're trying to use a document database and store it there, because if you think about these properties, although it's not a hierarchy, it is a graph. It truly is a graph. A hierarchy is a subset of a graph, essentially. So leveraging documents makes some sense. But here again, they hit the issues of transactional consistency unless they're using Fauna, and the query is not that powerful. I mean, query is much, much more restricted in these systems. And so that's where they tend to hit it. I keep saying, the problem customers have always wanted me to solve, which nobody can solve today, is Fred Smith was just terminated by the company. He was turned off. I mean, Okta, whatever, they turned him off, he has no access. Before he left, what data did he have access to? It's very hard to answer that question.
Very, very hard to answer that question. Okay, that's a governance challenge. Tell us about the expressiveness and the richness of the semantics that you have in BI metrics layers and then where you might have to go beyond what they can do. What happens is that if you start modeling, if you start looking at the models of these semantics and the relationships that exist between the data elements, they become a graph. It just becomes a graph very quickly. And so now the question is, how are you going to store that graph? And that's why I say you can use a bespoke graph database like Neo4j and it will work to a point. The challenges that these things tend to have are scale-oriented challenges, largely because they are not relational in their construct and they use pointers essentially. And so to scale that becomes very, very difficult. Or people try and use SQL. A lot of people try and do something with Postgres, but they hit brick walls with that. It's the classic issue you hit with database technology when the technology has not matured. And generally speaking, when the relational technology has not matured to solve a business problem, people try and solve it with the existing things, but it is like hitting your head against a brick wall. I remember this is the way it was in the late 70s, early 1980s when I first entered the workforce and we were working with hierarchical and network databases. They were a bear to work with. And SQL made everybody's life so much easier. And it will continue to make people's lives so much easier, except for these problems that SQL doesn't address, like these highly complex relationships, those graph-oriented problems. And metadata, it turns out semantic layers is one of the most interesting problems. There are many other problems by the way in business that model as graphs. The chemical industry has a lot of problems that look like this.
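To make the point about governance questions turning into graph questions a bit more concrete, here is a small Python sketch of the "what data did Fred have access to?" problem. Everything in it is assumed for illustration: the user, group, role and table names are invented, and a plain dictionary stands in for whatever graph or relational store would really hold this. The idea is only that access is rarely a single lookup; it is everything reachable from the user through memberships and grants.

```python
from collections import deque

# Hypothetical access graph: an edge means "is a member of" or "grants access to".
edges = {
    "fred": ["sales_team", "temp_project"],
    "sales_team": ["crm_reader"],
    "temp_project": ["finance_view"],
    "crm_reader": ["table:customers", "table:orders"],
    "finance_view": ["table:invoices"],
}


def reachable_data(user, graph):
    """Breadth-first walk from the user, collecting every table node reached."""
    seen = {user}
    queue = deque([user])
    data = set()
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt.startswith("table:"):
                data.add(nxt)      # a leaf: an actual piece of data
            else:
                queue.append(nxt)  # a group or role: keep following grants
    return sorted(data)


fred_access = reachable_data("fred", edges)
print(fred_access)
```

With a handful of nodes this is trivial; the difficulty raised in the discussion is doing the same traversal, consistently and at scale, across every system an organization runs.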
So a lot of data problems need to be solved by graph that are not management and semantic models. But the one that to me hits everyone is the semantic model. So let's... Guys, we got to wrap up here, George. Go ahead, one more and then I'm gonna... I wanted to ask about once we have this semantic foundation and the richly connected data for all the governance services, but also essentially the application logic now and definitions that are shared across all apps. You talked at one point about documents containing the semantics and the rules of a business. What might these end-to-end applications look like when you've captured the document flows? Like what was unstructured, but really now LLMs can pull the structure out of those documents. And they're part of applications that span what were office workflows, but they also have operational capabilities in them. And you have this new stack, what can you build? I think what you're gonna build first and foremost is the governance and management layers that let you understand the organization. And then I think from that, you're gonna begin to derive applications that you can build that are focused on given areas and take on some parts of your business. If you look at, again, you have SaaS applications that are running your operations, some bespoke applications, you have this modern data stack that you are deploying or you've just deployed and you're running with. And those aren't gonna go away. None of that stuff's gonna change. You're not gonna get rid of that stuff tomorrow. That stuff's in there for a long time. So what you do is you augment it with systems that focus on solving problems you can't solve. And that'll first work on, I think, the governance thing, but then I think we'll begin to use these knowledge graph databases to build data applications themselves. And we talked about this a little bit before the show.
We haven't had a chance to get into it too much during the discussion, but one of the interesting things is what is the consistency model of the analytic data. And focusing on transactional consistency is incredibly important in data. And it's particularly important if you want to operationalize that data, if you want to actually take action based on that data. The consistency model kind of can work itself out if you're looking historically over time. But if you're looking at data that's coming in and trying in near real time to make decisions on what to do, keeping that data consistent is very important. And that's why understanding the consistency model is critical. An analytic database typically uses a snapshot level of consistency, which means each table is consistent itself, but it isn't necessarily consistent between the tables at the same time. And that's why some of these systems start talking about using technology like strict serializability, which is a much easier model to work with in building applications. Today products like Materialize support that as an operational data warehouse. And I think more and more that's gonna be important. It turns out that what RelationalAI is building with a knowledge graph is actually strictly serializable. And that's very important because that system will be used to build applications. And again, having that high level of consistency is very important. So this is something people don't pay enough attention to and it's really important, really important. So Bob, last question, two-part question. So this notion of Uber for all that we've been putting forth, do you buy that premise and, if so, how long do you think it will take to unfold? Well, I do in the sense that, I mean, for all is a big word, right? I mean, Uber is a very complicated SaaS application with many, many complicated things that, frankly, some don't require.
I mean, it has complexity that many companies may not require, although certainly many other companies do. And if you look at any kind of transportation industry, et cetera, they have very, very similar sets of requirements. It's getting a lot easier to build these things. There are components that you still have to stitch together. I think over time, more and more of them will be incorporated into the modern data stack platforms that let people do it. In addition to the things we've been talking about today, I mentioned earlier, before we started, that one of the tools that came out of Uber that turns out to be very critical for that style of application is something to support long-running transactions. If you think about what an Uber driver has to do, they're called to pick up a client, they pick them up, they go through a route, they go through the stages, they finish it. That is thought of in Uber as a long-running transaction. And the tools that Uber developed were open sourced, and the company that supports that and is now driving it forward is called Temporal, and they build a system to help support that transaction system. Again, it comes back, and I'll say this again, to consistency levels. Uber understood the importance of transactional consistency, which is why they built the underlying foundation that is ultimately in Temporal today. And you need that level of consistency within your application and database systems. And Uber even went so far as to build these long-running transaction systems, which are now available for others to use. Okay, so not everybody needs that level of capability, but something like that, people, places and things, that representation of your business. Is that a decade-long sort of journey? Well, it's shorter than that. I think it's much shorter than that. The fact is, the biggest criticism I think everyone has today, and it's a good criticism, is that the modern data stack has a lot of components to make it successful.
And you have to buy things from a lot of people. Over time, those things will tend to consolidate into the platforms and that will change, say, over the next five years. But the thing that's interesting is that the pieces are there today by and large, with the exception that I keep coming back to, the knowledge graph, which is still a missing component. The pieces are by and large there together to build these solutions today. Awesome. Bob Muglia, always a great guest, thank you so much. The Datapreneurs, if you want to go deeper into Bob's head and his life and his perspectives, definitely pick up a copy. Thanks so much for your time and thank you, George. Thanks a lot. Thanks. All right, I appreciate you watching. This is Dave Vellante for Breaking Analysis and we'll see you next time.