From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. The recent Databricks Data + AI Summit attracted a large audience and, like Snowflake Summit, had a strong focus on large language models, unification and bringing AI to the data. While customers are demanding a unified platform to access all their data, Databricks and Snowflake are attacking the problem from different perspectives. In our view, the market size justifies the current enthusiasm seen around both platforms, but it's unlikely that either company has a knockout blow for the other. This is not a head-on collision in our view. Rather, Snowflake is likely years ahead in terms of operationalizing data. Meanwhile, Databricks likely has a similar lead in terms of unifying all data — all analytic data types, e.g., BI, predictive analytics and gen AI. Hello and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this Breaking Analysis, we follow up last week's research by connecting the dots on the emerging tech stack that we see forming from Databricks, with an emphasis on how the company is approaching generative AI, unification and, of course, governance, and what it means for customers. To do so, we tapped the knowledge of three experts who attended the event: theCUBE analysts Rob Strechay and George Gilbert, and AI market maven Andy Thurai of Constellation Research. Gents, welcome to the program. Thanks so much for your time. Let's jump right in and unpack the Databricks conference and what it implies for the company's emerging tech stack. The three big themes of the event are seen on the left side here. One, democratization of analytics: Lakehouse IQ, the company's knowledge engine; Marketplace, which is all around data sharing, with partnerships like Twilio, Dell, and Oracle of all firms; and Lakehouse Apps, which is really about safely running apps.
The second line there is a platform for gen AI app development: the MosaicML acquisition — I'm sure we're going to talk about that — so you can build your own models. They talked a lot about vector search, model serving, and the Unity Catalog for AI, where AI features and functions are integrated. And then the third, all-important line: governance for everything. The Unity Catalog with federated mesh — they talked about data mesh a couple of times — and monitoring tools. And then Lakehouse 3.0, which unifies all the different formats like Delta, Hudi, and Iceberg under Parquet. So guys, the keynote from Ali Ghodsi was, in my opinion, very, very strong. Matei Zaharia participated, as did Naveen Rao from MosaicML and the other co-founders. You had JPMC up on stage for quite a long time — very, very strong — as well as JetBlue and Rivian, not as compelling, but still obviously good use cases. And that was just day one, which was followed up on day two by Marc Andreessen and Eric Schmidt and other content, so a really, really meaty program. Rob Strechay, let's start with you. What were your thoughts on the keynote? Yeah, I thought the keynote was excellent on both days. I thought that they really did a good job. I think Ali got a little thrown off having Satya on stage with him and kind of hurried through his part a little bit, but he had a lot of compelling information about the direction they want to take those three areas. And I think a big piece of it was how do we own the analytics and even move up the stack. There wasn't a lot about data warehousing. There wasn't a lot about data lakes or Delta Lake and Lakehouse. It was a lot about how do we go forward from here in the use cases they were presenting, not a lot about the specifics underneath. So I thought that was really compelling. My favorite part from day two was really Eric Schmidt. I think he nailed it.
I thought his part of it was just phenomenal. And I think their demo of Lakehouse IQ was killer — one of the best demos I've seen in a long time out of them. Great, thank you, Rob. And then Andy, how would you generally grade it? First of all, do you agree — anything you'd add to Rob? How would you grade the AI and gen AI focus, which was a big, big part of the day one keynotes? Yeah, first of all, thanks for having me. I appreciate it. For a minute, I thought you called me Raven — I'm like, as in Game of Thrones? What are you talking about? So in any case, I agree with Rob that the Lakehouse IQ demo was really, really good. Look, at the end of the day, as we all know, Databricks is all about going after the data science people who are trying to build the models, right? They have the data, and they appeal to the data science audience — you build there. And then the secondary portion of that was toward the ML engineers, who will help productionize those models. So the personas they're appealing to are sort of threefold. One is the data engineering folks, who deal with all the data lineage, data governance, data security, and accessibility issues — they're appealing to those folks. Then the data scientists, for building and creating the models. And then the ML engineers, who will help productionize the AI/ML models. I think they had something for every one of the personas. And then by throwing in the new acquisition of MosaicML and the Unity Catalog, they're trying to combine all of that and put themselves ahead in the three-horse race between them, Snowflake and MongoDB. I thought it was quite powerful, though I would have preferred to see slightly more inclusive opportunities, particularly because they're trying to go after open source modeling and all that.
I would have loved to see their integration with maybe Hugging Face or with the other data source types, like the hyperscale cloud providers. There are some areas that are weak, but I think overall it was a very compelling set of announcements. Great, thank you. I mean, we're basically trying to take two events, which ran over six days, and jam them into one Breaking Analysis. But George, let's bring back the first chart and take us through this detailed diagram that you created about Databricks' emerging technology stack. George was at both shows; Andy and Rob were in San Francisco at Databricks. So George, you kind of caught the tail end, I guess, of the Databricks show, but had enough interaction with the ecosystem. So take us through this diagram. What are the salient points? Okay, so the key point is Databricks took their weakness and made it a strength. They're not as strong in managing data — they have an analytic data engine — but what they made their strength is this ability to harmonize and unify all data assets and analytic data products. And we've been talking a lot, Dave, over the last couple of months about the semantic layer and how you're going to need that if you want to build digital twins of the things in your business. Something's got to translate that down into the strings that a data manager takes care of. The company that has done this best is Palantir, and another friend of theCUBE is EnterpriseWeb. Both of these build applications on a workflow and semantic layer that allows people to build digital twins. And the Lakehouse IQ product, which Ali called the future of Databricks, is a backdoor way to start building a semantic layer: by having an LLM read the tables, the notebooks, the queries, the dashboards, the org structure — all these technical data artifacts — it starts to extract the business meaning of the data.
And that's what allows you to translate English, in the business terms of each particular company, into the technical data artifacts, nomenclature and org structure. It's much broader than the BI metrics we've been hearing so much about, like bookings, billings and revenues — but it's very broad for now, and not as deep. For instance, this wouldn't replace a BI metrics layer like AtScale or dbt Labs, or what Prophecy has built, which go deep and accelerate the development of data pipelines. But what you will do in the future is treat this as a platform for building LLM applications, where you talk to the platform in English, say, and you're accessing an LLM and orchestrating it with LangChain. These are the gen AI applications that are coming. And Unity itself, as a catalog — which is the foundation for this — is more than just a discovery engine, which is what traditional catalogs have been for IT. It doesn't just track data products and show you things like lineage; you can set policy for permissions, and it can push that down into things like Redshift or Snowflake. It's not there yet, but they announced it and it sounds imminent. Anyway, to recap really quickly: having this knowledge engine is what allowed Microsoft to take their business chat copilot and work across all the Office documents in an enterprise. Databricks is doing the same for all the analytic data artifacts. So there's an interesting comment in here, Rob, that George has in the slide: can Databricks SQL use AI/ML to short-circuit Snowflake's performance lead in supporting BI workloads? A lot of skeptics, including us initially, around that, but the reports from various companies are pretty positive. And then the other point: UniForm tables as a universal repository for analytic data was a big theme at the show.
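To make George's description concrete, here's a toy sketch of the pattern he's describing — a business question in English goes through a knowledge engine that translates it to SQL, with catalog-level permissions enforced before anything runs. Everything here is a hypothetical stand-in: the "LLM" is a lookup table and the catalog is a plain dict, not the Databricks or Unity Catalog API.

```python
# Illustrative sketch only: natural language in, governed SQL out.
# The catalog and translation table are stand-ins, not real APIs.

CATALOG = {
    # Unity-style catalog entry: table metadata plus an access policy
    "sales.orders": {
        "columns": ["region", "amount"],
        "allowed_roles": {"analyst", "admin"},
    },
}

# Stand-in for the LLM that maps a business question to SQL.
NL_TO_SQL = {
    "total revenue by region":
        ("sales.orders",
         "SELECT region, SUM(amount) FROM sales.orders GROUP BY region"),
}

def ask(question: str, role: str) -> str:
    """Translate a question to SQL, enforcing catalog permissions first."""
    table, sql = NL_TO_SQL[question]
    policy = CATALOG[table]["allowed_roles"]
    if role not in policy:
        raise PermissionError(f"role {role!r} may not read {table}")
    return sql

print(ask("total revenue by region", "analyst"))
```

The point of the sketch is the ordering: the permission check sits between translation and execution, which is why the catalog — not the LLM — is the real control point.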
Yeah, I think what they're aiming at is how you bring it all into one house, as they would call it, and how you format it and are able to work with it. Because to George's point — and to Andy's point, for that matter — Databricks is coming at it from more of the Python side of the house, whereas really, if you were a SQL person, you went Snowflake. And I think what they're trying to do is harmonize that. In fact, they demoed this: they showed building SQL and then transforming it from SQL into Python using Lakehouse IQ. That was showing that SQL-first can be a way to go, and even if you don't know the Python side of things, they'll help bring you across. So I think that also helps with the skills gap they're seeing from a data engineering perspective within the personas they're selling to. And Andy, the MosaicML announcement obviously was big. The reported headline number was, I think, $1.3 billion or so, but everybody sort of parsed through that and noted it was priced at today's valuation rather than the previous valuation, which was like $38 billion. So you can cut that roughly in half — it's really a, let's call it, $650 million acquisition. And then, ironically, Frank Slootman sort of answered the question as to whether companies like Snowflake and Databricks are overpaying for these assets, and he basically said no, in large part because of the talent they're getting out of these deals. So Andy, your take, particularly on the AI and gen AI component of this sort of emerging technology stack.
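The SQL-to-Python translation Rob describes can be pictured with a toy example: the SQL an analyst would write, and the equivalent computation in Python. The real demo used Lakehouse IQ and PySpark; plain Python over a list of dicts is used here only so the sketch is self-contained.

```python
# What the analyst writes:
SQL = "SELECT region, AVG(amount) FROM orders GROUP BY region"

orders = [
    {"region": "east", "amount": 100.0},
    {"region": "east", "amount": 300.0},
    {"region": "west", "amount": 50.0},
]

def avg_amount_by_region(rows):
    """Python equivalent of the SQL aggregate above: group-by average."""
    totals, counts = {}, {}
    for row in rows:
        r = row["region"]
        totals[r] = totals.get(r, 0.0) + row["amount"]
        counts[r] = counts.get(r, 0) + 1
    return {r: totals[r] / counts[r] for r in totals}

print(avg_amount_by_region(orders))  # {'east': 200.0, 'west': 50.0}
```

The skills-gap argument is that an assistant generating the second form from the first lets SQL-first people cross over without learning the Python idiom up front.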
Yeah, so I've got a couple of thoughts on that, but before going there, to the point about Lakehouse IQ and apps, one quick point I want to mention: the reason some of the data warehouse companies — particularly companies like Oracle — became really, really valuable to the enterprises is that instead of being used at just the data level, the database was tethered to the applications, which means you just can't untether it, remove it and go somewhere else. And Databricks is trying to do that combination with Lakehouse IQ and Lakehouse Apps — democratizing app building and moving it up to that layer. When your apps are tethered to Databricks at the application level, instead of just at the model level, that'll create a lot more traction, so it'll become a lot more valuable, right? That's a quick point on that. To talk about MosaicML: I did ask Ali when he was presenting — Rob was there — about the timing, why you're paying this much and what you're expecting. He had some answer to it, but my thought is that they're getting value for a couple of reasons. One: at the end of the day, they want to show people an alternative. The Microsoft way of doing it is, I'll build a model for you, I'll host it for you, and then you come and use it as much as you want — which could get pretty expensive. But what the other companies — the data companies, the model companies — are trying to do is say, I'll give you an option to build your own LLM using your own data. Their thinking is that with MosaicML, I can tell any enterprise: you can build your own LLM much faster using your own data, and run your own instance on our platform. That's the whole idea.
I help you train the model, you run it on my platform. And people are not talking about the inferencing cost — depending on how much inferencing you do against the models, that could also run fairly high. So I'm going to give you an end-to-end model; that's the goal Databricks is going after. It's not just the data scientist persona, as I was talking about, that they're interested in. It's: I have your data, I don't want you to move your data anywhere else, and I provide the data lineage and model lineage and data governance and security at the enterprise-grade level, which they've proven already. And then I hope you create the new LLMs and do inferencing on my data. And then with MLflow and the MLOps section they've got, I'll also help you productionize those models. So they want to become the one-stop shop for AI/ML — from data, to model, to model governance — the end-to-end lifecycle. Whether they're going to achieve that, we'll see. And they also made a big point about lowering the cost of doing that, democratizing it. There are a couple of schools of thought. I mean, Jassy was on TV recently saying there are only going to be six large language models. I was talking to George Kurtz, and he was basically saying, look, these things are going to get commoditized; the real value is in the data — bringing AI to that data is really where the differentiation is going to be. But you had a thought on this? Yeah, yeah. Back to the whole Delta UniForm concept that they came out with in product, I think it was really interesting how they were bringing not only Trino — another open source project that's used for data mesh — into the fold using Delta Lake and being able to bring that into the ecosystem, but also the Oracle stuff. I mean, it wasn't only Oracle; it was Adobe, moving from Iceberg tables to Delta tables using this process. There was a whole bunch of, how do you look at the entire ecosystem?
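A toy model of the Delta UniForm idea Rob is describing: the table data lives once as Parquet files, and each table format — Delta, Iceberg, Hudi — is just a metadata layer pointing at the same files, so any engine reads through its native format. The class names and structure are illustrative only, not the actual on-disk layouts of these formats.

```python
# Illustrative sketch of "one copy of data, many format views" --
# not real Delta/Iceberg/Hudi metadata structures.

class UniformTable:
    def __init__(self, name):
        self.name = name
        self.parquet_files = []   # the single copy of the data
        self.metadata = {}        # per-format metadata views

    def append(self, filename):
        self.parquet_files.append(filename)
        # every format's metadata is refreshed to see the same files
        for fmt in ("delta", "iceberg", "hudi"):
            self.metadata[fmt] = {"snapshot": list(self.parquet_files)}

    def read_as(self, fmt):
        """Any engine reads through its native format's metadata."""
        return self.metadata[fmt]["snapshot"]

t = UniformTable("sales.orders")
t.append("part-0001.parquet")
t.append("part-0002.parquet")
assert t.read_as("iceberg") == t.read_as("delta") == t.read_as("hudi")
```

That's the ecosystem argument in miniature: an Iceberg reader, a Delta reader and a Hudi reader all see the same table without a copy, which is why moves like Adobe's become a metadata migration rather than a data migration.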
And I think both companies actually saw Iceberg as a potential threat — somebody coming out with an Iceberg-based product — to the point where they did a demo at Databricks showing how they could actually provision Iceberg tables faster using their tech than you could with native Iceberg, which was pretty crazy in that respect. Well, at Snowflake last year there was sort of a delta: if you did it inside of Snowflake, it was faster. Now they're saying they've sped up native Iceberg. So you're right — Iceberg is very, very important to both of these companies. Okay, let's bring up the next detailed graphic, which describes the differences and similarities between Snowflake and Databricks. So George, we've got the differentiation on the right-hand side — and we'll stack these in the written piece so you can see them better — but Databricks and Snowflake are now both unifying analytics. George, you use the example of Snowflake dominating like Oracle, which of course Snowflake would bristle at, but there are similarities, and of course the founders came out of Oracle. And you have another analogy for Databricks as well. So why don't you take us through this chart and explain all the beautiful detail here? So Snowflake's belief is that all the analytics in the world aren't great if you can't then either push a button or have it automatically operationalized in the form of a transaction. It's sort of like when we had the web and you had fancy catalogs on web pages, but there was no button to buy something. Similarly here, if you want to take an analytic data product like a dashboard — whether it's a business intelligence dashboard or something with predictive analytics in it — you would have to embed it in an external application to actually do something.
Snowflake is saying: put it in our stack and you have one database management system that both serves the analytics and then turns them into a transaction that does something. Databricks is saying: well, we'll give you this application layer, as Andy was saying, with this knowledge engine — which is really the semantic layer, Lakehouse IQ — and that's very much like Palantir. Palantir sits on top of legacy applications, and that's how you build your digital twins. That's, at a high level, the difference in approaches. The key thing is Databricks really has captured the hearts and minds of data scientists, ML engineers, people working with semi-structured and unstructured data, and you could feel the energy of those people wanting Databricks to take them into the generative AI revolution. It's that energy, and the fact that these people already work with semi-structured and unstructured data. Even if the Databricks technology isn't five years ahead of Snowflake in that area, it's the mind share they have: they've been working in Python for years, and they have a fairly mature set of libraries, and now tools, to add on top of that. I will say one last thing about everyone wanting to build their own LLM: you're going to have these big, general-purpose LLMs that are very high end, and it will make sense to have specialized ones, trained or fine-tuned for certain tasks, that are much cheaper to run. It's the tools for doing that that both Snowflake and Databricks are trying to provide. Thank you, George. You know, Rob, I want to bring something up that George and I started to explore, and I think you and I have talked about it as well.
When you saw the first version of this chart, you were sort of encouraging George to pop the data engineering up to a higher level, and I think that's what the dotted line implies. But there's a discussion going on, which I think we started last week: some of the customers we talked to said, you know, we want to do a lot of that data cleansing and data engineering outside of Snowflake because it's cheaper, and then we'll bring it in. Snowflake says the exact opposite. In fact, I asked Frank Slootman about this. He said, no, no, that's not true; our early examples are saying it's way more efficient, because you get all the governance and all this wonderful stuff wrapped in our promise, essentially. There's a nuance here, which is that Snowflake bundles in the Amazon charges; Databricks doesn't. So we're trying to figure out: is it perception — because the Amazon charges show up as part of Snowflake's bill, it looks really expensive — or is it actually cheaper to do it outside of Snowflake and then bring it in? As well, Databricks is encouraging people to do it all inside their one data platform. What do you make of all this? Yeah, I think when you look at how Databricks architecturally has gone to market — using a lot of S3 or object-based storage underneath — it behooves them to push on this as a cost-competitiveness issue, because it helps them with those cloud providers: they're selling more capacity, but at a cheaper price. Whereas when you look at how the Snowflake stack was built, early on, it was really built on EC2 and EBS, and they then somewhat begrudgingly went into S3. So I think the stack they use underneath the hood is more expensive just out of the gate.
So there are certain things — which is why last year they embraced bringing things together using Iceberg tables, being able to push into S3 — that give them cost competitiveness. And I think that was also a big piece of their whole message this week: in the first couple of keynotes that I caught, it was really, how do you drive down the cost? How do you make it not as expensive? I think that was the key. It's interesting. If you go back to the database wars — Oracle, Informix, Sybase, IBM — it wasn't blatantly obvious who was going to win back then. So it's interesting, George, to hear you talk about Snowflake in an Oracle-like context. It's very unclear right now. And as you say, the market is so big, and these personas, Andy, to your point, are so different today, that it's not a head-on collision. There seems to be opportunity for both, particularly, Andy, when you bring in the LLM piece. What's your bottom line on Databricks and what they presented at the show from the standpoint of gen AI and LLMs? Yeah, so before going there, a quick point about cost. I actually did get the feeling that Databricks can be a lot cheaper. I had conversations with three large customers at the conference, and two of them are actually big Snowflake customers, and they are moving into Databricks for a couple of reasons. One, building models was extremely easy for them with Databricks versus what they're doing now with Snowflake. And two, these are customers with anywhere between 200 and 2,000 models in production — the largest one had about 2,000 models in production — and they were suggesting their overall cost was a lot lower. Now, potentially Databricks showcased customers who would talk to me, but that's what I heard.
So cost-wise, I think Databricks might have an advantage when you're talking about machine learning and AI models getting into production and so on and so forth. Talking about MosaicML, I see a couple of reasons why they acquired them. One, obviously, MosaicML is known for its open source LLMs — the MPT LLMs. I didn't realize this, but that's the most downloaded open source LLM as of today, with close to three and a half or four million downloads. So the user base — four million data scientists or users, even casual users — you bring them into your fold. That makes it a really good, key acquisition. And the second one, which somebody suggested and I didn't realize until I had a deeper conversation: MosaicML somehow acquired, or has, a pile of GPU inventory that they use to train ML models, accumulated over the course of the last year or two when they started doing this. That's apparently quite a huge number. And that goes back to something I was wondering: how come NVIDIA chose to go to the Snowflake conference and announce their partnership with them, instead of with Databricks at their own conference? I was a little baffled by that. But now I'm thinking maybe they tried that, and because it didn't work out, this is their fallback strategy — to acquire some GPU assets. That's another possibility. And of course, showing the enterprises an easy way to train an LLM — your own models — is the way it's going to go forward, because nobody's going to rely on an existing ChatGPT or other hosted model for everything; the inferencing cost could be extremely expensive if you start using them at scale. So if you have to start building your own LLMs, what's the best way? Go to a company that has a pile of GPUs and the knowledge base available to help you train, and then show them how to do it.
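Andy's inferencing-cost point lends itself to a back-of-envelope comparison: a big hosted general-purpose model versus a smaller fine-tuned one you run yourself. The per-token prices below are made-up illustrative numbers, not any vendor's real pricing — the gap, not the absolute figures, is the argument.

```python
# Back-of-envelope inference economics. Prices are hypothetical.

def monthly_inference_cost(tokens_per_month, price_per_1k_tokens):
    """Dollars per month at a given per-1,000-token price."""
    return tokens_per_month / 1_000 * price_per_1k_tokens

TOKENS = 500_000_000  # say, 500M tokens of traffic a month

big_hosted = monthly_inference_cost(TOKENS, 0.06)    # hypothetical $/1k
small_tuned = monthly_inference_cost(TOKENS, 0.002)  # hypothetical $/1k

print(f"hosted: ${big_hosted:,.0f}/mo  fine-tuned: ${small_tuned:,.0f}/mo")
```

At volume, a specialized model that's an order of magnitude cheaper per token dominates the total bill, which is why "train your own smaller LLM on your own data" is the pitch.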
Whether they're going to execute based on the numbers they've got, I don't know — we'll have to see. But it looks like a decent acquisition on the surface. Well, and again, Slootman addressed this when he was asked about Neeva. He's like, look, it's really hard to get the talent. Then the second point there, George — you saw this — is essentially Snowflake betting on the NVIDIA stack, containerizing it. And your premise, George, was it hopes to sort of leapfrog the ML/AI tool chain that Databricks is so famous for. Although from what I saw at the Databricks event, I think Databricks is doing that to themselves — they're trying to leapfrog themselves. I think you're right, Dave, that Snowflake saw this as a chance to reset, as they called it — as Christian Kleinerman actually said on the CUBE interview with us and with NVIDIA. And I think what happened is Databricks did a really hard pivot sometime in the last six months. And the MosaicML tools really are, as Ali said, "it just works": you fill out a configuration file and it trains. Now, not everyone is going to want to train their own model, but what was stunning was to hear Patrick Wendell say that, using Databricks' existing tools, 1,500 customers had already trained LLMs on the Databricks platform — in the last 12 months, I think it was — and their consumption of GPUs is growing 25% month over month. So there is demand out there to build specialized models, and Databricks is trying to say: hey, between what we've built so far and what we're buying, we're going to be the most accessible and easy-to-use place to do this training and fine-tuning. That was impressive. Yeah, okay. One of the big things Databricks focused on was sort of reimagining the database.
Instead of going after Snowflake directly — they didn't say that, they didn't have to — they had Reynold Xin up on stage, a co-founder and database expert, talking about how they, Databricks, are thinking about reinventing the data warehouse. And they say data warehouse all the time, which is why Snowflake doesn't like the term data warehouse — so you see that little marketing tit for tat. Basically, Reynold was talking about eliminating the trade-offs between query optimization (i.e., performance), cost and simplicity, and his argument was that Databricks has figured out how to give you all three with no trade-offs. He mentioned "some company's" search optimization service — of course, he was talking about Snowflake, although he didn't mention them by name — and how that service was expensive, citing the trade-off right there, and other deficiencies. Now, the other thing Databricks stresses is its openness, of course. That's always been their strong card. And I'd like to have you guys listen to Databricks co-founder Matei Zaharia on theCUBE with John Furrier addressing this topic, and then we'll come back and talk about it. Please play the clip. One of the big things we've always bet on is basically open interfaces. So that means open storage formats, so you can use any computing engine and platform with it, and open APIs like Apache Spark and MLflow and so on. Because we think that will give customers a lot more choice and ultimately lead to a better architecture for their company that's going to last for decades as they build out these applications. So we're doing everything in that way, where if some new thing comes around that's better at ML training than we are, or better at SQL analytics or whatever, you can actually connect it to your data.
You don't have to re-platform your whole enterprise, maybe losing out on some capabilities you like from Databricks, in order to get this other thing. And you don't have to copy data back and forth and generate zillions of dollars of data movement. Yeah, great. Okay, so those are compelling statements that resonate. There are two sides to this: you've got the open standards and you've got the de facto standards. And history would actually show that, when you think about an Oracle, they were able to deliver unique value. At the same time, it's a new world, and to Matei's point, things change so fast you want to be able to plug them in. So Rob, on those two fronts we just talked about, what is your take? Yeah, I'm on record multiple times saying don't bet against open, and open source in particular. And look at where they went with MLflow — well, Delta Lake is open source, which is that data warehousing technology, and the Lakehouse is really based on Apache Spark and MLflow. They actually had two additional open source contributions that they were bringing and open sourcing during the event — Reynold went into a whole bunch on open source. And I think, again, it's putting that barrier up against Snowflake, saying: you're closed source, old school, Oracle-like; we're new school, open source. They're really using it as a moat, I guess you could say, against Snowflake. And now George, of course, Snowflake would retort: well, we're giving you access to open formats, all data types, many, many different query options. What's your take on this, George? I think it was interesting to see how Delta Lake was downloaded something like, I think, 500 million times, and Spark something like a billion. But the big advantage is Databricks isn't just saying, look, we're open source; they're saying, look, we're going to run it for you. That's the value of the Databricks service.
They're running this software for people. And the openness is the format now — the UniForm format, where it's all three: Delta tables, Iceberg and Hudi. That is a big advantage, because they're saying: put your data in any of these formats and you can run any compute engine on it. That is more open than Snowflake right now, which is basically just supporting Iceberg tables, either managed by Snowflake or unmanaged, which means Snowflake can access them. It's not a huge difference, but Databricks is still drafting on the coattails of their original open source roots. Yeah, and I've got the stats here. Spark is a billion downloads per year, Delta Lake half a billion downloads per year, and then MLflow 120 million downloads per year. So pretty substantial numbers there. Okay, I want to add some context here and bring in some ETR data. ETR began tracking Databricks' entry into the data warehouse market — Delta Lake, Rob, which you just mentioned — only recently. Let me unpack this chart. This shows the granularity of ETR's net score. The lime green is new customers — the percent of customers adding the platform new. Spending more, that's the 54% to the right of it. Then spending flat, plus or minus 5%. Then that little pink, which is tiny, is spending down 5% or more, and the red is churn. So there's virtually no churn since they started tracking this. You subtract the reds from the greens and you get net score, which is that blue line, and you can see it's popping up. And then that yellow line down at the bottom is presence, or pervasion, in the data set — an indicator of market share or presence inside the data set; not overall market, but an indicator. And if you now compare that with the Snowflake data inside the ETR data set on the next chart, here's what it looks like.
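The net score arithmetic Dave just walked through is simple enough to write down: subtract the reds (spending less, churn) from the greens (new, spending more), while flat spenders don't count either way. The 54% "spending more" figure is from the chart above; the other percentages below are illustrative fillers.

```python
def net_score(new, more, flat, less, churn):
    """ETR net score: (% new + % spending more) - (% spending less + % churn).

    The five buckets are percentages of the surveyed customer base,
    so they should sum to 100.
    """
    assert abs(new + more + flat + less + churn - 100) < 1e-9
    return (new + more) - (less + churn)

# A Databricks-like profile: heavy green, virtually no churn.
# Only the 54% figure is from the chart; the rest are illustrative.
print(net_score(new=20, more=54, flat=24, less=2, churn=0))  # 72
```

A score in the 70s is why the blue line "pops up": almost three-quarters of the sample is net-adding spend.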
So much more history, which underscores the maturity that we were talking about in terms of the data management platforms. And if you do the reverse, i.e. take Databricks' strong suit, which is ML/AI, and compare it with that of Snowflake, you'd see the opposite, where Databricks has much, much more history. In fact, it's probably even more favorable for Databricks, as Snowflake is really just entering that space. So guys, let's talk about the question that everybody debates. Is it harder for Snowflake to encroach on Databricks' territory, or for Databricks to get into database and data management? Or is the market so big it doesn't even matter, Rob? Yeah, I mean, I think the market is so big it really doesn't matter. I think there's so much to steal from other people, besides just stealing from each other. And I think we've left out, and on one of George's charts he has it, with UniForm and with some of the Delta Lake or Delta Sharing, things like BigQuery. BigQuery is still out there and used a lot for a lot of the analytical workloads that are out there. There's still a lot of Oracle out in the world. Both of those have not gone away yet, and there are still other things out there that are gonna play to the strengths of these two competitors. And I think they can go after those folks as well. And I think they are, especially when you look at everybody going after the ten or so different databases that Amazon has. None of them have bubbled up to be this kind of competitor, unlike BigQuery. And I don't even know, I think Microsoft's on their fourth version of an analytics platform now with their newest one out there. So again, you start to look at it, there's a lot for them to go after. The TAM is just massive still. 
So George, you went into the Databricks event kind of with the attitude that Databricks was largely middleware, and that unless they have a database or data management platform and can really get heavily into that BI space, it's gonna be harder for them to justify their market cap, which we think is now down around the low $20 billions. Did you change your mind at all? Yes, and if I had a hat and a scissors and something palatable, I'd cut it up and try and eat it. But what I would say, and I may be dating myself here: Databricks started out like Informatica. This was a data engineering pipeline that took raw data and created sort of normalized tables that were ready for analytics. And then many shops had both Databricks and Snowflake, and Snowflake would serve the analytics to BI users who wanted dashboards, highly concurrent and interactive dashboards. What they're doing now, to mix metaphors a little bit, is taking what Informatica did well with pipelines in the on-prem world, and adding the application server that WebLogic made famous. So they're doing both the data engineering and the application platform. And they're not yet a database management system that does operational and analytic data. But by owning those two pieces, they're trying to harness the data engineers and the data scientists and then build these analytic apps on this layer that's a combination of Unity and Lakehouse IQ. And that's essentially what Palantir has done. They're becoming, or trying to become, an application platform. And Snowflake, when I call them Oracle, I mean it in complimentary terms: they are a dominant DBMS that unifies all data types behind one query and transaction manager. So they're two very different strategies. Yeah. Andy, I'd love your thoughts on the market and the opportunity for each of these. It feels like the TAM is quite enormous, and then LLMs and generative AI just blow the top off the thing. 
Yeah, so the TAM is big because, again, the potential everybody's talking about is huge, right? I'm actually working on an article on this front, which should be published soon, on the AI world war, as they call it. The alliances are drawn, right? Predominantly, model creation on the data is a three-horse race, and the third one we didn't even discuss that much today is MongoDB, with its own vector search and the whole nine yards, and the announcements they're making. That's one horse. The second would be Snowflake, which acquired Streamlit and is now working with NVIDIA, with their native application framework. So they're going after the same market. And then of course you've got the Databricks combo that they announced. All three of them are trying to go after it, saying, you know what, we have the best possible way for you to create, whether it's LLMs or small language models or machine learning models, and get them into production. So that's the overall view of these three horses. But then there are also other players in the market trying to partner with them to see how they can get a piece of the market too. A good example of that would be AWS providing Lego-like components, as you and I talk about all the time, whether it's Hugging Face bringing models in or some other way of doing that. And then HPE takes a totally different angle on this. They're like, you know what, forget about all of that. We will give you the high-density, super-density as they call it, supercomputer service. All you've got to do is tell us what the model is and then throw things at us. We'll figure out a way to do it. You don't have to configure things all the way down to the lowest possible level. That becomes an issue only if you're talking about, you know, limited GPUs, limited availability, limited configuration, whatever. Don't worry about any of that. 
Just give it to us, we'll take care of it for you, and then it'll be cheaper, more compliant and sustainable, or what have you. So everybody has their own story. They're all telling their own version of the story. Which one is going to win? It's hard to predict right now, but they all have their own different versions of the story, and the TAM is so big right now, the way people are looking at it, that everybody has got an opportunity. You know, it's going to take maybe a year or two to figure out who's the real winner in this, but they all can score some sizable market. Yeah, and I think the market has grown so much. You know, the tech industry used to be winner-take-all, or certainly winner-take-most. Not really the case in cloud, although recently leaked data suggests that Azure maybe is not as big as we thought, but it's still freaking enormous, as is GCP, even though they're a distant third. But I'm glad you guys brought up the cloud players because they've all got their own sort of database strategies. BigQuery, really solid. Amazon, you know, Amazon is going to stitch together all those different databases and try to minimize the ETLing. And all of those, by the way, are above that magic 40% mark in the ETR data: the cloud players, Snowflake and Databricks are all, you know, either at or well above the 40% net score mark, which is an indicator of momentum. That speaks a lot to, again, the size of the market and the momentum these guys have. And then Mongo was just below, but is making some moves and has had a lot of momentum lately. I want to wrap with the five big themes and get you guys to have a think, while I'm chatting here, on your bottom line. Yeah, the big five themes as it relates to the expanding Databricks universe. Last year we talked about the expanding Snowflake universe, and you can see both universes are big, right? 
So broadening access to analytics was a big theme. Major investments, of course, in generative AI with the MosaicML announcement. AI apps with a lot of optionality; they really stressed different options. Unified governance is a huge thing. George, I'd love your thoughts on that in terms of, you know, where it stands relative to Snowflake's promise, because that really is the fundamental strength of Snowflake. This single-platform strategy we're hearing from both companies. And then again, this expanding ecosystem. We didn't talk much about it. I didn't physically see the ecosystem at Databricks' summit. The ecosystem at Snowflake was awesome; it was so big they had to go into two halls. George, I'll start with you. Your premise going into the event, again, was that Databricks was largely selling middleware. Would you eat your hat on that premise? So the story's evolving into data management in a fairly big way. What's your bottom line? These are the new application platforms, because now that we're building applications that program the real world, the APIs are a company's data assets. So that's why the cloud vendors care about data platforms, and the two leading ones now are Databricks and Snowflake. But there's an issue that came up before about people moving their pipelines off Snowflake. Whether it's perception or reality, the very vendors that defined the modern data stack, Fivetran, dbt, and then companies building on dbt, all told me they see a lot of movement off of Snowflake for the core data engineering pipeline. And whether it's because the Snowflake engine was optimized for interactive and concurrent queries as opposed to batch workloads, I don't know, but there is that perception. The reason I bring that up is that it breaks the unified governance story that Snowflake offers. Databricks offers unified governance for heterogeneous data assets. That's the genius of Unity, and that's what makes it so strategic. 
They're trying to say, we'll govern assets wherever they are. So these are the two platforms for building the data applications of the future. And I will say that the energy behind the generative AI enthusiasm was captured more by Databricks than by Snowflake, and even more by Databricks than by Microsoft when I was at the Build conference. Yeah, and while Snowflake, I think, had a really strong story about building apps on its platform, the number of apps that we saw was limited. So they've got a ways to go. I think it was maybe a couple of dozen. They're off to a good start, but we've got to see more there. And they're bringing together their dev conference and Snowflake Summit next year, so we'll be able to track the progress there. Rob, what's your bottom line? Yeah, I agree with everything that George said, and I would even go a little bit further and say that Databricks is a little bit behind on their ecosystem. And to your point, I think Snowflake's ecosystem was vibrant, very fervent. It's almost the same people who were at Snowflake Summit as well, so I think there's heavy overlap between the two. Snowflake announced Snowpipe Streaming, which is supposed to be kind of a play at moving them from being seen as batch-oriented to more of that real-time data ingest. And to me, again, it doesn't work, it doesn't not work; other people have worked around it already and built their own homegrown solutions. I think Databricks still has a lead in that real-time personalization that George brought up with Fivetran, taking that first-party data and being able to do something with it in a data product. I think Databricks is talking to the data engineering people and above, whereas you still have a lot of Snowflake talking to the SQL crowd. The word SQL was used an inordinate number of times in the keynotes that I saw. And I think that's not where people are working, and that plays to Databricks' strengths going forward. 
I think their marketplace, their apps and their ecosystem are less mature, but they definitely have a huge amount of momentum behind them, and that momentum was on stage at the summit this year. All right, thank you. All right, Andy, you bring us home. What's your bottom line? Okay, so my view is that in a couple of areas, Databricks has a slight leg up. For example, things like data lineage; we talked about model lineage, model creation tracking. Things like, when you're creating models, the visibility of who can create what models and where they can be visible. They've got a slight advantage over the other two there. And then also, how do you productionize all of these models, whether it's machine learning models, the MLOps capabilities, or, when you create LLMs, the LLMOps capabilities. That's a huge market now, LLMOps, and the other two guys are not thinking about it, whereas Databricks is slightly ahead on it. So, among the comparable capabilities of model creation, model maintenance, model management, and data governance, I would say Databricks has a slight advantage. But there's another thing that we didn't even discuss. It's about having visibility, inferencing, privacy, and all that when you're deploying the models. When it comes to, can you deploy the same model in the EU, and if you get the data, what kind of regulation you'll face there, that operationalization portion, none of them have it now. I call it the unknown unknowns. All these enterprises, when they hit that stage, are going to look around and say, you know what, I need to build these things; who's going to provide me the most? Right now, nobody's thinking that far ahead because not many LLMs have gone to production to face these issues. When they come up, then they'll probably regroup and figure out how they can do this better. 
So that's going to come up, and it's going to come down to who can adapt faster and provide those solutions as we move along at this fast pace. So my bottom-line conclusion is, in this race, I cautiously declare Databricks the winner compared to the other two, but that remains to be seen. Well, I have to say the keynotes, which sort of break from post-COVID convention, were almost three hours, but they held our attention because they were so meaty and so strong. Ali Ghodsi was really quite an awesome moderator and participated throughout. I also really liked the way that they showcased the founders. I think they had four founders on stage. Ali and Matei are always front and center, but they brought in the other founders; I don't know what the history is of those other guys, but anyway, they had all four on stage. I thought Snowflake really understated the founder presence. Now, Thierry rarely goes in front of folks, right? But Benoit usually is front and center, and he was there, but I didn't think he got enough time. I would have liked to see more of that. So the fact that they were able to hold our attention for so long was pretty compelling. The last thing I'll say is, I'm kind of waiting for this IPO so we can actually unpack some of this stuff. They've been giving us hints. You would have thought, given the tech run-up in the first half of this year, they would have taken advantage of it. I think it's the penguin on the iceberg: they don't want to be first, and now people are sort of tapping the brakes, so we'll see what happens there. But I want to thank you guys for helping us and our audience really understand and connect the dots. There's a lot here and it's a complicated topic. I think there are a lot of customers out there trying to squint through it, and your expertise is really greatly appreciated. So thank you. 
All right, I also want to thank Alex Myerson, who's on production and manages the podcast, as well as Ken Schiffman. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters, and Rob Hof is our editor-in-chief over at siliconangle.com. He does some great editing. Remember, all these episodes are available as podcasts; wherever you listen, just search "Breaking Analysis podcast," and we appreciate you subscribing. I publish each week on wikibon.com and siliconangle.com. If you want to get in touch, email david.vellante@siliconangle.com, DM me @dvellante, or comment on our LinkedIn posts. And check out etr.ai. They've got great survey data, the best in the enterprise tech business. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching, everybody, and we'll see you next time on Breaking Analysis.