Hey guys and girls, good morning. It's theCUBE, the leader in live tech coverage. This is the beginning of day two of our coverage of Snowflake Summit. Lisa Martin here with Dave Vellante and George Gilbert. They're going to be connecting the dots of Snowflake's roadmap. They did a great breaking analysis piece last week. If you haven't seen it, we're going to be double clicking into that now. Dave, George, good morning. Good to see you guys. Yeah, it's quiet in here because everybody's in the keynote. Everyone's still in the keynote. We're going to be talking about the future of enterprise data apps. Dave, kick us off. What's the overall summary of the breaking analysis that you and George just published? Well, George and I started, we've collaborated for many years, but we really started upping our game in the last several months. And we started by taking a deep dive into Databricks, looking at their stack and some of the potential disruption scenarios there. And then we took that to the next level with Snowflake. We did a lot of research and tried to better understand what to expect here. We had no hints and no NDAs from Snowflake, but we did have a lot of discussions with customers and other executives and technologists. And then we also went way out into the future with Uber. We had Uday Kiran Medisetty on to look at the future of data apps as the combination of people, places and things, the digital representation of your business, the digital twin, if you will, and how you turn that into something that the database can understand in real time at scale. And so we were trying to lay out what the landscape looks like. And I guess I would say this, George: this notion at the high level of all data, all workloads is the high-level messaging that you get from the likes of Frank Slootman. You know, very good, very strong, very powerful. But when you talk to the technical people, when you go deep, they really are solid.
They give you a sense as to how they built this platform. The challenge that we've had, and I think that many people have, is that when you listen to their product announcements, it's very stovepiped. It's ironic that they're trying to break down all these data silos, but their middle messaging, their product messaging, is very, very stovepiped. And it's sometimes hard to connect the dots. On the one hand, it's great that we get the fire hose of announcements from Christian Kleinerman, but they don't do a great job of describing their real advantage, George, which is the integration. So maybe you could take a stab at that. So you teed this up perfectly, because if I wanted to take a before and after image, the before image would be Amazon Web Services when Werner Vogels got up 18 months ago at re:Invent and put up this slide of all 200 Amazon services and said, you know, you guys are telling us this is complicated, but it's your fault. You asked for all this choice and power. And what Snowflake is doing is taking that slide and integrating all those capabilities so you're not trying to stitch together a bunch of piece parts that don't fit. Now to be specific, Benoit started out by saying, look, from day one we built this data flow engine. It wasn't a SQL data warehouse. It was a data flow engine that could have multiple personalities talk to it, and that in turn could talk to multiple data types. And that in a nutshell is their huge value add, because no one else really has that. And just to elaborate really quickly, so they started with SQL, then they added data frames. So all the Python programmers, who may now be more numerous than SQL programmers, can access the same capabilities. Then they're teasing out search. This was the Neeva acquisition, so that you can talk in natural language and underneath it can generate a SQL query.
It can also generate a query to talk to documents and then pull the query results together and integrate them. Then another way of talking to the data is the traditional scikit-learn route, the old supervised machine learning libraries. But then there's another one, which they showed with NVIDIA, with generative AI and some packaged machine learning models that NVIDIA has. Which would be NLP. Yes, NLP, but also things like recommenders, so that out of the box you can do really, really advanced models. So let me stop you for a second, because this is complicated for a lot of people, me included. So you have many ways to query the data. You've got SQL, which is classic, that's kind of where they started. You said data frames, explain what data frames are. Data frames are more forgiving than SQL. A data frame is the way Python programmers talk to their data to find out what's in there, to explore it and to clean it. It's more forgiving, and a data frame can generate SQL, but it's a different interface. Okay, and then Neeva gives you search and the ability to search documents, they showed a lot of that today. But it's not just documents; Neeva can then generate a SQL query. So you can talk in natural language, and out can come a precise SQL query. So natural language goes in, and out comes language that a database will understand, like SQL, which is very flexible. And then supervised machine learning libraries, which is, correct me if I'm wrong, but that's kind of Databricks' strength, right? Right, and Databricks built a whole tool chain and language for supervised machine learning. To clean the data, to do the feature engineering, to train the models, to serve the models.
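The SQL-versus-data-frame contrast George describes can be sketched in a few lines. This is just an illustration in plain pandas, not Snowflake's actual Snowpark API (which is similar in spirit but lazily compiles its data frame operations down to SQL); the table and column names here are made up.

```python
import pandas as pd

# A tiny "orders" table, the kind you'd normally query in a warehouse.
orders = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "amount": [100, 250, 50, 300],
})

# SQL style:   SELECT region, SUM(amount) FROM orders GROUP BY region
# Data frame style: the same question, composed step by step,
# which is the more forgiving, exploratory interface Python folks use.
by_region = (
    orders
    .groupby("region", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total"})
)
print(by_region)
```

The point of the data frame interface is that each step can be inspected and reworked interactively, while an engine underneath is free to translate the whole chain into one SQL statement.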
And then what we learned from NVIDIA, talking to Christian and then the head of enterprise at NVIDIA, was that basically this is a whole reset, because we're doing a generational transition from supervised machine learning, the traditional way, to generative AI, which is unsupervised, or trains itself basically. Okay, so you've got many ways to query. Now let's talk about the many data types, what I think you referred to as pluggable storage. Explain what that means. Snowflake will talk in terms of unstructured, semi-structured and structured data, and everybody understands that, which is good, that they communicate at that level. But to make it work, there are specifics around different database types, different storage types. Explain the various options that exist there that they've supported. Even with structured data, when you want to do analytics, slice and dice, you store it one way, in columnar format. But when you want to do transactions, and they're, I guess, close to shipping that one, Unistore. Unistore. We're waiting for Unistore. Waiting for Godot, and Unistore. So you store that in rows, but it's a very different database beast. It's not just a reorganization. So those are two pluggable storage types. Rows and columns. Rows and columns. Analytics and OLAP. OLAP and transactions. And that's no mean feat. But to that they're adding, they can now pull in streaming data and land it in a table. And then they can join across these different table types with dynamic tables, which can be updated from a stream and updated from reference data. That's very powerful. And then they're adding vector data, which is, when you bring in documents, you want to shred them and turn them into a format that's machine readable. And a vector database example would be Pinecone, right? That's something that they... That's a vector database. And then they can also do vector search.
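To make "shred documents into a machine-readable format" concrete, here is a minimal, self-contained sketch of vector search. The "embedding" below is just a bag-of-words count, standing in for the dense vectors a real model would produce, and the three document chunks are invented; a production system would use a trained embedding model and a vector store like Pinecone or Snowflake's own vector support.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector.
    # A real system would run a trained model producing dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# "Shred" documents into chunks and index each chunk as a vector.
chunks = [
    "snowflake separates compute from storage",
    "databricks focuses on machine learning pipelines",
    "vector search finds semantically similar text",
]
index = [(c, embed(c)) for c in chunks]

def search(query: str) -> str:
    # Return the chunk whose vector is closest to the query's vector.
    q = embed(query)
    return max(index, key=lambda pair: cosine(q, pair[1]))[0]

print(search("compute and storage in snowflake"))
```

The warehouse-side magic in Snowflake's story is that these vectors land as just another column type, so the similarity lookup can be joined against ordinary relational tables.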
A vector database is more full-featured. So yes, that's another version. And then there's graph data, which is for when you want to do customer 360 or security or supply chain, and you want all the links in the data preserved. You don't want to flatten it into a table. RelationalAI would be an example of a hybrid. It's a graph, but it can also do joins. So that means it's more expressive than what you get with traditional relational. But the problem with graph databases is you don't have the query flexibility, and that's what RelationalAI is trying to solve. Right, exactly. Okay, so we've got OLAP, we've got OLTP slash transaction data, streaming. It sounds like they're actually creating another data type when they do these joins with dynamic tables. That's like sort of a... Well, the result is one relational table, but it's pulling from different table types. Okay, so it ends up in relational form, but the magic is that they're able to pull from those different types, the vector, the graph. Okay, so you've got many ways to query, you've got many data types. Okay, setup question. Well, so does Amazon. That is a setup question. Okay, so go back to Werner's slide with the 200 services. Now just take the 12 operational databases, not the analytic databases. There are like 12 operational databases, and each one has a programming model. Each one has its own data format. In fact, the data format is bolted inside the database. Here, the data formats are increasingly open, so that there are multiple ways to get at them. And then, so there you would have to extract, reformat and import to move between just the operational databases. Then you've got the analytic databases. They don't even tie the data lake and data warehouse together all that well. The one thing I will say, though, is that there's one wrinkle in the story, and it's not a technical wrinkle.
It's that if you want to put all your data in the data cloud and then operate on it through their engine, there is a compute cost. There's overhead, and with their markup on the infrastructure, some customers are finding that expensive. So let's explain this. Snowflake, very Amazon-like, wants you to put all the data into the Snowflake data cloud, and the reason, and the value for customers if you do that, is that Snowflake can promise the governance and the security and all of that, irrespective of which partner is there, as long as they're conforming to the Snowflake standard, just like the App Store, that's the analogy. The problem, you're saying, is that if you do that and you do the compute and all the cleansing and the data engineering inside of Snowflake, it gets really, really expensive. Why would it be cheaper to do it elsewhere? Where else would you do it that would be cheaper? Would you do it on-prem? Would you do it in another cloud? You could do it on-prem, or I think what's happening is people are doing it on S3, and they might have some special purpose tool that essentially does the ETL or ELT in S3, because then it's just cheaper. Data prep in S3, and this is why you still need folks like Informatica. Right. Okay, or other ETL vendors; Matillion is another example, some folks we've had on before. Okay, so that's the one little nit, and you've talked to some customers, and even some partners, that have said, well, the reality is we have to be really careful, especially in this day and age. Okay, so where does that leave us? Let me say it this way. In my mind anyway, George, let's call it four, let's call it five, major platform players. You've got three that are dominant in machine learning and AI: AWS, Google, and now Microsoft, given that they did the deal with OpenAI, you know, brilliant business move. So those three are obviously very, very popular, and they all have data platforms.
And then you've got Snowflake and Databricks. So how do you see the horses on the track? Okay, so this is where it really did get interesting in the last six to nine months. Databricks was pretty much far and away the leader in the tool chain for supervised machine learning. They built the whole company around that. And Amazon had a pretty good tool chain with SageMaker, but it was like Amazon, kind of disjointed, though the functionality of each piece was pretty good. Google had a very good, coherent product line. Microsoft, this was weird. Microsoft stopped adding functionality to their machine learning tool set like two or three years ago. Because, you suspect, right? I thought something was odd, because it's too strategic for them to give up. So what they were doing was, they knew two or three years ago they were going to get serious about OpenAI and generative AI. And they said, in true Microsoft fashion, look, we lost this generation. Let's get a head start on the next generation. So it's not just that they did a deal with OpenAI, it's that they built the new tool chain to support that. Okay, and so, oh. Please carry through. Okay, so with Databricks, I looked at the MosaicML acquisition. When I read it, I was actually stunned, because they didn't buy a tool chain that corporates would really use to help build generative AI models. That's almost like Anyscale's system software. It's one level above that, or it's like a really thin piece of operating system software that helps you run these really large, long-running jobs to train models. Corporates don't do that. They fine-tune models. Corporates don't train models from scratch. So then that says that Databricks is targeting a different audience with this acquisition. Either that, or they needed it so that they could train their own models and then offer these to be fine-tuned by their customers.
Training these from scratch is going to be done by the likes of Bloomberg and maybe a couple dozen companies and the Department of Defense. That's not a mainstream activity. You're going to take pre-trained ones and tune them, and so Databricks needs tools for that. Now in fairness, you're actually headed to the Databricks conference. And look, when you're at these conferences, as my friend Andy Thurai says, these companies are really good at telling stories. And so you have the bias, the recency bias. I'm really curious to talk to you on Friday and see what you think about what you learned at Databricks. Do you have a question? Yeah, well, you guys, in the breaking analysis that's on siliconangle.com, talked about the presence of Databricks in Snowflake accounts and vice versa. Talk about maybe the synergies there. You did a great compare and contrast, but obviously Snowflake customers are using Databricks and vice versa. So prior to the battle between these two companies, I'll just say, the new workload that was emerging was: you had AWS cloud infrastructure, you had the Snowflake data warehouse, at the time a simplified data warehouse, and you had Databricks machine learning and AI coming together as a new emergent workload. This is probably 2015, '16, '17, in that timeframe. And we thought at the time that that was sort of this new ecosystem forming, but then both Databricks and Snowflake raised boatloads of money and realized the TAM is a lot larger. And so now what you have is Snowflake, with very strong data management, simplified data cloud, data warehouse, moving up to applications, a super cloud, et cetera, and trying to get into machine learning and data engineering, data science. You've heard a lot of announcements here to that effect. At the same time you have Databricks, from that ML, AI and data engineering heritage, getting into the traditional space of analytics and data warehouse with Lakehouse and other capabilities.
So the overlap is very high in those accounts. There are more Snowflake accounts inside Databricks accounts than the reverse, but that's because there are more Snowflake accounts out there. In both cases, though, very, very high overlap, 30, 40% type of overlap. So I would add that they're not complete substitutes yet. The Python functionality that Snowflake has shown still needs to mature, though we got a heads-up that they might be doing something along the lines of what Ponder has done, with a complete pandas API running on Snowflake. As soon as that's available as an Anaconda library, any customer can use that. But right now Snowpark is not as fully featured for doing the data engineering pipelines as Databricks. And the other thing is that because Databricks runs on a data lake, the compute cost and the compute overhead of doing your data engineering, the pipelines, the ETL or ELT, is still cheaper on Databricks, which is why you might still see overlap. Ah, to your earlier point, if you do it inside of Snowflake, it gets more expensive. So Databricks, very interesting company. We talked about the key elements of their stack in the piece that we wrote; you sort of helped us frame that. The Delta Lake, the Spark execution engine, Photon, which powers the BI warehouse, and the AI/ML tool chain, which is their wheelhouse. And the core capabilities, some of the strengths and weaknesses and the threats to each of those. The conclusion was there are a lot of disruptions potentially coming at them, so they've got a lot of critical decisions to make. Now, I actually have a lot of confidence that their team is very good, and they'll make whatever acquisitions they need to make and they'll evolve that. They've got a lot of resources. They've got a lot of loyal customers. But this sort of sets up this really interesting battle. But I want to bring in AWS and Google, because I'll point out that AWS for years has sort of copied some of Snowflake's moves.
They separated compute from storage as an example, which really was kind of a bolt-on, if you recall. And so you really can't shut down, for instance, the compute; you kind of dial it down. And they did that by tiering, you know, tiering the less active storage. But again, it was based upon an on-prem stack that they licensed. Not perpetually, they basically bought it out. What was the company? ParAccel. ParAccel, right. And then Google, on the other hand, has a cloud-native data platform with BigQuery. You know, Google, the thing about Google, and I'd love your thoughts on both of those companies, they want to keep the others out. They want to kind of force customers to use their ML. If you want Google AI, you kind of have to use BigQuery. Right, so Snowflake will tell you they're more expensive in GCP. And that's, I think, by design. It's like Oracle being more expensive in Amazon. If you're going to be on GCP, you're very well served using BigQuery and the whole Vertex AI tool chain. In fact, those are the main reasons to use GCP. That's it. Yeah, so that's why I don't think Snowflake has a ton of momentum in GCP. They do have momentum in Microsoft. Explain why. Why would Microsoft be a better fit? Because Microsoft, well, for one, GCP doesn't have much presence in the enterprise. They just never had a big footprint. They're a consumer company for the most part there. Even with SQL Server? No, I'm talking about GCP. Oh, yeah, yeah, sorry. And then with Microsoft, their data platform in Azure was always fragmented. And this is the first time, with Fabric, and Synapse as the core engine, that they've at least standardized the table format. They standardized on delta tables.
And the reason they did that is because something like 40% of all VMs running on Azure were Databricks, meaning a huge share of the data on Azure belonged to Databricks. So Microsoft is trying to co-opt that data and say, use our analytic engines, and we'll partner with Databricks. So between the two of us, we'll try and get everyone into delta tables, and then we'll compete on the quality of our analytic engines. So that's their play, but neither of them has the core engine that Snowflake has, which is: we can handle all your analytic types and all your operational workloads. That's an integration level where Snowflake has gone one level above, not just the data format, but now the engines. So Monday night, Frank Slootman interviewed Jensen Huang from NVIDIA, and the next day the stock popped, went up 3.5, 4%. I wasn't at Investor Day, which I think was yesterday, but whatever Frank and Mike Scarpelli said impressed investors. It's up again today, another seven points, the stock's up 4%. So they liked what they heard, even though the ETR data shows a smaller percentage of new logos currently, a higher percentage of customers that are spending flat, and even a higher percentage of customers spending less or defecting. Their churn is still very, very small, but the red is growing, and the gray, which is flat spending... The red is churn. Is churn and/or spend less. The gray, which is spending flat, is growing pretty noticeably. The green, which is spending more, is declining, it's compressing, and the bright green, which is new logos, is down to like 15%, whereas it was much higher. Now, I know just from listening to and observing these guys, they don't incentivize this, and I could ask Frank about this, but they don't incentivize their sales teams on new logos. Their incentive is essentially consumption, right?
They have a consumption model; if they can incentivize their reps and their customers to consume, that's goodness, and I don't know if that's a blind spot or not. I mean, my feeling, George, is that when the economy picks up again and people are just more comfortable, organically they'll get more new logos. However, to your point, there's competition. But the competition now is within an account. Like, maybe we go back to the old model where you used something like Hadoop to do the pipeline and get the data ready, and then you used the data warehouse, or now the data platform, for doing the interactive query, or now applications. There's a big business in the data pipeline, and that's where they're not really cost competitive. Okay, now come back to the roadmap. So connect the dots. We have this all data, all workloads platform. You have many data types, many ways to query. The magic is it's all integrated. So where does it go from here? Okay, so this was where the analyst day was really helpful, especially when we got to talk to some of the product guys, whether it was Benoit or Christian or Chris Child. Right now they're consolidating their hold on all the data management functionality. They're sort of reclaiming the glory days of Oracle, where it managed not only all your operational data but all your analytic data. Of course they'll bristle at that example, but it's true. And I refer to Oracle in a good sense, as in dominant, they won the war. But what Oracle never did was get to the next level, which was to be the platform for applications. They just bought the applications and ran them on their own platform. But before that, what SAP and PeopleSoft did was build their own application servers and put the application logic in that layer. And BEA came along and you put your application logic in that layer for web apps.
What Snowflake doesn't want to let happen is for an application platform to emerge above them. And they know this issue. They're taking care of the data first, but they realize that, to the issue we've been talking about, they need this semantic layer. They need a workflow model that's really rich, because then they know how to translate from what a developer talks about, what you started with, which is the people, places and things, while the database knows about strings, and it's the platform engine that optimizes the difference and translates between things and strings. So those people, places and things, that digital twin of your business, could be represented by many, many data types. Unstructured data, rows, columns, documents, vector data, knowledge graphs, et cetera. But that translation. And the semantic layer creates that translation, so that all those data elements are coherent. And the owner of that semantic layer is going to hold a very powerful lever. If it's inside of Snowflake, then they're going to maintain control. If it's not inside of Snowflake and it's a third party, then that data is going to be accessible to a lot of other folks. Now, it won't necessarily have the governance and security promise that Snowflake delivers, but it will be available in a coherent fashion. And Snowflake doesn't want to let that happen, I would presume. You could implement the governance at the application layer, and this is what RelationalAI, this was... Which would create more stovepipes. Well, no, because the semantic layer would... They would be the unifying layer. They would be the unifying layer.
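The "things to strings" translation George describes can be made concrete with a toy semantic layer. This is purely illustrative and assumes nothing about Snowflake's actual implementation: the entity names, table names, and the tiny model format below are all made up.

```python
# A toy semantic layer: business "things" on one side,
# database "strings" (tables and columns) on the other.
# All of these names are hypothetical.
SEMANTIC_MODEL = {
    "customer": {"table": "crm.customers", "key": "customer_id"},
    "revenue":  {"table": "finance.orders", "measure": "SUM(amount)"},
}

def to_sql(metric: str, by: str) -> str:
    # Translate a business question ("revenue by customer")
    # into the concrete SQL string the engine understands.
    m, d = SEMANTIC_MODEL[metric], SEMANTIC_MODEL[by]
    return (
        f"SELECT {d['key']}, {m['measure']} AS {metric} "
        f"FROM {m['table']} GROUP BY {d['key']}"
    )

print(to_sql("revenue", by="customer"))
```

Whoever owns the mapping dictionary in a design like this owns the translation, which is exactly the lever being discussed: if the mapping lives inside the platform, the platform controls how every application sees the data.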
Okay, so one of the comments you've made to me, and I think you've evolved your thinking on this, is your inference was that Snowflake are thinking like database people, which of course they are, they came out of Oracle, the other database. But as we speak to more people, it's obvious that they're bringing in talent that has more of an application mindset, not a database mindset. Is that a fair assertion? I think that the mindset to date has been, let's get all the data and manage it. And now it's like, okay, so what do we need to do next? Wow, that was a master class in breaking down the breaking analysis. Published on siliconangle.com, just search for Snowflake Summit breaking analysis. You'll find it at the top link. Dave, you've got Frank Slootman, we've got a great lineup of Snowflake executives, customers, partners, we're going to be digging into all sorts of things today. Dave, what are you looking forward to with your one-on-one with Frank Slootman, who's coming on in about an hour? I want to know what happened at Investor Day, what he said. I can't believe he said anything new, and it's available, it's all public information, you can watch it, but they do get open. They talk a little bit about the competition, and they talk about why they feel like they've got the right point of view. So I want to understand what happened there, and I want to test this thesis a little bit: you've got the top-level messaging, which is right on, you've got the technical foundation, which is extremely solid, and then what's missing is that middle piece. I want to make sure that we're understanding this correctly, and get Frank's point of view on that. All right, that sounds great. We want to thank you for watching our Connecting the Dots of Snowflake's Roadmap segment. Up next, we have a full day of coverage on theCUBE for you, as I mentioned. We've got Amanda Kelly from Streamlit coming on with customer PowerSchool.
They're going to be talking about how Streamlit is helping PowerSchool really transform the K-12 experience, from the front office to the classroom to the home. As I mentioned, the breaking analysis is published on siliconangle.com, as is all of our analysis and editorial content. And of course, you can find all of our CUBE content from Snowflake and other events on cube.net. We'll see you in a few minutes with our next guest. You're watching theCUBE, the leader in live tech event coverage.