From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. Recent earnings prints from Amazon and Snowflake, along with new survey data, have provided additional context on top of the two events that Snowflake and Databricks each hosted last June. Specifically, we believe that the effects of cloud optimization are still being felt, but are nearing the end of peak negative impact on cloud companies. Snowflake's recent renewal of its contract with Microsoft better aligns sales incentives and should improve the company's traction, Snowflake's traction that is, with Microsoft Azure, a platform that has long favored Databricks. Google, however, remains a different story, as its agenda is to build out its own data cloud stack rather than supporting Snowflake's aspirations. Hello and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this Breaking Analysis, George Gilbert and I clarify some of our previous assumptions around Snowflake economics. We'll also dig into the three US-based hyperscale platforms to better understand the footprint that these key data platforms have, Snowflake in particular, in those accounts, along with Databricks. And ahead of Google Cloud Next, we'll preview how we believe Google is evolving its cloud and data stacks to compete more effectively in the market. Let's start with some spending data from ETR. Now, let me describe this chart and the colors therein. So ETR has a methodology called Net Score. Net Score measures the following. The lime green is new customer adds. If you go all the way to the right, to July 2023, this is the Snowflake spending profile: 16% of the customers in the survey said that they were new to Snowflake. 41%, that forest green, said that they were spending 6% or more on Snowflake in the second half, or the remainder, of the year. The gray is plus or minus 5%, i.e. flat.
The 7%, that pinkish area, is spending down 6% or worse. And the red is defecting, leaving the platform. It's a very small number, but it's there nonetheless. You subtract the reds from the greens and you get a Net Score, and you can see that blue line represents that Net Score, that spend momentum, and that's been decelerating for the last several quarters. That yellow line down below is a measure of pervasiveness, pervasion. I know it's kind of a funny word. It's the share in the survey, essentially the number of mentions of Snowflake divided by the total number of mentions. So now, our thinking was that the deceleration in spending momentum on Snowflake was perhaps aligned with what we've been hearing from some of the customers and partners that we've been speaking with. Specifically, the customers have been telling us that rather than doing the data engineering and the data prep inside of Snowflake, they were moving it or doing it outside of Snowflake, ostensibly doing that batch work in Spark, like Databricks or Amazon EMR, and the reasoning given to us was economics. But Snowflake shared data at its financial analyst day that contradicts our original premise. Here's a chart that CFO Mike Scarpelli showed at that meeting. Why don't we let him explain the chart? Please play the clip. Snowpark is taking its share. What this graph is showing you is two Spark technologies that are running within our customer base. You can see that the blue at the bottom is Snowpark. And you can see how now Snowpark consumption, this is looking at daily credits, what they're consuming, is now outpacing Spark number one, and it's gonna surpass Spark number two. And so what you're seeing also is those ones are growing within our customer base, but we're growing much faster than them. So in this chart, Spark number one is probably EMR and Spark number two is likely Databricks.
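As an aside on the Net Score arithmetic described above (subtract the reds from the greens), here is a minimal Python sketch. The function names are hypothetical, and the defecting share is a placeholder: the transcript only says it is a very small number, so the printed result is illustrative and not ETR's published Net Score.

```python
def net_score(new_adds: float, spending_more: float,
              spending_less: float, defecting: float) -> float:
    """Greens minus reds, in percentage points.

    Flat responses (plus or minus 5%) are part of the survey base
    but do not enter the subtraction.
    """
    return (new_adds + spending_more) - (spending_less + defecting)


def pervasion(vendor_mentions: int, total_mentions: int) -> float:
    """Share of survey mentions that belong to one vendor."""
    if total_mentions <= 0:
        raise ValueError("total_mentions must be positive")
    return vendor_mentions / total_mentions


# Shares quoted in the episode for Snowflake, July 2023 survey.
NEW_ADDS = 16.0
SPENDING_MORE = 41.0
SPENDING_LESS = 7.0
# ASSUMPTION: the defecting share is not stated; 2.0 is a placeholder.
ASSUMED_DEFECTING = 2.0

if __name__ == "__main__":
    score = net_score(NEW_ADDS, SPENDING_MORE, SPENDING_LESS, ASSUMED_DEFECTING)
    print(score)  # prints 48.0 with the placeholder defecting share
```

With a different defecting share the result shifts point for point, which is why the small red slice still matters to the blue line.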
Now, George, we met earlier today with Snowflake's senior vice president of product, Christian Kleinerman, and he explained in more detail why Snowpark was more cost effective than Spark for doing this type of work. And he expressed this very strong conviction that Snowflake is in a good position to capture these data prep activities going forward. But George, my takeaway was that if a Snowflake customer has Snowpark, they'll keep that data inside of Snowflake. And by the way, the ETR data shows that of the 277 Snowflake accounts in the data set, nearly half also have Snowpark. And moreover, on its earnings call, Snowflake said that 63% of its Global 2000 customers are using Snowpark on a weekly basis. So George, what's your takeaway on this data? I think the first takeaway is they needed Python programmability as an option for people who wanted to do data manipulation in something perhaps more expressive than SQL. They have that option now. The question is, is that rise that we're seeing in the chart a replacement for the ostensibly, or perceived, lower cost options using AWS EMR or Databricks for batch work? Or is it people using Python instead of SQL? And the company maintains, as Christian told us, that they are able to measure a huge cost savings relative to the alternatives, who either have to extract the data or, in the case of Databricks, they're running both Databricks compute and Snowflake compute, so it's like doubly expensive. I think we don't know the full answer yet as to whether customers will, with the Snowpark option, be able to consolidate their entire data estate, doing data engineering as well as the BI dashboard serving, all in Snowflake. I think Snowflake has a story now and an offering, but I don't think the dust has settled as to whether they can consolidate all data the way they would like to. Now, we did ask Christian if this was an apples-to-apples comparison. He assured us that it was.
In other words, they weren't including the data movement in and out. If you included that, their advantages would have been substantially greater. As well, we've reported that when you look at Snowflake, remember, Snowflake bundles in AWS compute and storage. It charges its customers for that as part of the Snowflake bill. Databricks does not. And so in some cases, people might be looking at the prices and saying, hmm, okay, that's more expensive to do inside of Snowflake. So it could be a perception issue. Also, there might be customers out there that don't have Snowpark, or maybe are just getting started with Snowflake and have more experience with Spark, and they haven't really done this analysis yet. There are a number of case studies that Snowflake shared with us; we'll include them in the show notes as well. Now, the other thing I want to talk about is Snowflake in Global 2000 accounts. On its earnings call, Mike Scarpelli, and Frank Slootman as well, talked about how some of its largest customers essentially were spending at the commit levels, that renewals actually were coming in at the commit levels. In other words, they weren't growing dramatically. Now, we dug into the ETR data. I called my friend Eric Bradley this morning and he whipped up this chart. This is Snowflake spending in the Global 2000 only. So it's the same Net Score granularity, with the greens being positive spending momentum and the reds being negative spending momentum. And that blue line is the Net Score. And you can see where we've highlighted in that red dotted space basically a bottoming in those Global 2000 accounts. Now remember, two charts ago it showed that continued deceleration. Mike Scarpelli will often make the point, George, that he likes whale hunting, going after those big accounts that can spend a million dollars or more, 10 million dollars or more, or even more.
Now, not all the Global 2000 accounts are actually those large accounts, but those are the ones that they want to go after, those are the ones that they really want to nurture. Any thoughts on that from you? Yeah, this is where we need to slice and dice the data for granularity, which you showed with the ETR data. Snowflake has a very compelling story for accounts that they're onboarding, to say, give us all your data. We can handle all the workloads, from the data preparation to the final business intelligence, dashboard, interactive workloads. The problem is what happens when a customer becomes very large and they try and put all their data consolidated in Snowflake: will they see that as economic? And again, I'm not sure that we have the data yet to see that. I think the ETR spending survey indicates that there still might be perceived cost advantages to doing the batch workloads, the data preparation, outside Snowflake. And Snowflake, when they talked to us this morning, emphasized that there was great cost in taking the data out of Snowflake, doing the preparation, and putting it back in Snowflake. What we never drilled down on is, what if you never put the unprepared data in Snowflake in the first place? You do the preparation outside Snowflake and you only put the final product, the refined, dashboard-ready data, into Snowflake. In other words, I think it's a little ambiguous to decide what has finally happened. And the risk is that Snowflake, the data platform, is predicated on controlling all the data, and yet the definition of control is shifting from "is it in your database" to "is it governed by your catalog." And the Databricks story is they can govern all your data, even in other databases, and the Snowflake story is they govern it if it's essentially in their database.
And that's the marketing challenge that Snowflake faces, which is why I actually was really happy that they shared those case studies, because that's one of the questions that we had: do you have any examples or economic models? And so they were able to readily supply them, but that's going to be the informational war that they're fighting with not only Databricks but others, including Amazon. Speaking of which, let's take a look at Snowflake in AWS accounts. This next chart really shows sort of the penetration of the data platforms within AWS accounts. So if you look at this data, it's the Net Score on the vertical axis and the penetration, or overlap, on the horizontal axis. This is inside of AWS accounts. So we filtered the data. You can see up there in the left hand column it says filter vendors, we just put in AWS, and then at the bottom there in that left hand column, you can see filtered N 926. That means there's 926 AWS accounts that we're filtering on. And on the right hand side of the chart you see the position of that spending momentum on the vertical axis and the N, if you will, the overlap, on the horizontal axis. And you can see in the upper right, we cite both of those positions. The first column, in the green, is the Net Score. There's some reds down there for IBM and Oracle. And then the rightmost column is the shared N, the number of mentions. And that's what determines the plots. You can see Microsoft is all the way to the right because they're just so ubiquitous. So in other words, there's a lot of Microsoft accounts inside of AWS accounts. No surprise, the world is multi-cloud. And then you can see the Snowflake and the Databricks positions. Databricks has actually overtaken Snowflake in terms of spending momentum. And you can see the overlap. In other words, within those 926 accounts, 24% have Snowflake, 16% have Databricks. And you can see Oracle actually is quite prominent as well. And then IBM, you can see them in the chart.
Now, so that's inside of AWS accounts. Alex, let's go on to the next chart, the same data. Now we're looking at these data platforms inside of Azure accounts. Actually, Alex, do me a favor, go back one. I wanted to make one other point. Up in that upper right in this AWS accounts chart, you can see the spending momentum on Snowflake is 50.9%. So 51%, with a shared N of 224. Now go to the Azure chart and you can see that spending momentum in the upper right is just about the same, 50%. You can see the shared N goes down. So fewer accounts inside of Azure, but the overlap is still pretty significant. And obviously Snowflake, with its new alignment and incentives, wants to make that go higher. You can see as well, Databricks has spending momentum in the Azure accounts. They've always done actually pretty well there. It's interesting to see the presence of Snowflake inside of accounts that are running Azure. This is not, I want to be clear here, necessarily Snowflake running on Azure. This is more likely Snowflake running on AWS. You can see the presence of AWS all the way to the right, just as it was the reverse with Microsoft inside of AWS accounts. So it's Snowflake running wherever, not necessarily on Azure. I just want to point that out. Now, again, keep in mind that 50% Net Score in the upper right. Now go to the next slide, Alex. Now we're talking about Google accounts. And you can see there's still a big presence inside of Google accounts. But notice the Net Score drops to 38%, okay? And this is likely due to the fact that a lot of those accounts are going to be relying on BigQuery. I'm going to talk about that in a moment. And you can see some pretty substantial overlap with both Databricks and Snowflake inside of Google Cloud Platform accounts, those 489 Google Cloud Platform accounts. So George, anything you'd add here? I mean, the key point really is the world is multi-cloud, as you and I talk about all the time.
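As a quick check on the overlap figures quoted for AWS accounts (a shared N of 224 Snowflake mentions against a filtered N of 926 AWS accounts, described as 24%), the following Python sketch computes the share. The function name is hypothetical; only the two counts come from the episode.

```python
def overlap_share(shared_n: int, filtered_n: int) -> float:
    """Percentage of the filtered accounts that also cite the vendor."""
    if filtered_n <= 0:
        raise ValueError("filtered_n must be positive")
    return 100.0 * shared_n / filtered_n


if __name__ == "__main__":
    # 224 Snowflake mentions within 926 AWS accounts, as quoted above.
    snowflake_in_aws = overlap_share(224, 926)
    print(f"{snowflake_in_aws:.1f}%")  # prints 24.2%
```

The same division applies to the Azure and Google Cloud Platform cuts once the shared N for each chart is known; those counts are not all stated in the episode, so they are left out here.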
And there's a lot of upside potential for Snowflake inside of Azure. Google is probably a little tougher. Your take? Yeah, I think Dave, you're hitting on the key thing, which is all cloud apps are becoming data apps. And so the data stack is becoming the foundation for the cloud platform, whether it's built by the hyperscaler or whether it's more like a PaaS, the way Snowflake is trying to build it. And so the reason Snowflake is having more trouble in Google accounts is that Google is the only hyperscaler with a really good data platform story. And that we'll get into in a minute, which is how we expect that to expand at Google Cloud Next next week. Yeah, so let's take a look at some comments that Mike Scarpelli made about Google. He said, Microsoft and AWS, we have very good partnerships there. Google is the one we still need to work on. We're open, meaning Snowflake's open, to that. They're just not as open to it. And George, the reason they're not as open to it is because they really want to drive BigQuery. So let's pivot to Google and set up Google Cloud Next. And let's talk about Google and its approach to its data stack. They actually use the term data cloud. What will data applications look like, George, on Google Cloud? Take us through this. Okay, so two big points to make. Hyperscalers, as part of expanding their market, are trying to simplify and democratize the ability to build and run apps. And to do that, they're trying to shrink the gap between the complexity and power of an infrastructure as a service and the simplicity but relative restrictiveness of a PaaS. And then the second big point is all the cloud apps are becoming data apps. Now, we think that the big unveil at Google Cloud Next will be pervasive use of generative AI, really as a code generator, to help shrink the gap between an infrastructure as a service and a PaaS. We think that all three cloud platforms are doing it.
We suspect Microsoft was furthest along because they started showing GitHub Copilot two years ago. And then Google was next most aggressive, and then Amazon coming along. Then as far as the cloud apps themselves becoming data apps, this is where we're moving from a world where you had users typing into forms to data being automatically collected or instrumented from people, places, things and activity, and that drives intelligent data apps. And that's why Snowflake and Databricks are the real competition for the hyperscalers. And the last point to make, just as an introduction to what we think Google will show: we think Looker is likely to be the UI that integrates all these services at the presentation layer. Yeah, so the cloud's getting complicated. It used to be so simple, right? You'd spin up some EC2 and store on S3, and now there's just hundreds and hundreds of services across cloud, shared responsibility and security across cloud. So how do the layers of IaaS map to PaaS? And is there a simplification opportunity here? Yes, that's the key question. So we talked about Looker as the presentation layer, as sort of the output target for all the application services such as analytics, AI apps, databases, governance. Can they bring that all together in Looker as the presentation layer? And then there's essentially three layers below that, which we'll drill down into. And this is the part where, can you put the pieces together better to make IaaS look more like PaaS? And RedMonk was the first analyst firm to point out the potential for GenAI to start simplifying this. And so the three layers would be: can you integrate the app services with each other, and can they integrate with governance and semantics? And we'll talk about what that means. Then the hardest one is DevOps simplification, because that's the real Achilles heel of an infrastructure as a service.
There's so much code that goes into deploying, running, monitoring and remediating things that go wrong. And that whole layer is taken care of for you in a PaaS. So that's the big Achilles heel. And then there's Supercloud, which is, can they turn cloud from a location, which is the data centers that Google operates, into an operating model where it runs software wherever, out on the edge, in private data centers, in smart devices? And can you take a common control plane, or essentially take the cloud operating system, and have it run everything for you? I'm going to come back to Supercloud, but let's go on to the next slide here. How is Google going to attempt to fit the application services together? What do you think that's going to look like? Can you explain that? Okay, there's a bunch of layers here, but the first one, we think, and it's a question as to how much they'll be able to show next week, is that they hinted at Duet AI as being their equivalent of Copilot, the generative coding copilot that would be on every programming surface in Google Cloud. And so can they generate a lot of the glue code that simplifies composing and integrating a lot of the services that you would use to build data apps? So that's one thing: how much code can they generate? And then the other is how well integrated are those analytics services, so that they fit together without chewing gum and baling wire? Then the second issue is governance. Now that we are living in a data-centric world, you want common governance across everything, independent of what service uses it. And so Dataplex is their service that has, so far, lineage, quality and policy management. But can it cover all data types wherever they exist? And will that eventually cover the operational databases, and will it cover hybrid multi-cloud data like Databricks' Unity aspires to?
And then there's the semantic layer, because this is where you take technical metadata that says what tables and columns are there, and you up-level it and you say, what does this data mean in terms of, say, bookings, billings and revenue? Those are definitions that don't really exist at the technical data level. And so the question is what tools will be able to use the LookML semantic model for analyzing data, and can that semantic model work across all data, not just business intelligence metrics? So it's how far can they extend that? And then lastly, the data itself: will there be one system of truth? Now, it doesn't mean it has to all exist in one place, but it could be one federated repository. I guess the technical term might be one namespace for all data, structured, semi-structured, and complex data like PDFs, images and video, where all the services work on just that repository of all the data. So not just BigQuery but Vertex AI, the big data services, the streaming data services. Those are the key questions we're looking to see answers to. Yeah, and to the extent that they answer those questions, George, that's a pretty ambitious data stack that Google is going after. And it's likely that's the direction that they're headed. It's a great point, and that's why on Google, Snowflake's opportunity is probably a lot less. Bob Muglia himself has said that if BigQuery existed on AWS, or I shouldn't say existed, if Amazon built BigQuery, Snowflake would be a much smaller company. And that means the opportunity on Google Cloud Platform is probably a lot smaller for Snowflake, because Google has a great stack that is built around data. Yeah, thank goodness for Redshift if you're Snowflake, which, you remember the history there, was an on-prem database that Amazon refactored for the cloud. Anyway, you've said to me that DevOps is the Achilles heel of IaaS. Explain why, and how will Google evolve and simplify its stack for DevOps pros?
And you mentioned the RedMonk commentary before. What will be the role of AI in this regard? Now, this is where I didn't read much from RedMonk on this topic. I think they didn't talk about this, but this has been a theme that I've been interested in for years, which is we are moving away from this model where, as Werner Vogels, the CTO of AWS, famously said, the ethos in the cloud is you build it, you run it. And if you look at this diagram in the lower left, it's like that old Verizon cellular commercial where someone's moving around with a cell phone trying to get service and he's got 100 guys behind him and he's like, can you hear me now? Can you hear me now? The equivalent cloud version is there's a hundred DevOps professionals following you around when you build an app, trying to figure out how you're gonna deploy it, run it and remediate it when something goes wrong. None of that really exists when you're using a PaaS. That's why there's a huge tax for using infrastructure as a service. However, if you have a coherent application model, which means your pieces fit together in an opinionated way, they're designed to fit together, that means you can build operational intelligence into AI that can understand, when things go wrong, how to diagnose problems and, with high confidence, how to suggest remediation. And you can set it so that if the confidence is high enough, it automatically remediates it. So the question is how far along that spectrum, going from you build it, you run it to autonomous AI ops where it runs itself, can Google start moving? They announced and talked about a whole bunch of services built around Anthos, which sits on Google Cloud: build, deploy, run, monitoring and a whole bunch of diagnostic services. Can they build in enough intelligence to figure out, when something goes wrong, how to remediate it, and essentially make DevOps engineers much more productive? We don't know, but that's a critical thing to watch for.
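To illustrate the idea above of suggesting a fix and only remediating automatically when confidence is high enough, here is a minimal Python sketch. Everything in it is hypothetical: the `Diagnosis` type, the `decide_action` function and the 0.9 threshold are assumptions for illustration, not any Google Cloud or Anthos API.

```python
from dataclasses import dataclass


@dataclass
class Diagnosis:
    """Hypothetical output of an operational-intelligence model."""
    issue: str
    suggested_fix: str
    confidence: float  # 0.0 (no confidence) to 1.0 (certain)


def decide_action(diagnosis: Diagnosis, auto_threshold: float = 0.9) -> str:
    """Gate automation on confidence.

    At or above the threshold the fix is applied automatically;
    below it the fix is only suggested to a human operator.
    """
    if not 0.0 <= diagnosis.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if diagnosis.confidence >= auto_threshold:
        return "auto_remediate"
    return "suggest_to_human"


if __name__ == "__main__":
    # Made-up diagnoses, purely for illustration.
    confident = Diagnosis("pod crash loop", "roll back last deploy", 0.95)
    uncertain = Diagnosis("latency spike", "scale out service", 0.60)
    print(decide_action(confident))
    print(decide_action(uncertain))
```

Lowering `auto_threshold` over time is one way to picture the trust-building path from human in the loop toward autonomous operation.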
Fewer guys running around in lab coats. Right. Makes sense. So AI in, I mean, that's likely gonna be an evolution where they maybe sort of reduce the false positives, or maybe help you prioritize or narrow down, maybe with the human in the loop, and then eventually, when you can build up trust, that automation occurs, where it's a system of agency actually taking action for you. That's really where you see it going. Exactly. And I would add one thing. Google and Microsoft have an advantage here because they built their services to be opinionated, which means they've steered away from giving developers power and choice with "let a thousand flowers bloom" in the form of services. Amazon built hundreds of services, many of which are overlapping for different use cases, but then it becomes much harder to stitch those together. So all three can do it, but Google and Microsoft should have an advantage in being able to understand how those services can fit together and simplify the DevOps. And let's see how far Google can get on that spectrum. So Snowflake was one of the first companies that we pointed to when we started thinking about this notion of Supercloud. And they are, we think, a good example of that. It's a single global instance that spans not only multiple availability zones or regions within AWS but multiple clouds. So that gives the capability of both abstracting the underlying complexity and also the potential of data sharing. Snowflake actually, I think, announced this past Snowflake Summit the ability to apply Snowflake credits to any service on any cloud. So say you're running AWS and Azure, which we see many customers doing. We talked about that earlier with some of the overlap data. You can use Snowflake credits for whatever, if you're in their marketplace, across clouds, different cloud services, not just one cloud. So that's kind of interesting. So what is Google's Supercloud play?
Are they going to go there, in your view? Okay, so with Google, they've been talking for a couple years now about starting to extend some of their application services to run on other clouds, like BigQuery Omni, which runs on Amazon and Azure. But then they also talked about having some of their control plane, or the equivalent of the cloud operating system, run outside Google Cloud. So the goal here is eventually, as Dave, you sort of pioneered this concept with Supercloud, having programmatic services that run beyond the cloud, because your applications essentially are going to be running everywhere and you need a common control plane for those. So the question is how much of the native Google Cloud control can you deploy on-prem, on the edge like in a factory, or even on a device, where you can still control it from the cloud and then have that autonomous operation that we aspire to. Anthos was a key part of this. It started out as, I think, basically just stateless containers, which meant a very small subset of workloads, but they've been expanding that. So the question is how many more workloads, how much control, and how close is it to the type of control you get running in Google Cloud itself? Yeah, well, it seems like this is an opportunity for Google. We know, we've covered it extensively, everybody has: they're in a distant third place. So they potentially have more motivation to build these cross-cloud services. We were just at VMware Explore this past week. That's a huge thrust of VMware, and under Broadcom, Hock Tan has talked about that as a growth vector. So maybe Google could catch that wave as well and bring some added momentum to that marketplace. It seems, at least at this point in time, Amazon's not interested in that. And I think perhaps Microsoft is a little bit more interested, but that's really not the main thrust. All right, let's end with some expectations on Google Cloud Next. Let's see, no question we're going to hear a lot of GenAI.
I mean, it's going to be GenAI, AI, AI, AI all day long. We talked a lot about BigQuery in today's Breaking Analysis. It's the underpinning of Google's data cloud, a true cloud-native data platform that has gotten very high marks in the marketplace. And as George was describing, and we've been reporting since earlier this year, there's a new breed of data apps emerging, people, places and things, with a coherent semantic layer. We think Google is in an interesting position to do that. This is also the first Google Cloud Next since the Mandiant deal closed. I think the acquisition might have been announced last year at this time, but Mandiant is obviously now part of Google. So you're going to hear a lot on security, and it wouldn't surprise us to hear something about the retail cloud. At Supercloud 2, we had Jack Greenfield on from Walmart. He talked about their triplet model, where they combined an on-prem OpenStack instance, where their crown jewels are, and they're using Microsoft for a lot of their collaboration apps and Google for a lot of the data apps. Of course, a company like Walmart, who's a retail competitor of Amazon, is not going to use Amazon. And so it makes sense that Google would try to suck up some of that business with a little discussion on the retail cloud. And finally, theCUBE will be there. We've got a set, and John Furrier is leading. Lisa Martin will be there, Rob Strechay. And one of our newer collaborators, Greg Xandavall, is going to be there as well. So do stop by and check that out. George, any final thoughts that you want to share? The only thing I would add is to emphasize, on the retail cloud, that when Thomas Kurian came in, he made it a point of trying to differentiate Google Cloud from Microsoft and Amazon by emphasizing solutions. And they've bounced around as to where those solutions are coming from. Do they build pieces of them, do they partner for them?
Let's see how the story evolves this year. But that was the original differentiation. It was basically the data cloud coherence and solutions. So I think you're right to emphasize not just retail, but all industry solutions. Yeah, great. George, thanks very much. Again, always appreciate the collaboration. Love your slides and the deep insights. I really appreciate it. Thanks, good to be here. All right, and thank you for watching. I want to thank Alex Myerson, who's on production and manages the podcast. Ken Schiffman, who's in the air and not here today, but thank you, Ken. Kristen Martin and Cheryl Knight, they help get the word out on social media and in our newsletters, and Rob Hof is our editor-in-chief over at siliconangle.com. Really appreciate that great editing. Remember, all these episodes are available as podcasts wherever you listen, just search Breaking Analysis podcast. I said last week we cracked a million downloads last month, really proud of that. So thank you, and please subscribe if you haven't. We publish each week on wikibon.com and siliconangle.com, and you can email me at david.vellante@siliconangle.com or DM me @dvellante or hit us up on our LinkedIn post. Please comment, it helps. Even if the comments are negative, bring it on. And please do check out etr.ai. Great survey data. Thanks for their help today on short notice, Eric Bradley, Darren Brabham, our friends over there. We get awesome enterprise tech focus. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching everybody, and we'll see you next time on Breaking Analysis.