From theCUBE Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. Hello everyone and welcome to this CUBE Conversation. I'm Dave Vellante. Now you know I love data and today we're going to introduce you to a new data and analytical platform and we're going to peek into the world of cloud databases and data warehouses. And with me is Sagar Kedakia, who's the head of Enterprise IT at 7 Park Data. Sagar, welcome back to theCUBE. Good to see you. Thank you so much, Dave. Appreciate you having me back on. Hey, so new gig for you. How's it going? Tell us about 7 Park Data. Yeah, you know, look, things are going well. I started about two months ago, just, you know, busy. I had a chance the last, you know, few months to kind of really dig into the data set. We have a tremendous amount of, you know, research coming out in Q4, Q1 around kind of the public cloud database market, public cloud analytics market. So, you know, really, really looking forward to that. Okay, good. Well, let's bring up the first slide. Let's talk about where this data comes from. Tell us a little bit more about the platform. Yeah. Where's the insight? Yeah, yeah, absolutely. So I'll tell you a little about 7 Park and then we can kind of jump into the data a little bit. So 7 Park was founded in 2012. In terms of, like, a differentiator, you know, with other alternative data firms, you know, we use NLP, machine learning, you know, AI to really kind of, you know, structure noisy and unstructured data sets and really kind of generate insight from that. And so because of a lot of that know-how, we ended up being acquired by Vista back in 2018. And really, for us, you know, the mandate there is to really, you know, look across all their different portfolio companies and try to generate insight from all the data assets, you know, that these portfolio companies have.
So, you know, today we're going to be talking about, you know, one of the data sets from those companies, it's a cloud infrastructure data set. We get it from one of the portfolio companies that, you know, helps organizations kind of manage and optimize their cloud spend. It's real-time data. We essentially get this aggregated daily. So certainly different than, you know, your traditional providers maybe giving you quarterly or kind of bi-annual data. This is incredibly granular real-time, all the way down to the invoice level. So within this cloud infrastructure data set, we're tracking several billion dollars worth of spend across AWS, Azure and GCP, something like 350 services across like 20 plus markets. So, you know, security, machine learning, analytics, database, which we're going to talk about today. And again, like the granularity of the KPIs, I think is kind of really what kind of, you know, differentiates this data set, you know, with just within database itself, you know, we're tracking over 20 services. So, you know, lots to kind of look forward to, kind of into Q4 and Q1. So, okay, so the main spring of your data is, if I'm a customer and there's a service out there, there are many services like this that can help me optimize my spend. And the way they do that is I basically connect to their APIs so they have visibility on what the transactions that I'm making, my usage statistics, et cetera. And then you take that and then extrapolate that and report on that. Is that right? Exactly. Yeah, well, we're seeing just on this one data set that we're going to talk about today, it's something like 600, 700 million rows worth of data. 
And so kind of what we do is, you know, we kind of have the insight layer on top of that or the analytics layer on top of all that unstructured data so that we can get a feel for, you know, a whole host of different kind of KPIs: spend, adoption rates, market share, you know, product size, retention rates, you know, net price, all that type of stuff. So yeah, that's exactly what we're doing. Love it. The more transparency, the better. Okay, so, right, because this whole world of market sizing, it's been very opaque, you know, over the years and it's like, you know, backroom conversations where there's IDC, Gartner, who's got what, don't take, you know, the estimations and it's very, very, you know, it's not very transparent. So I'm excited to see what you guys have. Okay, so you have some data in the public cloud and specifically database market that you want to share with our audience. Let's bring up the next graphic here. You know, what are we looking at here, Sagar? What are these blue lines and red lines? What's this all about? Yeah, so, and look, we can kind of start at the, kind of the 10,000 foot view kind of level here. And so what we're looking at here is our estimates for the entire kind of cloud database, you know, market, including data warehousing. If you look all the way over to the right, I'll kind of explain some of these bars in a minute, but just high level, you know, we're forecasting for this year, $11.8 billion. Now, something to kind of remember about that is that's just AWS, Azure and GCP, right? So that's not the entire cloud database market, it's just specific to those three providers. What you're looking at here is the breakout in blue and purple is SQL databases and then NoSQL databases. And so, you know, to no one's surprise here, you can see, you know, SQL databases are obviously much larger from a revenue standpoint.
And so you can see just from this time last year, you know, the database market has grown 40% among these three cloud providers. And, you know, though we're not showing it here, from like a pie perspective, you know, database is playing a larger and larger role for all three of these providers. And so obviously this is a really hot market, which is why, you know, we're kind of discussing a lot of the dynamics into Q4 and Q1. So, okay, let's get into some of the specific firm level data. You have numbers that you want to share on Amazon Redshift and Google BigQuery and some comments on Snowflake. Let's bring up the next graphic. So tell us, it says public cloud data warehousing growth tempered by Snowflake. What's the data showing? And let's talk about some of the implications there. Yeah, no problem. So yeah, this is kind of one of the markets, you know, that we kind of did a deep dive in. You know, tomorrow, and we'll kind of get to this in a few minutes, we're kind of doing a big CIO panel, kind of covering data warehousing, RDBMS, document store, key-value, graph, all these different database markets. But I thought it'd be great, you know, just because obviously what's occurring here with Snowflake, to kind of talk about, you know, the data warehousing market. You know, look, if you look here, these are some of the KPIs that we have, you know, and I'll kind of start from the left here, some of the orange bars, the darker orange bars, those are our estimates for AWS Redshift. And so you can see here, you know, we're projecting about 667 million in revenue for Redshift. But if you look at the lighter orange bars, you can see that the service went from representing about 2% of, you know, AWS revenue to about 1.5%. And we think some of that is because of Snowflake.
And if we kind of take a look at some of these KPIs, you know, below those bar charts here, you know, one of the things that we've been looking at is, you know, how are longer-term customers spending and how are, let's just say, like, newer customers spending, so to speak, so kind of this like organic growth or kind of net expansion analysis. And if you look at the bottom there, you'll see, you know, customers in our dataset that we looked at, you know, that were there at 3Q20 as well as 3Q19, their spend on AWS Redshift is up 23%, right? And then look at the bifurcation, right? When we include essentially all the new customers that onboarded, right, after 3Q19, look at how much they're bringing down the spend increase. And it's because, you know, a lot of spend that was, you know, perhaps meant for Redshift is now going to Snowflake. And look, you would expect longer-term customers to spend more than newer customers, but really what we're doing here is really highlighting the stark contrast because you have kind of back-to-back KPIs here, you know, between, you know, organic spend versus total spend and obviously the deceleration in market share kind of coming down. So, you know, something that's interesting here and we'll kind of continue tracking that. Okay, so let me come back to this, ask some Columbo questions here. So let's start with the orange side. So we're talking about Redshift being 667 million. These are your estimates extrapolated based on what we talked about earlier. 1.5% of the AWS portfolio, of course, you see things like EC2, they continue to grow. Amazon made a bunch of storage announcements last week, the first week of re:Invent, you know, Kinesis. I mean, just name all kinds of databases. And so it's competing with a lot of other services in the portfolio. And then, but it's interesting to see Google BigQuery, a much larger percentage of the portfolio, which again, to me, makes sense. People like BigQuery.
They like the data science components that are built in, the machine learning components that are built in. But then if you look at Snowflake's last quarter, and just on a run rate basis, they're over $600 million now if you just multiply their last quarter by four from a revenue standpoint. So they've got Redshift in their sights, you know, to the extent this is the correct number. And I know it's an estimate, but I haven't seen any better numbers out there. It's interesting, Sagar. I mean, the value of Snowflake surpassed ServiceNow last Friday. Just in trading today, you know, on Monday, Snowflake is about a billion dollars less in value than IBM. So you're seeing Snowflake get a lot of attention. Post-IPO, the thing has even exploded more. I mean, it's crazy. And I presume that's rippled into the customer interest areas. Now the ironic thing here, of course, is that Snowflake, most of its revenue comes from AWS, running on AWS. At the same time, AWS and Redshift and Snowflake compete. So you have this interesting dynamic going on there. Yeah, you know, we've spoken to so many CIOs about kind of the dynamics here with Redshift and BigQuery and Snowflake. You know, as it kind of pertains to, you know, Redshift and Snowflake, I think, you know, what I've heard the most is, look, if you're using Redshift, you're gonna keep using it. But if you're new to data warehousing, kind of so to speak, you're gonna move to Snowflake or you're gonna start with Snowflake. And I think, you know, when it comes to data warehousing, you're seeing a lot of decisions kind of coming from, you know, bottom up now, so a lot of developers. And so obviously their preference is gonna be Snowflake. And then when you kind of look at BigQuery here over to the right, again, like, look, you're seeing revenue grow.
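The run-rate figure Dave cites above is simple arithmetic: take the most recent quarter's revenue and multiply by four. A minimal sketch follows; the quarterly figure used here is a hypothetical placeholder, not Snowflake's reported number, and the function name is made up for illustration.

```python
def annualized_run_rate(last_quarter_revenue: float) -> float:
    """Annualize a single quarter by multiplying it by four.

    This ignores seasonality and growth within the year, so it is a
    rough comparison figure, not a forecast.
    """
    return last_quarter_revenue * 4


# Hypothetical quarterly revenue in millions of dollars (illustration only).
hypothetical_quarter_musd = 155.0
run_rate_musd = annualized_run_rate(hypothetical_quarter_musd)
print(f"Run rate: ${run_rate_musd:.0f}M")
```

Comparing a run rate against a full-year estimate such as the Redshift projection mixes two different measures, so the comparison is only directional.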
But again, as a percentage of total, you know, GCP revenue, you're seeing it come down. And look, we don't show it here, but another dynamic that we're seeing amongst BigQuery is that we are seeing adoption rates fall versus this time last year. So we think, again, that could be because of Snowflake. Now, one thing to kind of highlight here with BigQuery, look, it's kind of the low-cost alternative, you know, so to speak. You know, once Redshift gets too expensive, so to speak, you know, you kind of move over to BigQuery, and we kind of put some price KPIs down here all the way at the bottom of the chart, you know, kind of for both of them. You know, when you kind of think about the net price per TB scanned, you know, Redshift doesn't prorate, right? It's five bucks on whatever you scan in, whereas, you know, GCP, you get the first terabyte for free and then everything is prorated after that. And so you can see the net price, right? So that's the price that people actually pay. You can see it's significantly lower than Redshift. And again, you know, it's a lower-cost alternative. And so when you think about, you know, organizations or CIOs that want to save some money, certainly BigQuery, you know, is an option, but certainly I think just overall, you know, Snowflake is certainly having, you know, an impact here and you can see it from, you know, the percentage of total revenue for both of these coming down. You know, if we look at other AWS database services, or you mentioned a few other services, you know, we're not seeing that trend. We're seeing, you know, percentage of total revenue hang in or accelerate. And so that's kind of why we want to point this out: this is something unique, you know, for AWS and GCP, where even though you're seeing growth, it's decelerating. And then of course you can kind of see the percentage of revenue it represents coming down. I think it's interesting to look at these two companies and then of course Snowflake.
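The net-price point Sagar makes above can be sketched as a small calculation: net price per TB is the amount actually billed divided by the TB scanned, so a free first terabyte pulls the effective rate below the list rate. This is only an illustration under stated assumptions. The $5 rate and the free first terabyte come from the conversation; applying the same $5 list rate to both services, the 4 TB usage figure, and the function and parameter names are all hypothetical.

```python
def net_price_per_tb(tb_scanned: float,
                     list_price_per_tb: float = 5.0,
                     free_tb: float = 0.0) -> float:
    """Effective (net) price per TB: amount billed divided by TB scanned."""
    if tb_scanned <= 0:
        raise ValueError("tb_scanned must be positive")
    # Only the volume above the free allowance is billed.
    billable_tb = max(0.0, tb_scanned - free_tb)
    return billable_tb * list_price_per_tb / tb_scanned


# Hypothetical usage: 4 TB scanned in a billing period.
flat_rate = net_price_per_tb(4.0)                    # every TB billed at list price
with_free_tier = net_price_per_tb(4.0, free_tb=1.0)  # first TB free
print(flat_rate, with_free_tier)
```

Under these assumptions the gap between the two net prices narrows as scan volume grows, because the free terabyte becomes a smaller share of the total.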
So if you think about Snowflake and BigQuery, both of those started in the cloud. They were true born-in-the-cloud databases, whereas Redshift was a deal that Amazon did, you know, with ParAccel back in the day, a one-time license fee, and then they re-engineered it to be kind of, you know, cloud-based. And so there is some of that, you know, historical on-prem baggage in there. And they've done a tremendous job in re-architecting that. But nonetheless, so I'll give you a couple of examples. If you go back to last year's re:Invent 2019, of course Snowflake was really the first to popularize this idea of separating compute from storage and even compute from compute, which is kind of nuanced. So I won't go into that. But the idea being you can dial up or dial down compute as you need it. You can even turn off compute in the world of Snowflake and just, you know, you're paying S3 for, you know, storage charges. What Amazon did last re:Invent was they announced the separation of compute and storage. But the way they did it was, they did it with a tiering architecture. So you can't ever actually fully turn off the compute, but it's great. I mean, customers I've talked to say, yes, I'm saving a lot of money, you know, with this approach. But again, there's these little nuances. So what Snowflake announced this year was their data cloud. Now what the data cloud is, is a whole new architecture. It's based on this global mesh. It lives across AWS and Azure and GCP. What Snowflake has done is they've abstracted the complexity of the cloud. So you don't even necessarily have to know what you're running on. You don't have to worry about it. Any Snowflake user inside of that data cloud, if given access, can share data with any other user. So it's a very powerful concept that they're doing.
AWS at re:Invent this year announced something called AWS Glue Elastic Views, which basically allows you to take data across their entire database portfolio and, I'm going to put share in quotes. And I put it in quotes because it's essentially doing copying from a source, pushing to a target AWS database and then doing change data capture and pushing that over time. So it feels like kind of an attempt to do their own data cloud. The advantage of AWS is that they've got way more data stores than just Snowflake, it's one data store. So AWS has Aurora, DynamoDB, Redshift, on and on and on, streaming databases, et cetera, where Snowflake is just Snowflake. And so it's going to be interesting to see these two juxtaposing philosophies, but I wanted to sort of lay that out because this is just, it's setting up as a really interesting dynamic. Then you can bring in Azure as well with Microsoft and what they're doing. And I think this is going to be really fascinating to see how this plays out over the next decade. Yeah, I think some of the points you brought up maybe a little bit earlier were just around like the functional limits of Redshift, right? And I think that's where, you know, Snowflake obviously does very, very well. You know, if you kind of think about like the market drivers, right? Like let's think about even like a prior slide that we showed where we saw overall, you know, database growth, like what's driving all that? What's driving Redshift, right? Obviously proximity, application interdependencies, right, costs, you get all the credits, right? People are already working with the big three providers. And so there's so many reasons to continue spending with them. Obviously, you know, COVID-19, right? Obviously all these apps being developed, right? And the cloud versus data centers and things of that nature.
So you have all these market drivers, you know, for the cloud database services, for Redshift. And so from that perspective, you know, you kind of think, well, why are people even gonna go to a third party vendor? And I think, you know, at that point, it has to be the functional superiority. And so again, like a lot of times it depends on, you know, where decisions are coming from, you know, top down or bottom up. Obviously, at the engineering, at the developer level, they're gonna want better functionality. Maybe, you know, top down sometimes, you know, it's like, look, we have a lot of credits, you know, we're trying to save money, you know, from a security perspective, it could just be easier to spin something up, you know, in AWS, so to speak. So yeah, I think these are all the dynamics that, you know, organizations have to figure out every day, but at least within the data warehousing space, you are seeing spend go towards Snowflake. And it's going away to an extent as we kind of see, you know, growth decelerate for both of these vendors, right? It's not that revenue is not going up, there is growth, it's just that growth is not the same as it used to be, you know, so to speak. So yeah, this is an interesting area to kind of watch. And I think across all the other markets as well, you know, when you think about document store, right? You have AWS DocumentDB, right? What are the impacts there with Mongo and some of these other kind of third party data warehousing vendors, right? Having to compete with all the, you know, all the different services offered by AWS, Azure, like with Cosmos and all that stuff. So yeah, it's definitely kind of turning into a battle royale, you know, as we kind of head into 2021.
And so I think having all these KPIs is really helping us kind of break down and figure out, you know, which areas like data warehousing are slowing down, but then what other areas in database where they're seeing a tremendous amount of, you know, acceleration. Like as we said, database revenue is driving, like it's becoming a bigger part of their overall revenue. And so they are doing well, it's just, you know, there's obviously Snowflake they have to compete with here. Well, and I think maybe to your point, I infer from your point, it's not necessarily a zero sum game. And as I was discussing before, I think Snowflake's really trying to create a new market. It's not just trying to steal share from, you know, the Teradatas and the Redshifts and the GCPs of the world, the BigQuerys, and Azure and SQL Server and Oracle and so forth. They're trying to create a whole new concept called the data cloud, which to me is really important because my prediction is what Snowflake is doing, and they don't even really talk a ton about this, but they sort of do if you squint through the lines. I think what they're doing is, first of all, simplicity is what they're doing. And then they're putting data in the hands of business people, business line people who have domain context. It's a whole new way of thinking about a data architecture versus the prevalent way to do a data pipeline, where you've got data engineers and data scientists and you ingest data. It goes to the beginning of the pipeline and that's kind of a traditional way to do it and kind of how I think most AWS customers do it. I think over time, because of the simplicity of Snowflake, you're going to see people begin to look at new ways to architect data. Anyway, we're almost out of time here, but I want to bring up the next slide, which is a graphic which talks about a database discussion that you guys are having on 12/8 at 2 p.m. Eastern time with Bain and Verizon. What's this all about?
Yeah, so one of the things we wanted to do as we kind of kick off a lot of the Q4, Q1 research that we're putting out on the database market is, just like we kind of did today, which obviously we're really going to expand on tomorrow at 2 p.m., discuss all the different KPIs. We track something like 20 plus database services. So we're going to be going through a lot more than just kind of Redshift and BigQuery. Look at all the dynamics there. Look at how they're faring against some of the third party vendors like a Snowflake, like a MongoDB, as an example. We got some really great thought leaders, Michael Delzer and Praveen from Verizon, they're going to opine on all the dynamics that we're seeing. And so, structure-wise, it's going to be very quantitative, but then you're going to have this beautiful qualitative discussion to kind of help support a lot of the data points that we're capturing. So yeah, we're really excited about the panel, you know, from a, you know, why you should join standpoint. Look, it's just great competitive intel. If you're a third party, you know, database, data warehousing vendor, this is the type of information that you're going to want to know, you know, adoption rates, market sizing, retention rates, you know, net price, reserved versus on-demand dynamics. You know, we're going through a lot of that tomorrow. So I'm really excited about that. I'm just in general really excited about all the research that we're kind of putting out. So that's interesting.
I mean, and we were talking earlier about AWS Glue Elastic Views, and I'd love to see your view of all the database services from Amazon, because that's what it's really designed to do, is leverage across those. And you know, you listen to Andy Jassy talk, they've got a completely different philosophy than say Oracle, which says, hey, we've got one database to do all things. Amazon's saying we need that fine granularity. So to the extent that you're providing market context, we're very excited to see that data, and see how that evolves over time. Really appreciate you coming back on theCUBE, and look forward to working with you. Appreciate it, Dave. Thank you so much. Thanks for having me on again. Thank you everybody for watching. This is Dave Vellante for theCUBE. We'll see you next time.