Hello and welcome to theCUBE's coverage here. We are in the lakehouse at Databricks' Data plus AI Summit; it's CUBE coverage. We are co-locating with Databricks' massive stage here. This is their media hub central. theCUBE is going to be here for two days, wall-to-wall coverage. I'm joined by our CUBE analyst Rob Strechay. Rob, we've been on a tour since KubeCon in Amsterdam. What a whirlwind, and AI is center stage. It's great to see you. It's data week too, so this is really exciting to be here and talking all about AI and data and how it's really going to turn. I just dropped an exclusive interview I had with Matt Garman, who is steering the ship of Amazon to the promised land of generative AI. You've got Google with an event in Seattle, all developers. Snowflake had an event in Las Vegas, and here in San Francisco you've got Databricks. It's data week, everything's happening, everyone wants that prize. The whole industry wants to win the mind share of this massively surging developer community. And we've said, Rob, we've been on the CUBE record now for months: open source will win. And you heard from Ali Ghodsi today on stage, it's very clear that open source is a big theme of their history and their future and their present. And what's more exciting is he even dropped a few bombs out there by saying, we're going to end the format wars with this UniForm layer bringing things together out of the metadata. I mean, bold move, Databricks really accelerating the game in the open. It's going to put a direct strike on Snowflake. It's going to hopefully level the playing field. We'll see what's real. We're going to dig into the analysis. What was your take? Yeah, I think it was really well put together. I think there was a lot to go through at a rapid pace today. And I think they were really looking at how they get through all of the content they had to get through.
And I think a lot of it was around how generative AI is going to develop and how you make it easier. I think also it was unified governance and how you can bring that together, and democratization of analytics as well. And I think that's a big piece of it. Those were kind of the three big themes that they're taking with them and taking forward. One of the things I noticed, knowing Ali Ghodsi since the beginning of Databricks, he's been a candid CUBE alumni. He was a mix of super excited and nervous because it was so crowded in there. I've never seen an audience that big for Databricks. He was delivering the lakehouse positioning 3.0, which was a combination of lake and house, data warehouse. It's interesting, they're kind of reinventing the data warehouse with the word lake around it, but that's kind of their focus. They're trying to get back into that world and modernize it, create a unified layer. And some things I liked were the Delta Sharing, the Unity Catalog, and this format wars ending with UniForm, their positioning. I thought that was clever, and just the sheer open source penetration, that has legs, and I think the question will be, will that be adopted? Yeah, I agree. I think it's going to be interesting. I think Ali kind of pre-briefed us a little bit yesterday on this stuff and I asked some questions around some of these things, and I think the one that you're mentioning around the whole ending the war, it's on by default, being able to do all those three different formats. Obviously Delta Lake is always there, but the other two formats and bringing that in, I think what's key about this and what's going to be key in the future is that it gives much more spread across there. And if you look at some of the other announcements that they're making, they're really focused on how they have a control plane, that data control plane going forward.
And I think that's where the battle lines have been, really. Underneath AI, that's where the battle lines are being drawn between them and Snowflake. Well, you know, we had Ali Ghodsi on our SuperCloud event too, the original one, I should say, our inaugural event. He was also a keynote exclusive for our re:Invent coverage last year, and now with this Databricks Data plus AI, every company will be a gen AI company, and that's his focus. You heard not a lot of multi-cloud, because we like to bring that term up, SuperCloud. Having Microsoft's CEO Satya Nadella on stage, or remote and Skyped in, they still use that word Skype. Not sure why they even keep that brand name, but that's a whole other story. He was Skyped in and it was a very interesting dialogue. One thing, too: it was not AWS, their big partner. They're doing more business with Azure, so they're increasing their TAM. Databricks is clearly going multi-cloud, and so is Snowflake. This SuperCloud dynamic is happening, a little bit of a submessage in the subtext of that kind of thing, and I've got to tell you, it's a smart move by Databricks. By expanding in with Microsoft, they increase their total addressable market, they increase their customer base. JPMorgan Chase, again, a joint customer of Microsoft. Satya Nadella showing his chops, his technical chops, bringing infrastructure knowledge. Microsoft leaning in, winning the messaging war, winning the marketing war. It's pretty obvious to me that Microsoft clearly is outgunning AWS right now on the PR front, absolutely controlling the messaging and causing other clouds to be on the defensive.
Yes, yeah, definitely, all the rest of the clouds are being reactive, and I think we've seen that over the last month and a half to two months, maybe even a little bit longer. And I think really when you start to look at how Databricks and Microsoft are messaging, it's really similar messaging in how they're trying to bring LLMs to the masses and how they're trying to be really clear about, hey, here's how you make it easy. And I think one of the big things was, even though they're not calling it an assistant, what they're doing inside of their Lakehouse AI with the assistant that can help you build the queries for your lakehouse, and I think that's key. It's their version of Copilot, and there was a lot of talk with Satya about Copilot, which makes a lot of sense. The other thing they did announce was the marketplace, which was a big part of their initiative. Again, it's open to everyone, and they kind of dinged the clouds by saying lock-in and saying the clouds only optimize for their own clouds, which is true. So they're trying to build this federated kind of marketplace to be open. I thought that was interesting. You know, that combined with the Delta Sharing protocol, you've got some interesting open source power dynamics going on. I think that's going to be something we're going to unpack a lot over the second half of the year, Rob. I agree, I agree, and not only that but also their clean rooms initiative, which is built on Delta Sharing, and Lakehouse Apps, which they didn't really go into that much, but we kind of talked about it a little bit yesterday. Again, under the hood, it's Kubernetes, and when you start to look at it, it's going to help some greenfield customers extremely quickly get into a marketplace and be able to have security and governance over that by running on Databricks' infrastructure.
You know, I always liked Databricks, I always liked their approach, I always liked the conversations, but one thing that's notably different at this event is they're stepping up their game. They've got Moscone, 12,000 people packed in, 75,000 people online, both South and North, a packed venue. Their brand, they're stepping up. They could be a big player. They could be the dominant player in the modern data warehouse world, which is essentially data cloud. They could be the dominant player for what we're talking about a lot here on theCUBE, which is having data products in a marketplace, and then ultimately the term we coined, the data developer, Rob, is really the emergence of what will be a new persona if these dots connect. Ali Ghodsi's vision, combined with what Matt Garman told me in my exclusive, is that as this scale continues from the data infrastructure reset with generative AI, with the data products enabling a data developer, those apps are going to come online, they're going to be running on the cloud, this will be a new power dynamic, and the developer will be in charge of the data. We saw this with DevOps, we see this with security now, shifting left. The new power dynamic will be the data developer. We're going to see it emerge in front of our eyes and we'll be covering it like a blanket here on theCUBE. Yeah, I think it's the data developer joined with platform engineering, and what you have is them actually going out and engineering that data control plane. And I think with Lakehouse Federation, one of the announcements today, it's another one where both they and Snowflake are competing for the hearts and minds of both platform engineering and that data layer, the data engineering layer. Dave Vellante loves horse racing analogies, so we have to go with one here.
A lot of horses on the track, they're clustered up, they're all kind of jockeying for position, they're all kind of in a pack right now, some are ahead, some are not, but ultimately the other dynamic is people might be switching horses. You know, Amazon to Azure, this model to that model. Choice is becoming a huge factor before putting things into production. We're still in the discovery phase with generative AI and LLMs and foundation models. So again, tapering down the hype, which is at an all-time high, generative AI is a game changer. However, it's early. We're still in discovery mode with developers. We're seeing people thinking about, do I have the right horse for this? Is this the right model for that? How do I tune it and scale it? How much is it going to cost? Is it ethical? Do I run it on the cloud? These are all questions that are being asked here and in the industry. Yeah, and it's fascinating with the customers that are here and having some conversations with them already. Really what they're betting on is, how are we going to be able to keep our data private? That message was super strong in the keynote today about privacy, about keeping your PII private, being able to actually monitor that, and I thought that was a neat little demo that they did as part of that. I think again, there's a lot of hype around it. There are a lot of things that need to be figured out. A lot of swirling conversations around which cloud's going to be the leader. Databricks is not holding back. They could be an emergent kind of SuperCloud layer across all clouds, something that we've been tracking. I had an interview with Matt Garman, who heads up Amazon's field sales and operations, formerly EC2. So he's got technical chops, kind of like Satya Nadella flexing his chops. I know Garman's got some technology chops, which gives him a unique position.
He was echoing in my interview with him some of the same points that Ali Ghodsi was making on stage, and I want to get your reaction to these points if you don't mind. One, the AI potential and ethical concerns. I'll just run the list and we'll go through them. Securing data, amplifying choice. Augmenting human capabilities with AI versus replacing them. Innovating around some constraints and opportunities around GPUs, a little nuanced point we heard today. And of course, adoption and consumption of AI. So let's start with the ethics piece. What's your take? What did you hear here? They kind of had the proverbial "whoa, we've got to be careful." You know how I feel about that. I think everybody wants to be careful, but I don't think anybody knows how to be careful yet. And I think it's early days in figuring this out. I think you'll see the industry come together as regulation is ramping up. I mean, you already have the first bill of regulation going through the EU. I think they have to get in line and they have to nail the ethics part of this, because it's really key to not having it over-regulated or pushed back or stamped down. I'll say what everyone's thinking. I think it's virtue signaling. I think it's BS. I think there is no notion of regulation at this point. We've got to get this stuff out of the garage and into the streets, into production. We've got to watch it, monitor it. That's the only issue. Regulation is a bad path for AI. I think that is complete horse you-know-what. I think we've got to avoid that. I think everyone's just being virtuous. "Oh yeah, we've got to be careful." That's code words for "don't regulate us." So I think that's one thing. The other ethical issue that comes up that's not talked about is intellectual property rights. This is what Ali was all about today. Your IP is your data, and that's their entire thesis of the lakehouse. Don't put that into the LLM.
Matt Garman and AWS also echoed that sentiment. Hey, that corpus is public. Once the genie's out of the bottle, you can't put it back in. So this is a huge issue around data, and then how data engineering and platform engineering have to now refactor to drive these new data products. Yeah, and I think even Satya brought it up with watermarking and things like that, and how do you deal with copyright? How do you deal with intellectual property? These are huge, and I think that, to your point, I was talking to a healthcare organization just earlier before coming on here and they were talking about how they're using Databricks, but they're using it in a very private way because you're talking about people's healthcare information. You've got to be careful with that, and right now you can't be using these SaaS-delivered public models that are proprietary to go and do this, where you don't know where the data's going to end up. I've got to talk about adoption real quick, fostering consumption and adoption. You heard that out there today. They're trying to push ease of use within the platform. These customers are diverse. JPMorgan's different than a startup, right? So you have a diversity of solutions that are needed from the infrastructure layer all the way up to the application layer. You're starting to see packaged solutions and hardcore code. So from infrastructure, silicon, GPUs to the application stack and everything in between, there's going to be huge adoption. Yeah, and I think that they actually had some really interesting stats on stage. They're seeing from actual use inside of Databricks that two exabytes of data are processed every day through Databricks, and that in the last month they had over 1,500 actual transformers used within the Databricks environment. That's showing you some of the adoption and how it's really accelerating. That's up from 400 just earlier in the year. So the other thing is cloud-enabled growth.
One of the things that's come up is scale. Ali didn't really talk about it in the keynote because they're not the cloud, obviously; they're an ISV as far as Amazon's concerned. But they've got an ecosystem, 12,000 customers in person. So as you look at the cloud enablement here, you've got to have the scale. And again, if people are going to be switching horses, how do you decide how to run this stuff on cloud? Do you run it on Databricks? On which cloud? That becomes an interesting factor here. The clouds are enabling, at the end of the day, all this innovation. And I think being across the clouds, like you were talking about with SuperCloud, does give you that data layer, that commonality of data layers across clouds. And I think that is a big key. They're not everywhere yet. I mean, there are still a lot of sovereign clouds that can't get Databricks. I think that will be a challenge for them being that control plane long-term. But for a good portion of the companies and countries, they actually do a good job of that. And we heard about MosaicML with the GPUs. Yeah, you and I found out through the grapevine, they've got a stockpile of GPUs, but it's a lot of CapEx to build out. The GPUs are a big factor. One of the reasons why the cloud's looking good right now is that AWS, and I'm sure Azure, have got a bunch of GPUs. They can offload the GPU needs as a managed service versus having it on site. And that's an opportunity for MosaicML here with Databricks on the training side. Right, yeah, I think it is. And I think where they're aiming to be is the trainer, the builder, the underlying, I guess you could say, infrastructure for AI/ML, versus trying to be the ultimate only model you use. So it's the bring-your-own-model concept, but how do you get to success faster with your data? That's where they're focused. And the final point is enterprise demands are very much in line here. JPMorgan on stage, amplifying, almost gushing over Ali's celebrityism. Clearly a testimonial on stage.
I mean, he's basically a walking data sheet for Databricks, highlighting the advantages of the lakehouse. It was almost as if he was reborn. Like, oh my God, I came from the old way. I thought that was a very telling interview, really showing that the needs of the enterprise are about governance and nailing some of the things we were talking about on our last CUBE, which is nail those foundational elements and then you can enable the value. What the platforms offer is that kind of enablement. That's a huge part of Ali's talk today. Yeah, I think they had Unity Catalog woven within and throughout the entire keynote. And I think that they're standing on that as a differentiator for them and how they bring that to market over all the data layer. What are your final thoughts on the interview? Why don't you give it a grade? How would you rank it? Obviously they had a lot to go through. What are your thoughts? I'd give us a solid B. I'd say we'll be bringing it up a notch as we go through the day and get with some more people and get some more interesting content going. But I think we, you know, again, we covered all of the interesting facts. No, I'm talking about their keynote. Oh, their keynote? No, we're an A. Come on, we need to talk about B. We're an A. We're always at the A. We always got the A game. I would say, I'd give Ali, I mean, again, they had so much to try to unpack. I think that it got a little muddled at times, but I'd give him a B. I think it was really good. I give him a solid A minus, because it's hard to get an A out of me. I give him an A minus, and pushing an A, only because I thought they had a clock problem, so much to go through. I give him an A minus because, one, JPMorgan was an incredible guest. The way they were just glowing with testimonials. That's not staged, that's real. That's authentic. I can squint through that and say that was pretty real. Happy customer. His positioning was amazing.
The MosaicML acquisition on the heels of the event was a home run for Databricks. The venue, the posture. They're standing tall right now, and if you're Snowflake, you've got to wonder. There are almost two types of companies here, Databricks and Snowflake, both with different vibes, and you've got to look at that and say, it's like the Republicans and the Democrats. It's like, whose side are you going to pick? And they both work together though, so I thought that was a great move by them, and the content was very developer-education focused without placating the crowd too much. They weren't talking down to the crowd. They aligned with the developers. I think this data developer theme is right on point for them. I think they're going to wake up and realize that they have an ecosystem that's going to be with them for the next 20 years. And so that room was very much aligned with some of the early days of VMware. It's a lot of demos, a lot of learning. So I thought A minus for sure, pushing an A, but again, the clock was tight. They had to go fast. Overall, great event. Yeah, I agree. I thought it was very energizing, to say the least. All right, that's theCUBE's keynote analysis here inside the lakehouse on the floor of the Databricks Data plus AI Summit. And you get theCUBE's ongoing coverage, two days, wall to wall. I'm John Furrier, host, with Rob Strechay. We'll be right back with our next guest after this short break.