Welcome back to theCUBE Studio here in Palo Alto for a live program: VAST presents Build Beyond. I'm John Furrier, host, with Dave Vellante, Rob Strechay and George Gilbert, breaking down the VAST announcement. Big news, a revolutionary data platform. Dave, we saw a great keynote.

Well, John, years ago when I saw VAST launch, I was like, okay, hey, another storage company. That's kind of cool. But as they said, if you squinted through their technical documentation, it was actually something a lot more that they just chose to expose through storage protocols. So it's pretty interesting. And George, you and I have been digging deep, Rob, you and I as well, into sort of the future data platform. And these guys are bringing new thinking to the market in a way that's starting to tweak our idea of what that future data platform is going to look like, not based on the old, now old, modern data platform concepts, but sort of the new, new data platform.

Yes, if you start thinking about data as representing the real world, and you want to learn about what's going on in the real world, then data becomes your API to the real world, a representation of the real world. And so it's spread out geographically, across sites, across data centers. And then you have to rethink the platform. What we originally had as big data had its roots either in file and object storage for big piles of data, or in data warehouses. But if data becomes the platform, then you want to be able to program on it, compute on it, or learn on it independently of where it is. And at the same time, we have a new generation of hardware that came along: storage-class memory, all-flash storage.
And if you want to build a platform that takes advantage of that, then all the trade-offs that went into big data and traditional data warehouses have to be rethought, so that you can bring together files, objects, tables, unstructured and structured data, streaming data coming in transactionally and turning into analytic data, and then, at the same time, AI models that can learn off that data, all in one unified platform. It's a reimagining of our concept of a data platform.

And if we step back and look at the news and the announcement, Rob, we've been talking about this on theCUBE, Dave, for years. And recently, in the most modern era, we hear semantic layer, data is a big part of the change, AI is booming. This is a huge AI moment, deep learning is hot. Rob, there's so much going on in this announcement, it's almost too much to unpack in only 20 minutes. Let's get a sense of what just happened with VAST Data, because I think this is a revolutionary moment. It brings a new era of the data developer mindset into a lot that's going on. It's not just storage. This is a data platform they're doing here. What's your fast take?

Again, talking about how it's the new, new data platform, when the old "modern data platform" was just a couple of years old: I think the fact that they're embracing how it can be distributed, the distributed nature of the data, to exactly what George was talking about, is huge. It's in there by design. They've been building on this for years. That's something you can't do by just going to a hyperscaler and building on S3 or some sort of object store. It has to be built into the platform from the beginning. I think it's also the distributed nature of the metadata and how the metadata management really works, because to get at the data, to make the data usable, you have to have the metadata, and being able to bring compute to that metadata is huge too.
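Rob's point about bringing compute to the metadata can be sketched in a few lines of Python. Everything here, the catalog layout, the `find` helper, and the attribute names, is hypothetical, not VAST's actual API; it just shows a query that runs against metadata alone, so the large objects behind the paths are never read.

```python
# Toy sketch of "bringing compute to the metadata": filter on a
# metadata catalog first, so only matching objects are ever touched.
# Catalog layout and attribute names are invented for illustration.
catalog = {
    "s3://cars/drive-001.mp4": {"type": "video", "site": "eu-west", "label": "disengage"},
    "s3://cars/drive-002.mp4": {"type": "video", "site": "us-east", "label": "normal"},
    "s3://cars/telemetry-001.parquet": {"type": "table", "site": "eu-west", "label": "normal"},
}

def find(catalog, **attrs):
    """Return paths whose metadata matches every attribute given."""
    return sorted(
        path for path, meta in catalog.items()
        if all(meta.get(k) == v for k, v in attrs.items())
    )

# The query runs against metadata only; the (possibly huge) objects
# behind the paths never have to move.
hits = find(catalog, type="video", label="disengage")
```

The point of the sketch is the order of operations: the metadata answers "which data" before any data is fetched, which is what makes distributed placement workable.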
And I think really, over time, we'll see all of these services come together and tie in. And where they talked about things like Trino and other open source packages on top of it, that brings it back to the data developer in ways that they can get at it. So those protocols, like SQL, that they can get at the data with are super important.

One of the interesting things watching the VAST program we just saw, and I don't know if it was Renen or Jeff Denworth who said it, but you remember the Hadoop days, when we were doing Hadoop World and the epiphany was: we're going to bring five megabytes of code to a petabyte of data, we're going to bring the compute to the data. The exact opposite happened. All the data was shoved into the cloud; we moved all the data into the cloud to be closer to the compute. And they were talking about, well, you might have compute gravity or you might have data gravity, and they're optimizing for that. So it's the first time we've really seen a technology company recognize and lean into the fact that data is distributed. It's not about on-prem or in the cloud, it's wherever, it's under that "anywhere." And that kind of resonates with me.

Well, you guys wrote that post, the deep dive on the Uber case study. I think that illustrates kind of where this is going. I love the name Build Beyond, because they're talking about a future scenario. We heard Pixar on the keynote: we're going to hold on to the data for future use. There's a concept of future value of the data, not just getting value out of it now, but having it there. They also mentioned GDPR and the savings there; just the nuances in this video are interesting. This is an enabling platform. It sets everyone up: the global unified namespace. Everything kind of feels right, George. Feels like it's going to be this platform that is buildable. You can build on it.
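The SQL-and-Trino access path Rob mentions can be illustrated with a stand-in. This sketch uses Python's built-in sqlite3 in place of an engine like Trino sitting over the platform; the `events` table and its columns are invented for illustration. The point is only that the data developer speaks plain SQL and lets the platform worry about where the data lives.

```python
import sqlite3

# sqlite3 stands in here for a distributed engine like Trino; the
# events table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (site TEXT, kind TEXT, bytes INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("eu-west", "video", 700), ("us-east", "video", 300), ("eu-west", "table", 50)],
)

# The developer writes a query; which site, file, or object the rows
# actually live in is the platform's problem, not theirs.
rows = conn.execute(
    "SELECT site, SUM(bytes) FROM events GROUP BY site ORDER BY site"
).fetchall()
```

In a real deployment the same query text would run against a catalog exposed by the engine, which is exactly what makes the "data developer" interface feel familiar.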
It's interesting. Where they did a really good job was saying, look, if deep learning is the primary application, and then there are vertical versions of it like fraud detection or image enhancement, and learning about all the complex data that you're ingesting, then the platform to support that is entirely different. So all this different data comes in, and you're programmatically enhancing it. You might be enriching it. You might be correlating it. And then you might call AI functions that take apart images and say, what's in the image? And so you're adding the metadata, and that metadata may be stored in tables, it may be attributes on the files, but there's this whole workflow where you're making more and more sense out of all the data that's coming in, wherever it comes in. So you could bring the compute to the data, the compute functions, or you could bring the data to the compute if you've got GPU gravity, which is something new that we haven't accounted for before. And so it's the data that's getting richer all the time, and it's the platform that has all the capabilities you would call on, for files, for objects, for tables, and all the different types of compute functions. And then, as a programmer, you don't have to care where any of that is.

But just to add on to that, I also think it's about the chunk size. It's about being dynamic at updating very small pieces. It's sub-Parquet-file-level updates, which I think helps with that gravity problem, because data has gravity, but if you're only going after small pieces of the data and you're doing it in a distributed way, you can bring it to the GPUs, you can bring it somewhere else.

Let's unpack this, because there's so much here in this session. So you've got the data store, you've got the database and you've got the data engine. And Rob, you were saying there were three things you took away, and I want you to comment on them: the distributed nature of the architecture.
You just mentioned the chunk size. And it really is a metadata engine that can optimize where things should get done, based upon whether it's policy or speed of light.

Yeah, and I think, when you look at how data engineering is happening with data developers, the data is not in one place. So you first have to be able to find the data. Well, that's metadata. So how do you find it rapidly, so that you get the right data to the right application, the data application, and build that data product? And I think this is the layered stack. It's kind of like building a house, right? You have to have the right foundation to build those data features, data products and data apps on top of.

So we've been having this sort of Snowflake-versus-Databricks debate. All three of you were at the Databricks show, I was at the Snowflake show, and we've been unpacking those as reference points. You just wrote a piece on what a data platform is. Obviously there are a lot of futures in what VAST announced today, but George, where do you see this fitting in that spectrum?

Well, you remember when we first saw Spark, when IBM got behind it, it was like 2015, I think. And the big revelation was that Hadoop was like Humpty Dumpty after he fell off the wall and broke into a million pieces, because Hadoop was like dozens of different engines that someone had to cobble together, and it ran on-prem. And then what Databricks did was unify all those different engines under a Spark execution engine, and they put it in the cloud. So it simplified things, but you still basically had silos if you were running in different locations, because it didn't really understand locations, and at first it didn't have, essentially, a data warehouse for interactive queries. And for the most part, the machine learning it was doing was on structured or semi-structured data.
And it was mostly shallow learning, not deep learning. So now, all of a sudden, we come to VAST, and we have something where it doesn't care where the data is, and it doesn't care what type of data, because you can call on all these different types of functions. So it's yet another unification, beyond the way Spark unified from Hadoop. Now, the platform still has maturing to do, to add all the programmability a developer would want in terms of the structure of a workflow and the semantics of the data, so they can worry about things, not strings, but that's coming over time. So you have a much stronger foundation now to build on.

And the architecture is unique in the sense that, think about Amazon: they've got all these piece parts. George, you oftentimes cite Werner Vogels saying, well, it's your fault, audience, you wanted all this functionality. And then you look at the map of how you do stuff, and you've got Glue in there, you've got different storage buckets, you've got 10 different data stores. It's a layer cake.

And I think that is exactly George's point. It goes from, you can have it all and you can build it all, but it's all on you, as Werner says. And the problem with that is you have the limitations of S3 being built as globally accessible object storage. It's not optimized for these types of data products and these types of data architectures per se. You look at it and it has a small-file problem, a small-object problem. And if you look at how VAST has gone the other direction and made that an advantage, around updating and how the metadata is managed, that's huge. That's huge over what Amazon can do with S3. And to me, that's where people with particular applications are going to look at this and go, wow, that helps me.

We heard deep learning mentioned in the video keynote. We heard about deep analytics.
These are kind of the things that are going to be abstracted away, probably provisioned by the operations team. So they've got the DevOps, but the developer piece of the application is interesting. Look at the triggers and functions; I think that's interesting. This brings up the question of, okay, the next-generation cloud is here. We're seeing the AI impact. I think out of the gate, VAST will probably knock down some great AI work; we just saw that in the video. It's really positioned, timing-wise, for deep learning. Rob and Dave, with all the things we're seeing with large language models, large foundation models, this thing is a perfect solution for that data use case, and not just from an infrastructure standpoint; it's DevOps. You've got the developers and you've got the ops. So how do you guys see the impact of this from an AI perspective, deploying it out with the infrastructure and enabling the developers?

To me, it's scale, right? They referenced a couple of times, my words, I guess, but I think they used the words as well, learning systems. And if you think about things like autonomous driving, fully autonomous driving, the barrier, when you talk to deep AI experts, is that they're not true learning systems, right? And maybe that's because they don't have enough data to operate on. And so VAST seems to be moving in that direction; maybe not that it will become a learning system itself, but it could be an enabler for these learning systems.

And part of it is, in the keynote, in the upfront video, we saw, for instance, the schematic for how you would put together a self-driving learning architecture. And there were like dozens of components that you'd have to stitch together.
You can't get to the point where you can build higher-level applications until you unify and simplify, or I should say simplify through unification, which is what the VAST architecture does. So all the different types of data are coming in, let's say from millions of cars on the road, and you're bringing in the telemetry, but the telemetry can include all the video clips of where the driver had to engage or disengage the autopilot. And then you're looking at all those disengagement events and you're saying, okay, here's what that's related to, and that goes into a pipeline that then adds to the autopilot's learning routine. If you can break down the barriers between all those islands in the schematic, then it's much easier to build an autopilot learning system that is itself on autopilot.

And Rob, it's pretty ambitious. A lot of times you see companies come out of Israel that solve a really narrow problem. I mean, you worked for at least one. VAST seems to be biting off a lot, right? How do you see that in terms of the vision? Normally that part of the world thinks about things differently, but again, they tend to solve narrow problems. This is a big, big problem that VAST is going after.

I think it's not been an overnight thing. That's the important thing to realize: they've been working since they started to really solve this data management, data platform problem that's beyond just regular storage. And in the discussions we've had, you start to look at it and see that certain things have to come together. With any younger company, you have to have that kind of focus, that time over target, and having that mission. And I think the Israelis have always been really good about that as well.
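The disengagement-event pipeline George describes can be sketched roughly like this. The record fields and function name are made up for illustration; the idea is just filtering the incoming telemetry down to the disengagement clips and tagging them for the training set.

```python
# Hypothetical telemetry records; fields are invented for illustration.
telemetry = [
    {"car": "A", "clip": "a-001", "event": "disengage", "speed_kph": 92},
    {"car": "A", "clip": "a-002", "event": "normal", "speed_kph": 88},
    {"car": "B", "clip": "b-001", "event": "disengage", "speed_kph": 45},
]

def training_batch(records):
    """Select disengagement events and tag them for the learning pipeline."""
    return [
        {**r, "dataset": "autopilot-train"}
        for r in records
        if r["event"] == "disengage"
    ]

batch = training_batch(telemetry)
```

In the unified-platform picture, a step like this would run wherever the clips already sit, rather than requiring the telemetry to be copied into a separate training silo first.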
But again, looking at how you focus on the job at hand, even as broad as it is: they started somewhere, they built the foundation, they learned from the customers, they started to hear what needed to be there. They went to the data catalog and metadata management. I think that's how they build out.

Yeah, and I think they nailed unstructured and structured; that's table stakes. I think the data space and the data engine are the real callouts here, because the compute side has triggers and functions, very Lambda-like, very cool, a serverless vibe there. But more importantly, the data space brings the notion of namespaces to the data. So that's a big idea, and I think it's going to have a revolutionary impact if it gets traction with enabling the developers, because at the end of the day, data will move from infrastructure provisioning to automated delivery of data to developers. I think developers will write apps for data. I think this is where your Uber piece ties it together in my mind; this is the next gen to me. And so that's a bold initiative by VAST. I think this is a big idea. At any other time, without this big AI wave, we'd be like, hmm, ambitious, aspirational. But if you look at the AI workloads, they're massive data, and people are holding onto their data for compliance reasons, for legal reasons, for innovation reasons. So it's a perfect storm for VAST, in my opinion. I think this is a great time for them to roll out the mother of all platforms, if you will, for data. So it's a really good move.

The go-to-market is going to be interesting, right? Because you've got product-led growth. I think Renen said they're the fastest-growing storage company in history; I think that's what he said this morning. Well, we'll ask him. And you've got a deal with HPE. I mean, HPE will be all over this.
Dell has made an investment in this company. And Dell lacks that whole data platform play; they don't really have anything there from a software standpoint. I know they're thinking about it. But this puts VAST right in the conversation with the Snowflakes and the Databricks and the cloud guys, who are all going after this future data platform that you just wrote about, that we've been noodling on for years, but really intensely these past six months.

And that's their biggest challenge: they used to sell to the storage buyer, I think, and now they're going to sell to a developer, because this is a data platform. And so they have to reach that audience. Just as an example of the challenge, on one of the analyst briefings, the first question that came up was, do you support 60-terabyte flash drives now?

An analyst asked that question. Oh, God, you roll your eyes sometimes. I'm like, get on the Glue train. And Jeff was like, well, all this beautiful work we did, and you're asking about something like 47 sedimentary layers below.

Yeah. I mean, this is where the storage layer goes down-stack and commoditizes, but the abstractions are there. As we heard with NVIDIA, some would say a hardware company, but really a software company. Rob, we were at the Linux Foundation event in Amsterdam. We even asked the Docker CTO and the WebAssembly guys, hey, who decides where the data is stored? Why can't developers decide where the data is stored? Why can't they program with data? And they're like, hmm, that's interesting. Their minds were blown. So I think this is going to come down to that piece.

Well, the AI will decide, right? And probably do a better job than...

I think they don't care, right? They don't even care where the data is.
It just needs to be accessible where they need it and when they need it. Okay, that's the keynote analysis. We've got the CEO of VAST coming up. We're going to ask him the questions about the accounts they're winning and some of the growth, Dave. We're going to get into that next, here on theCUBE, live and on stage in Palo Alto. We'll be right back.