Good afternoon, guys and gals. Welcome back to theCUBE, the leader in live tech coverage. But you know that because you've been with us for over a decade, right? I won't even say since the last segment. Lisa Martin here with John Furrier. We have an 11-time alumni back on the program, Vaughn Stewart. Vaughn, this is your 11th time that we know of. Yeah, that we can count. That we can count. It's in the database, as far back as the archive goes. And as I do, I stalk people on LinkedIn, and I see that you've recently achieved VAST Data "Vastronaut" status. Congratulations. What does that mean? And do you have a onesie like the suit, I hope? I have a onesie, but it's different than the suit. But I need, yeah, I need my astronaut kit. Look, I'm incredibly honored and privileged to have joined VAST. Jeff Denworth, Renen and I had been doing the recruiting dance for a little while. And look, I had a great run at NetApp, right? The vendor that, they didn't invent NAS, but they became the NAS leader, changed the market dynamic, had a very extensible platform. Very fortunate to join Pure at a very early stage as well. They, again, were not the first all-flash vendor, but they revolutionized what it meant to be successful as a flash vendor. And if you had asked me a couple years back, what's next? I would have said anything but storage. And then Renen and Jeff were like, what do you think of VAST? I'm like, I think your architecture is really neat, but I'm done with storage. And they're like, let us tell you about where we're going. We're not storage, we're kind of storage. Let me tell you about where we're going. And it just lit up all the synapses. And I'm like, okay, I need to get there. I mean, they have a go-big-or-go-home message for sure. You know, referencing the whole Silicon Valley, you know, got to go big or go home, fail or knock it out of the park. It's looking really good. We had the launch event in our Silicon Valley offices, at theCUBE studio.
And what got my attention is they're kind of not a storage company, because they've got a new way to think about the data tsunami that's coming, or that's already here. And by the way, cybersecurity, any vertical, the data's going up exponentially, but budgets aren't. So you have a practitioner gridlock problem, a skills gap problem. This is the reality of the current situation, never mind go reinvent your infrastructure for AI. Those three things are huge current-situation dynamics, and people are up to their eyeballs, like, oh my God, what do I do next? So no one's really doing anything. There's no real global GSI out there saying, here's how you move all your data to AI, here's how you completely transform. So I find it very interesting that that's at the center of this disruption. What's the playbook? I mean, what do you guys advise customers? Are they ready to move? Is your product fit for this new use case? Take us through the value proposition. Right, so, were you going to say something? No. Okay. So for your audience, I'll just set the stage real briefly. Renen and our founders set out to address a challenge that's been in our industry regardless of disk or flash, SAN or NAS, no matter the vendor. And it was this trade-off: I can make highly performant storage, or I can make low-cost, large-scale storage capacity, but you can't have both. Do you want deep and cheap, or do you want highly performant? And the whole industry was trained around this model: I'm going to put my most recent data in the high-performance tier and let it age out. And so we made all these tiers, if you will. And the problem with tiers is that once you start shuffling data around, the ability to get back to that data is very difficult.
They had a really good vision about looking up and saying, is AI a standalone process, or is AI, or maybe more importantly accelerated computing, whether it's GPU, TPU, IPU, the next wave of the future? And that's where I think we're very different from the storage industry, and why we're a data platform rather than a storage array. Because what we believe is that you're going to have accelerated computing all throughout your infrastructure. And much like VMware when they ushered in the cloud wave, we're at the beginning of this need for customers to go in and make their infrastructure GPU-capable, or AI-capable. And it's not a silo. And I have all kinds of stories to share with you, whether we want to talk about HPC, large language models, or GPU-accelerated tools in the enterprise. So where do you want to start? Well, the GPU-accelerated tools, that's the current state-of-the-art process that all enterprises have been adopting. Nvidia's been doing great. Crypto was one of those use cases, but that's kind of lost favor. In comes AI, and that brings Nvidia to a whole new level. So you've got acceleration, which is its own category, and AI is booming. So everyone's just hoarding the GPUs. That's the story here. There's a supply chain issue, for sure. And what's really incredible for the size of the company that we are, right? We're just shy of 700 people. We are the storage platform for the major GPU cloud providers: CoreWeave, which is Nvidia's first and only elite GPU cloud partner; Core42, which was formerly G42; Lambda. And there's more. We're just not ready to share that with you. Lambda's new, isn't it? Wasn't that only announced last month? Relatively recent announcement. Lambda Labs, Paperspace. Don't know about Paperspace. Yeah, they're in the Nvidia DGX kind of circles.
What's really interesting there is all of those clouds started the way an HPC environment would. They went and bought a parallel file system, right? We're bringing in the GPUs, we're bringing in all the software that's GPU-optimized, obviously we need high performance, so you go to HPC storage. Parallel file systems, that's kind of where GPUs were first consumed at scale. Compute-wise, it's Linux grids. And what they found was the parallel file systems used in HPC can feed a GPU, but they're boutique and they're fragile, so they're not online all the time, which particularly matters in the service provider space. Parallel file systems are great if you can tune for the workload. You can't do that in a cloud. By definition of cloud, you have to support whatever's coming in, and it's unknown. You mentioned CoreWeave; CoreWeave's focus is basically AI-first, GPU-based virtual machines. Yeah. Okay, Nvidia DGX enables that. And we've been having a chat on theCUBE, Dave Vellante and I have been riffing on this on our CUBE research team. Remember the old white-box days in the '90s? You're starting to see these tier-two clouds emerge. So they're not saying, I'm a tier-two storage. They're saying, I'm a tier-two cloud. Well, they're not saying it, we're saying they're tier two. We say supercloud tier two. But it's almost like, Amazon's got all these assets, and you've got these new environments emerging from the semiconductors, enabling a new bare-metal-like environment to be purpose-built or specialty-based. This is a whole new wave coming. Do you agree with this? It's a whole new wave. And just look at Lambda, right? The number of research institutions that are their customers, whether you want bare metal or virtual machines, so you can do dedicated or shared infrastructure, and their costs are significantly lower than the hyperscalers'.
And look, we've got partnerships with all of these vendors and engineering efforts in tow with all of them, but they've found their niche, which is to go address this market in, I think, a more agile and more affordable model. Now let's see how that scales over time. They stand up pretty quick. So you're selling storage to them, or are you the storage provider for them? We're selling a data platform. Data platform, okay, good. I want to understand, given the hype cycle of AI, and we've seen this massive acceleration in the last year, Dave Vellante always jokes that AI was born in November of 2022. But obviously with ChatGPT and the rise of that, everyone's talking about it. How does VAST Data help customers get ready for the AI revolution, so it goes from the hype cycle to really making business impact? That's a really good question. And it allows me to go back to the point I made about how these clouds started with HPC storage, which then didn't meet the needs of their infrastructure. And so we're at this chasm, if you will. The lines are clear, right? Customers need large storage capacity, so there's an economic element to it, and they need HPC performance to feed GPUs. If you can make the performance equal to HPC, now the requirements change, and now customers say: I want enterprise-grade resiliency. I want secure multi-tenancy. I want data encryption. I want quality of service. I want a global namespace. I want to be able to burst into the cloud. This is all the fundamental first chapter, if you will, of VAST Data, what we've brought to market. And as I shared with you earlier, the last 10 years the storage industry's been focused on adopting flash. What's happened in an adjacency is you've seen all these data platforms pop up, right? Kafka, with stream processing going on. And as these large data sets have come in, they need to be enriched with metadata tags.
And so these start to become databases that sit around the storage array. And so we're just listening to our customers who have the largest data sets and the most demanding needs. And they're saying, I'd like a table. I'd like everything on the data platform indexed. I'd like it to be available via SQL queries, so I can process this data faster and I don't have siloed infrastructure. And as we announced at Build Beyond, we've got triggers and functions coming. And so I go back to: this is what innovation looks like. There isn't a storage array that does any of this. That's why we're trying to define a new category here around a data platform. I think you're right about the HPC angle here, because yesterday at the Dell community event, co-located at the Sheraton, the TACC director said AI vindicates the HPC way. Meaning, in other words, everything they've been thinking about, what they stand for, has been validated. It takes them up a level, but also gives them more paths to commercialization, not just research and playing the long game. Okay, so I believe that to be true. Then the next question is, okay, what's going to happen next? And we've been saying on theCUBE, I can't remember how many times we've been on theCUBE together saying this, that there's going to be a radical disruption around data management. If you look at how things were done the old way, or the soon-to-be old way, versus bringing in the AI way, it's changing significantly. You can't do governance the old way, in silos. You've got to build governance in from day one. You've got to have data addressability. You need speed, low latency, multiple database formats. So the idea of how you make storage decisions goes away. So I buy that argument that the compute platform is coming, hence the AI vindication.
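The idea described here, every object on the platform indexed with metadata and found via SQL instead of crawling a namespace, can be sketched with a toy catalog. This is a minimal illustration using SQLite, not VAST's actual database; the schema, paths, and labels are invented for the example.

```python
import sqlite3

# Toy stand-in for a data platform's built-in catalog: objects are
# indexed with metadata tags at write time, then located via SQL
# instead of walking a file tree. Schema is hypothetical.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE objects (
    path TEXT PRIMARY KEY,
    size_bytes INTEGER,
    content_type TEXT,
    label TEXT)""")
db.execute("CREATE INDEX idx_label ON objects(label)")

rows = [
    ("/genomics/run1/sample_a.bam", 4_200_000_000, "bam", "tumor"),
    ("/genomics/run1/sample_b.bam", 3_900_000_000, "bam", "normal"),
    ("/media/raw/clip_017.mxf",    81_000_000_000, "mxf", "b-roll"),
]
db.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)

# One indexed SQL query replaces a full namespace crawl: find all
# tumor samples over 4 GB, ready to feed to a GPU pipeline.
hits = db.execute(
    "SELECT path FROM objects WHERE label = ? AND size_bytes > ?",
    ("tumor", 4_000_000_000)).fetchall()
print(hits)  # → [('/genomics/run1/sample_a.bam',)]
```

The point of the sketch is only the access pattern: the catalog lives with the data, so tools query it rather than re-scanning storage.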
Every major market shift in our industry has fundamentally been fueled by taking something that has a lot of steps, collapsing them, and making it simpler, because when you can simplify it, the adoption can scale. And so, again, we tend to see two mindsets when we talk to folks who are acquiring GPUs. They either think it's a silo, or they have a three- to five-year perspective and say, actually my whole infrastructure needs to be AI- or GPU-enabled. And we're in that latter camp. And so with most of our deployments, it actually starts very siloed. But what separates us from the parallel file system folks is that the customers understand: I don't want to be shuffling data back and forth, say between my digital assets and some accelerated computing tools. I want the accelerated computing tools to read from the source, because that reduces time, increases efficiency, and reduces my cost. And so I think we're going to see this transition, and we're a leader in the data platform space to make that possible. Yeah. And the other thing that came out of the HPC community meeting was the optimization question. Do you optimize for more compute, GPUs, TPUs, xPUs, whatever processing units you have, or for networking? And so this is an open question as you start thinking about the modern infrastructure: they want lossless networking, they want the performance, what else can we get? So that's an open question. What do you optimize for? If you have a data platform, what happens next? You want low latency. Most of the people in the cloud I talk to, the hyperscalers, they're like, we've kind of figured out the processing game, now we've got to figure out the interconnects and the networking. That's a big part of the story here. What's your reaction to that? Do you agree, and how do you look at customers who have to figure out how to turn the knobs and get optimized?
So you bring up the network being a bottleneck, and we see that, right? Whether it's network speeds or the PCI buses or the systems connected to them, the GPUs can overrun a lot of the interconnects today. And so we have to work with customers through power considerations, fabric considerations. But when you boil it back, and I'll come back to HPC because we're at Supercomputing, even that market's changed, right? HPC used to be performance at any cost. You'd install clients and build out a grid, and that grid would age, right? You'd let Linux servers get five, six, seven years old. Then cybersecurity comes in, right? You get a security patch, you've got to go do a kernel update on the grid. I've got to go update 2,000 clients, I've got to go update the file system driver, and it's introducing more downtime. And so now when we talk to the research centers, they're telling us in droves: it used to be performance at all costs, but now I have downtime. Now, to attract a researcher or a research team, they want more parallelization. They want to research more, right? I need to be online, and I need to be able to do it in parallel. And again, this is where the market is shifting to what we're doing. And don't take my word for it, talk to TACC. Yeah, TACC's awesome. TACC hosts the Lustre User Group, and now they're a VAST customer. One of the founders of Lustre is the CTO at DUG, and DUG's been a multi-year customer of ours. The founder of BeeGFS works for us. So you're seeing a slow realization in that industry: I may be comfortable with managing parallel file systems because my team's done it for 20 years, but I move to VAST and I don't have to take any of those outages, because we don't have those maintenance constructs. So it's a paradigm shift. So it sounds to me like VAST Data is a rocket ship. I mean, you are a Vastronaut. I am a Vastronaut. I'm a certified Vastronaut.
In your opinion, as the VP of Systems Engineering for VAST Data, what's the secret sauce? You kind of alluded to this in the beginning from a career perspective, you were like, no more storage, but you saw something there. What was the secret sauce that got you onto the rocket? When you can see where the market is going, and referencing back to, I'm a firm believer that accelerated computing will be the norm in every data center, whether you're on-prem, in the cloud, or hybrid. So when you can see where that's going, and you can see there's a seismic delta between HPC systems, which are fast but boutique and fragile, and enterprise storage, which is robust and always online but not performant enough to meet the needs of accelerated computing. And you sit down with the founders of a company who are like, we've identified this. First is to solve the performance-cost-scale challenge. Next is to start folding in all these data services and data processing platforms, to give a richer way of leveraging your data, reducing time windows, reducing moving data around, reducing costs. When you have that capability, you can become a strategic partner to clouds, to industries. You become a strategic component within their product process and development chain. That's what's really interesting to me. You know, the thing about storage and networking and servers, and now the accelerated computing you're referring to, is architectural decisions. You've got to make trade-offs. One concept that's been here on theCUBE this week at Supercomputing is disaggregation: memory disaggregation, fabrics are back, interconnects, Ethernet. I mean, I was talking to someone at Broadcom and they're talking about 400 gig, okay? So Ethernet's getting faster. So there's a systems architecture mindset going on right now. So disaggregation and silicon diversity is a big topic. How does that connect the dots for this new environment we're moving into?
You know, I'm not sure we sent that question in, but I love having this conversation. So what do they say in our industry, right? There's nothing new, there's just a new version of it. So back in the '60s, you had compute rooms and memory rooms and storage rooms, and everything was, you know, disaggregated. And now that we've got fast interconnects with low latency and high bandwidth, there's this notion: do we disaggregate the compute layer? And if you take the next step and say, okay, well, how do I disaggregate the storage layer, or the data platform layer, I would challenge you to find any vendor here that's in a better position to leverage disaggregation than VAST Data, because we are disaggregated by nature. Our architecture is comprised of compute nodes that are stateless, that run containers, which are our points of protocol access, connected over a low-latency, high-speed fabric to JBOFs, right? There's no intelligence in our storage enclosures. And so we are a precursor to what, John, I think you're alluding to here. And I think you'll see more from us in the not-too-distant future that's going to continue to further disaggregate this architecture. It's an ecosystem boom, because what's happening is you have this new enablement, and what we're seeing here is almost a commercial ecosystem booming. And the new functionality, Lisa, we've seen this movie before: the VMware ecosystem. You know that ecosystem well. I might've done a little bit. I might've done a little bit in that space. You know, open source is booming right now, as hot as it could be. AI obviously has a systems aspect to it that people are trying to figure out at large scale, but certainly the training stuff's key. At KubeCon last week, Lisa, we heard this phrase from a Google engineer, Tim Hockin, a CUBE alumni. He said, quote, inference is the new web app. Okay, implying that that's where the value extraction is.
So, on the cost side, training: train with the GPUs, throw everything at it. Inference is what happens with the data. This is why the data architecture is going to be hugely important. What's your reaction to that? I agree. If you look at everything we're building in the data platform, sure, there are cost benefits, but it's really about time optimization, wall-clock time. So for example, if you look at our Trino pushdown: I can point Trino at an all-flash array that gets 400 microseconds of latency, or I can use the pushdown and talk to the VAST database, where we're able to actually break apart the Parquet file and, on a query, return just the data that was being searched for instead of whole Parquet files. And as a result, while we're an order of magnitude slower than, say, an all-flash block array, our results are 20 times faster. Parquet is going to change the entire data game. You watch, Parquet will absolutely democratize data lakes. It's going to destroy the advantage. It's already happening. All of our poor customers that have lived with HDFS, or the subsequent Kudu, and all of this noise, it's all just being wiped out by S3 and Parquet files. And again, skate to where the puck is. Open source, baby, open source. My last question for Vaughn in our last minute here. You've been there a short time, but you've obviously seen a lot. You described the rocket ship, the secret sauce, the differentiators. What's your favorite customer story that really shines a light on significant business outcomes that VAST Data's platform is enabling customers to achieve? I'm sure there are many. There are a lot. I've been blown away. The scale and size of the customers, the scale and size of the deployments, and how customers are using our technology across life sciences, media and entertainment, research. It's all been a whirlwind that fires a lot of endorphins.
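The pushdown point here, returning only the rows and columns a query actually asks for rather than shipping whole files, is the core of why columnar formats like Parquet win. A rough stdlib-only sketch of the idea follows; the data set and the byte accounting are invented for illustration, and real Parquet engines do this with column chunks and row-group statistics rather than Python lists.

```python
# Contrast a row-oriented scan (read every field of every record)
# with a columnar scan (read only the column the query needs).
# Toy dataset; string lengths stand in for the I/O a real engine
# would issue against storage.

rows = [
    {"ts": i, "user": f"u{i % 5}", "bytes_sent": i * 100, "region": "us"}
    for i in range(1000)
]

# Row layout: answering "sum of bytes_sent" touches every field.
row_io = sum(len(str(v)) for r in rows for v in r.values())

# Columnar layout: the same data pivoted into one list per column.
columns = {k: [r[k] for r in rows] for k in rows[0]}

# The query now reads exactly one column's worth of bytes.
col_io = sum(len(str(v)) for v in columns["bytes_sent"])

answer = sum(columns["bytes_sent"])
print(answer, col_io < row_io)  # same answer, far less I/O
```

Pushing the filter or aggregation down to where the columnar layout lives is what lets a higher-latency system still return results faster overall: it simply moves far fewer bytes.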
I think right now, some customers are like, AI is a silo. If I'm doing a large language model, I'm doing training, I want a better chatbot, okay, maybe it's a silo today. What really moves me is the customers who are able to say, my mindset is to bring the GPUs to the data, not bring the data to the GPUs. What I mean by that is an AI-enabled infrastructure, versus just doing a modern version of ETL: I'm going to bring my life-cycle data into my AI silo, process it, and put the data back. That feels like a very tactical mode. Now the conversation is bring your workload to the data, not just the GPU, bring the entire workload over. Whole new ball game. So obviously the data grab is big. But what really blew me away, going back to customers, is that I was skeptical customers would want to put their key assets in the cloud, for lots of reasons: IP leakage, data sovereignty, compliance and regulatory reasons. But now we've got our hybrid capability, a global namespace that'll span across multiple sites, and now also into the cloud. And you can replicate your data if you need to. But what's been really interesting is the number of customers who are like, look, I just want a cached view of my data from the cloud, because the cloud has tools that I don't have, and I want to run those tools on my data set. But I don't want to move my data, because egress charges mean I'll never be able to get it back. I want to make my data accessible to these tools, and maybe it's a tool today on cloud A and tomorrow on cloud B. And so there's this notion the industry's fought with: how do you make data universal and not get bound to all those constructs? Look, the secret sauce about any compute platform is that data is what holds you to the compute. And so we're breaking some of that as well. I love that, making data universal.
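The "cached view" pattern described here, where cloud tools read through a local cache of on-prem data so only the blocks actually touched ever cross the wire, can be sketched in a few lines. This is a conceptual illustration, not VAST's implementation; the block-granular fetch function and the fetch counter standing in for egress cost are both invented for the example.

```python
# Read-through cache sketch: a cloud-side view of on-prem data.
# Only blocks a tool actually reads are fetched (and would incur
# transfer cost); repeat reads are served locally for free.

class CachedView:
    def __init__(self, origin_read):
        self.origin_read = origin_read  # callable: block_id -> bytes
        self.cache = {}
        self.fetches = 0  # stand-in for egress/transfer cost

    def read(self, block_id):
        if block_id not in self.cache:
            self.cache[block_id] = self.origin_read(block_id)
            self.fetches += 1
        return self.cache[block_id]

# Hypothetical on-prem origin holding 1,000 blocks.
origin = {i: f"block-{i}".encode() for i in range(1000)}
view = CachedView(lambda b: origin[b])

# A cloud tool scans 10 blocks, revisiting several of them.
for b in [0, 1, 2, 0, 1, 3, 4, 0, 5, 6]:
    view.read(b)

print(view.fetches)  # → 7 (unique blocks fetched, not 10 reads)
```

The design point is that the full data set stays at the origin; the tool sees the whole namespace, but transfer cost scales with what it touches, not with what exists.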
Vaughn, thank you so much for joining us for your 11th episode, that we know of, on theCUBE. Congrats on what you're doing with VAST Data as VP of Systems Engineering. We'll be keeping our eyes on this space. Always a pleasure. Thank you, Lisa; thank you, John. For Vaughn Stewart and John Furrier, I'm Lisa Martin. You're watching theCUBE's live coverage of SC23 from the Mile High City, Denver, Colorado. Stick around, John and Savannah are up next with our next guest.