Hi everybody, we're back at GTC 2024 in San Jose, California, this is Dave Vellante, you're watching theCUBE. John Furrier is also in the house, we're broadcasting all day wall-to-wall coverage of GTC, and Josh Patterson is here. He's the co-founder and CEO of Voltron Data. Welcome to theCUBE, thanks for coming on. Thank you for having me. All right, first question. Why did you found the company? Before Voltron Data, I was actually working on data processing for about five years at NVIDIA. I created the RAPIDS ecosystem with Keith Kraus and Mike Nguyen and others from the team, and we showed that GPUs were great at data processing, machine learning, graph analytics, and the rest of the data science ecosystem that wasn't deep learning, but it was difficult to use. A lot of the frameworks in data analytics weren't designed for GPUs and accelerated computing, and so we were getting good performance but not great performance, and we wanted to push the bounds a lot further. We wanted to make it simpler, we wanted to make it easier for enterprises to deploy GPUs at scale for data analytics, and we also thought that data analytics could be significantly cheaper and faster. So we started Voltron Data, we brought in a few other companies over the years, and really the rest is history. Ursa Computing, the BlazingSQL guys, Territide, Xiland Data, and most recently we acquired the Claypot AI team, and we've just been laser focused on making it easier to build large-scale distributed systems using NVIDIA GPUs. Thank you for that. So can we go through a little bit of history of the analytics business? I mean it kind of starts with, let's not go back too far, but you had the Cognos cubes if you will, and then Hadoop came along and we were very excited, the whole idea of bringing five megabytes of code to a big hunk of data, but it was too complicated. Then Spark comes along, it simplifies that with the Spark execution engine, and Cloud obviously came in, cloud databases, separating compute from storage, and then boom, AI comes. So you're seeing a lot of the analytics companies, they go out, they buy AI companies, they're picking up talent, and now the big mantra is bring the AI to the data. Okay, so how are you guys different? What are you able to do? You're obviously AI native, if I can use that term, or at least proximate to when- Accelerator native. Yeah, okay, there you go. That's the one we like to say. Accelerated computing native, at least proximate to the time frame. It's not like you were founded 15 years ago and you're having to retool. But so why does the world need you versus those existing analytic systems? So when you think about the transition between Hadoop and Spark, Hadoop had this old MapReduce paradigm. You would read data from disk, you would do some analysis, you would write to disk, read back from disk, and you'd do this read-write from disk over and over. Map it, reduce it. Map it, reduce it. That's right. Spark said, let's keep it in memory as long as possible. You'd spill gracefully, but it was really moving us into this in-memory computing era. You saw these 10 to 25X speedups going from this MapReduce paradigm to this in-memory paradigm. It's the same thing with GPUs. And so when we first started at NVIDIA, we were trying to figure out how to make NVIDIA GPUs better at data analytics. The first thing we said is, what if we could keep data in the GPU longer?
Instead of using the GPU as a co-processor where we just pull data in to do computationally hard things, what if we did everything in the GPU? Decompression, decoding, reading CSVs, doing string parsing, regex engines, joins, group bys. And what if we used system memory kind of like how Spark would use disk? We would spill to that. Tier it. Tier it, exactly. And so we inverted the whole thought process. Instead of using GPUs as a co-processor for computationally hard things, GPUs are the processor for everything. And what we realized is we unlocked a lot of speedups, but then we couldn't feed the GPUs fast enough. There was still a lot of overhead with Java-based systems just because of the JVM and garbage collection, and Python-based systems really couldn't keep up with the speeds and feeds that we needed. And so we really took a deep look and started back from the ground up with a C++-based distributed execution engine really optimized for full-stack acceleration: GPUs, high-end networking like InfiniBand and RoCE, as well as flash storage. I'm sure you all are familiar with all of that, WEKA, DDN, VAST Data, et cetera. And so when you fully leverage all these things, like GPUDirect Storage and UCX for really pushing InfiniBand, you get these really massive speedups. And so that's what we do as a company. We focus on the data pipeline side of AI. How do we do ETL? How do we do feature engineering? How do we do data pre-processing at scale? So we can feed these large-scale machine learning, AI, geospatial, and routing optimization systems a lot faster and easier. So that's really the differentiation. Everyone's still trying to do data processing on CPUs and then GPUs for the AI. And we're like, no, let's just move everything to the GPU so we can get this whole end-to-end acceleration. And if the workload is GPU friendly, if you will, then that works well. You don't want to use this for some older, general-purpose workload, that's fine, a GPU would be too expensive. But you're unlocking the power of the GPU, utilizing it to a much greater degree, and exploiting the high-performance storage and high-performance networking in ways that are novel. Absolutely. And so to your point, you're like, use the GPU for what's important. This is honestly why we target problems that are 10 terabytes and above. Not how much data you have, but how much are you actually querying at any given time. And so at 10 terabyte queries, you really start to use a lot of CPU cores. Even at 10 terabytes doing TPC-H, you plateau at 200 nodes, around 6,400 cores. Even if you add more cores, more nodes, it doesn't get faster. And we can now outperform a 200-node Spark cluster with two DGX A100s. And so shrinking that back down, that 100x reduction in servers, not only is it faster, but it's also more energy-efficient, more space-efficient. And it allows us to scale to these 100 terabyte problems. And that's what we announced this week at GTC. Basically, these are performance benchmarks of our distributed query engine at 100 terabytes, 30 terabytes, and 10 terabytes. Oh, awesome. Okay, so my alternative would be I could use the Spark execution engine. I suppose I could use Elastic MapReduce in the cloud. I could go back 10 years and try to redo Hadoop, but none of that would make sense. Okay, so you're solving a problem that is large enough that the economics just make sense to do it with Voltron Data. It's large enough to merit a GPU. And it's important.
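(To make the "everything on the GPU" idea above concrete, here is a minimal sketch using RAPIDS cuDF, the GPU DataFrame library Josh's team created at NVIDIA. The file paths and column names are hypothetical, but the pattern of reading, joining, and grouping entirely in GPU memory is the one he describes.)

import cudf  # RAPIDS GPU DataFrame library

# Read raw files straight into GPU memory -- decompression, decoding,
# and parsing all run on the GPU rather than on CPU cores.
orders = cudf.read_parquet("orders.parquet")    # hypothetical path
customers = cudf.read_csv("customers.csv")      # hypothetical path

# Join and group by without the data ever leaving the GPU.
joined = orders.merge(customers, on="customer_id", how="inner")
summary = joined.groupby("region").agg({"order_total": "sum", "order_id": "count"})

print(summary.head())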
When you have 50 terabyte, 100 terabyte, 200 terabyte queries, you're probably doing something important. And so, yes, that's a great time to use GPUs. Not only does it free up your CPU cluster for these more BI-like workloads, it also feeds into these AI systems a lot faster. People can iterate their model training faster. You know, one of my favorite stories is back in 2020, the NVTabular team at NVIDIA showed at the RecSys competition that doing more feature engineering faster, being able to iterate on feature engineering faster, actually produced better models than throwing fancy deep learning models at the problem. And so the more feature engineering we can do quickly, the better business insights people can get, and they save money. Okay, so let's get into, I want to understand the product better, but let's start with sort of the use case. So, what's a good representative use case? Let's start there, and then I want to say, okay, that's my use case. How do I get started? What do I have to do? What are the deployment parameters, et cetera? So, one of the simplest use cases is really in retail: forecasting. You have a lot of goods. You have terabytes of data of what was bought last week, the week before that, the week before that. You want to apply some type of decay factor. More recent purchases have a higher weight than past purchases. And then you want to basically do a bunch of feature engineering and ETL to get that ready for some type of AI or ML model. Whether it's PyTorch or XGBoost, you still have to do tabular preprocessing. I want to join different data sets together. I want to merge in distribution data, I want to merge in weather data, holidays, all these different things to build this massive corpus. And then you want to do feature engineering. So really nothing different than what Spark, Presto, Trino do today. But when you're a retailer, if you're a small retailer, that might be a four or five terabyte query. If you're a Walmart, a Target, a Home Depot, that's dozens of terabytes. And so when you get to these really large queries, the faster you can preprocess them in a myriad of different ways, the better forecasts you can build, and a better forecast just means you save more money. So you are a query execution engine, a query acceleration engine. Is that right? Is that the right way to think of it? In other words, I leave the data where it is. I bring the compute to the data. Absolutely. So we are an accelerated query engine. It's Kubernetes native, cloud native. So essentially it's all ephemeral, it's containers. We spin up a cluster in seconds, and we can pull raw data directly from network-attached storage into our engine. We can basically fully utilize InfiniBand and pull data in at line rate and just analyze the data. Just do what you would normally do: joins, group bys, filters, aggregations. And then we write that data back to other open formats, whether it's Parquet, ORC, Avro, JSON, or you can just push it through Arrow Flight directly into a machine learning system. I'm trying to think, I mean, most analytic data platforms are very limited in terms of the number of complex joins they can do. Is that a problem that you solve necessarily? They're limited by performance and time, primarily. And when you take something from hours and you can now run it in seconds, you can do a lot more joins, you can do a lot more group bys, you can do a lot more complexity. That's probably the best way to say it.
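(To ground the retail forecasting example, here is a minimal sketch of the decay-weighted feature engineering Josh describes, again written with cuDF. The table names, columns, and the specific decay factor are all hypothetical; the point is the pattern of joining reference data onto the sales history and down-weighting older purchases before handing features to an XGBoost or PyTorch model.)

import cudf

# Hypothetical inputs: weekly sales plus weather and holiday reference data.
sales = cudf.read_parquet("weekly_sales.parquet")   # item, store, week, units
weather = cudf.read_parquet("weather.parquet")      # store, week, avg_temp
holidays = cudf.read_parquet("holidays.parquet")    # week, is_holiday

# Join the reference data onto the sales history.
df = sales.merge(weather, on=["store", "week"], how="left")
df = df.merge(holidays, on="week", how="left")

# Exponential decay: more recent weeks get a higher weight.
decay = 0.9                                          # hypothetical decay factor
latest_week = df["week"].max()
df["weeks_ago"] = latest_week - df["week"]
df["weighted_units"] = df["units"] * (decay ** df["weeks_ago"])

# Aggregate into per-item, per-store features for the downstream model.
features = df.groupby(["item", "store"]).agg(
    {"weighted_units": "sum", "avg_temp": "mean", "is_holiday": "max"}
)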
And so, absolutely, with just our brute-force performance and speed, it allows people to do more things quickly. So what do I buy from you and how do I deploy it? So what you buy from us is access to our software. It's an enterprise software license model. We license our software, you get our containers, you spin them up on your Kubernetes distribution, and you're off to the races. It's really simple. We can tie into existing logging, authentication, and security systems that you might have within your environment. We are not selling you a SaaS. We're selling software. We don't want to own your data. Data has gravity, data is expensive. We want people to just have this ephemeral compute that runs on their NVIDIA GPUs to fill those underutilized cycles with really, really fast data analytics. We also license it through partners. And so we partner with HPE, and we're working on other partnerships today where they can embed our engine directly in their product. And so, seamlessly, without even knowing you're hitting a GPU, people can write code as they would have done normally, and then it will target GPUs, pull data from your data lake, and you're off to the races, and you're just happy that your queries are coming back faster and cheaper. So, you charge me an ELA, or you charge me per query, or how do you charge? Our pricing model right now is an unlimited license model. When you're talking to people about hundreds of terabytes, they don't really want metering. A lot of our early customers are in DoD and in the intelligence space, and they don't want usage monitored. And so we went with the unlimited pricing model. It's your data center, it's your hardware, it's your code. We're just going to make it faster and give you a better engine to run on. So, I write you a check, I get your software, and I can use it, and then how do you keep it up to date? It's not a SaaS, right? So what do I do? I come back every couple of years, you ship me updates? We have a private forum for people to file bugs. We have new releases every six to eight weeks. We add new features, new functionality. So it's really kind of going back to just a traditional database model with software. And what's the vision for the company? Oh, that's a great question. We believe that this engine is going to be so powerful that the direct users are going to be prevalent, but really it's what are people going to do with the engine in their own products? We are really a partner-first company. We want people to embed Theseus into their products. We are actively talking to a few different SIEM companies, security information and event management systems. We would love for Theseus to be powering the next generation of SIEMs. We want people to start building more industry-specific platforms on Theseus, whether it's genomics or anti-money laundering. We really want Theseus to be the backbone of data analytics at scale. And so regardless of the industry, regardless of where you deploy it, whether it's cloud, on-premise, air-gapped, when you're experiencing these large-scale 30 terabyte or above queries, we want people using NVIDIA GPUs and Theseus. And what's your funding look like? Where are you at? We raised about a hundred and ten million across our seed and Series A. Nice. We did that in early 2021 and 2022. Oh wow, okay, you got it right in under the wire. Well done. Thank you. So we're pretty well funded, and it's expensive to build a data engine. I mean, anyone who's built a database or a query engine knows it's a lot of work, a lot of effort.
And I'm actually super impressed with how fast the team built it, how large it scales, and the time we've done it in. You know, Jensen was talking in his keynote, he must have used the word digital twin, I don't know, 10 or 15 times. That whole idea of a digital representation of your business, I think about, you know, analytic systems today are largely historical systems of truth. That's assuming you can get the data correct. Do you see a day where you start to bring in transactions and you actually can have a real-time digital twin of your business, people, places, and things? Is that something that you think is technically feasible in the near to mid term? Are you asking, is HTAP going to happen? Yes. Well, more than HTAP, right? I mean, I was joking a little bit, but yes. I actually think that's where we're going. And so the world is a combination of real-time data and then using batch processing to power other things which can simulate the possibilities. And so you kind of need to be one step ahead if you're going to do a digital twin across a permutation of different outcomes. And so as data's coming into a system and as a system's changing, you need to be able to record that and allow that to manipulate data, as well as using that data to kind of predict what's the next step out. And so right now, the systems are kind of disjointed. How the world is actually moving data is very different from how we change the environment and how that impacts other things around it. And so being able to move transactional systems and analytic systems closer together will allow us to do much more sophisticated digital twins, faster, you know, discovering these types of environments. And so one of the things about being able to do analytics very quickly at large scale is we can actually start to marry these systems together. And we've even shown that we can basically read NoSQL and other non-columnar formats just as fast as we can read columnar formats, just by using all the same hardware bypasses and the speed of GPUs. And so I actually believe that, you know, very soon we're going to start to see these really elegant solutions that are quite simple: one data store, both operational data and analytic data, powering both AI and, you know, simulations. And when Josh was asking me whether HTAP is going to happen, he's talking about hybrid transactional/analytical processing, which, you know, you could say is kind of here today when you think about MySQL HeatWave. They've sort of got an analytic engine and a transaction engine in a big honking memory. But I'm really talking about a semantic layer where you can make all those different data types that you were just talking about coherent. You could do large-scale joins, and it starts to get down to sort of rethinking how you even lay out data on disk or flash, really taking things that databases understand, think of them as strings, and turning them into things that humans understand, like people, places, and things. And, you know, you think about natural language processing and the AI era. That's a vision that we'd love to see happen. We just don't know if it's technically feasible in, you know, this decade. But you think it is. Performance makes all things possible. And money. But I mean, I think that's the media story. We don't have enough money, so it's like, we have...
We keep making GPUs faster every generation, and the faster the GPUs get, the better AI gets, the better inferencing gets. We start to really pull down this time constraint. And so these systems start to kind of, you know, evolve gracefully just out of pure performance and speed. And so all the things you were saying are exactly right. But if I can take data in a format that might not be ideal, but I can just read it in so fast and manipulate it so fast and then convert it to a new format at real-time speed, these lines just get very blurred quickly. Imagine what that does to your supply chain and drug discovery. I mean, there's so many use cases. Routing optimization. Right, absolutely. All right, give us the pitch for your company. Give us a little elevator pitch, you know, before we exit here. Sure. Voltron Data is the leading designer and builder of data systems. We want to make the next generation of data systems as efficient as the systems of today, even as data grows. Data will be 10 times larger, you know, in five years, and we want it to be easier than it is today to build these systems. Josh, fantastic. Thank you so much for coming on theCUBE. It was really a pleasure having you. Thank you. All right, and good luck. All right, and keep it right there, everybody. This is Dave Vellante. John Furrier is also here. GTC 2024 from San Jose. You're watching theCUBE.