I'm Rob Strechay, managing analyst with theCUBE Research, and today we're going to talk to Ed Walsh, CEO, and Thomas Hazel, CTO and founder, of ChaosSearch. Welcome back to theCUBE to both of you, and really excited to have you both here today.

Thanks for having us, it's always great.

Yeah, and especially because this is a launch day. It's always a little bit more exciting when there's fresh stuff to talk about, so let's dive into that. What is ChaosSearch announcing today, and how does that really change what you guys do?

I'll take the first shot. So listen, today we're launching Chaos LakeDB. It's a data lake database. In fact, it's been the foundation of our SaaS-based service for years. I'll go through it in a little bit, but we're announcing that Chaos LakeDB is also available as an embedded database for service providers. You take your standard cloud storage and we transform it into a hot, analytic database. And note we didn't say we put a database on top of S3 or ETL out of S3; we make it actually a hot, analytic database. And then we provide rich analytics on top: think Elasticsearch-type analytics via the Elasticsearch API, or SQL analytics, and then of course GenAI on top of it. So you simply stream your data into any of your object stores around the world and it's available immediately for hot analytics. We deal really well with both the operational use cases and the business use cases. Operational use cases would be things like observability or security lake use cases. But there are also business use cases: all that data is the digital footprint of everything going on in an application, and that's really important to the business side. So our end users end up with the ability to save time, cost and complexity. They free up their time from managing all these complex pipelines and managing these different clusters.
They unify these use cases, but they also save costs. We're saving 50 to 80% in hard dollars, while giving people a lot more retention, which is what they're really trying to do in their environment.

And we'll dive into that a little bit later, because, being a bit of a data geek myself, I think getting down into why LakeDB is different is key, and we've got Thomas here so we can both geek out a little on this. Having seen a lot of different database and lakehouse announcements, what was the thinking behind bringing the two together? Because I think there are extreme advantages. And I know we were talking about things like ingest; let's talk a little bit about that, because I think that's really key for people to understand.

Well, back in 2016, the idea of transforming your object storage into a high-performance, at-scale, real-time database seemed a little strange, odd. Now it makes a lot of sense, but the idea was that's where all the data was going, because people were worried about their pipelines failing and their databases failing, so they were streaming into services like S3. And so I thought, well, what if I could transform that data, where it lives, into a search and SQL at-scale database? We're talking not small data; we're talking billions to trillions of events, terabytes to petabytes, but doing it in a way that is cost-effective and simple. So we're a database with a lake philosophy, and that's the big change, that's the big difference. A lot of times you think about data pipelines, ETL and schema definitions; no, stream it into your lake of choice. We auto-discover the schema, we index it without you having to worry about it, but we publish well-known APIs like the Elasticsearch API, with no Elasticsearch under the hood and all that pain, or the SQL API via Trino. And then this GenAI: conversations with your data.
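The dual-API idea described here, one indexed data set in object storage answered through either an Elasticsearch-style search or a SQL query, can be illustrated offline. This is a hypothetical sketch: the payload shapes below are modeled on the standard Elasticsearch query DSL and Trino-style SQL, not on ChaosSearch's documented API, and the field and table names are made up.

```python
import json

def es_search_payload(field, value, last_hours=24):
    """Elasticsearch-style query DSL: match one field over a time window."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: value}}],
                "filter": [{"range": {"@timestamp": {"gte": f"now-{last_hours}h"}}}],
            }
        }
    }

def sql_payload(table, field, value, last_hours=24):
    """The same question phrased for a SQL (Trino-style) endpoint."""
    return {
        "query": (
            f"SELECT count(*) FROM {table} "
            f"WHERE {field} = '{value}' "
            f"AND ts > now() - interval '{last_hours}' hour"
        )
    }

# One question, two dialects, one underlying indexed store.
es_body = es_search_payload("status", "500")
sql_body = sql_payload("app_logs", "status", "500")
print(json.dumps(es_body, indent=2))
print(sql_body["query"])
```

The point of the sketch is only that the same underlying index can serve both dialects, so an observability team and a BI team can share one copy of the data.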
It's a one-two-three punch.

Yeah, and I think that's important, because it becomes part of a strategy for an organization and it can be embedded underneath. You're not just selling ChaosSearch anymore, you're selling the LakeDB underpinnings, which you've spent years on. This is years in the making; it's not a V1, this is V-whatever, six or seven years later, right? So it's been fully tested out and vetted at scale.

And if you think about why these service providers or platform providers are looking for this technology, it's pretty simple. There's the architectural simplicity, but also: they're innovating on the application layer or the model layer, especially with GenAI, where we see a lot of model innovation, and we're innovating on the data layer. A lot of people are integrating what they can on the data layer, not innovating on it. So we're a natural fit for them. What they get is the ability to tell their customers, hey, I can give you the cost-effective retention you're looking for. And customers are looking for a lot more retention, for a lot longer, and that's hard to do if you don't have this innovation. It also gives you architectural simplicity without doing a whole bunch of tiering. Some of these platforms have five or six different types of tiering to get the cost-effectiveness, and even then they don't get the capacity. But it's really the COGS.
So we talk about this cost of goods. For these big service providers or cloud service providers, let's say they have a 60% margin business. Do the quick math: that's a good margin, maybe not world-class. But in these platforms, 80 to 90% of that 40% cost of goods is actually the platform itself: the network, the compute, the storage to do all the analytics. If we can free up 50 to 80% of that, literally they're seeing their margins go from 60% to 76%, even 86%, and that's game-changing. And you might say, okay, they're making more money, I get it. No, no, it's billions of dollars in valuation, which changes overnight.

Yeah, and I think in GenAI's case in particular, you're seeing a lot of people saying, I want to have this LLM, or as we would call them SLMs, a segmented language model, over here, and then I want this one over here, and this one over here. Maybe this is for the CFO's organization, maybe this is for marketing's organization, where they're trying to keep it private. So, again, you hit on it briefly, but being able to be anywhere in the world with the data localized so that you can meet regulations like GDPR and a number of the different European ones, and some of that's coming to Canada, I think it's PIPEDA up in Canada, all of these things are really key. But how does that affect the latencies, and how does that affect the performance? Because that's the key.
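The margin arithmetic above can be checked with a quick sketch. The numbers are the ones from the conversation (60% starting margin, so 40% COGS; 80 to 90% of COGS is platform cost; 50 to 80% of that platform cost saved); nothing here is a real customer figure.

```python
def new_margin(margin, platform_share, savings):
    """Gross margin after cutting `savings` of the platform portion of COGS.

    margin:         starting gross margin (0.60 means 60%)
    platform_share: fraction of COGS that is platform (network/compute/storage)
    savings:        fraction of that platform cost eliminated
    """
    cogs = 1.0 - margin
    reduced_cogs = cogs * (1.0 - platform_share * savings)
    return 1.0 - reduced_cogs

# Conservative end: 80% of COGS is platform, 50% of it saved.
low = new_margin(0.60, 0.80, 0.50)   # 0.76, i.e. the quoted 76%
# Aggressive end: 80% of COGS is platform, 80% of it saved.
high = new_margin(0.60, 0.80, 0.80)  # 0.856, roughly the quoted 86%
print(low, high)
```

So the 76% and 86% figures in the conversation are consistent with the stated savings range, not independent claims.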
I mean, everybody wants their data in object storage like S3: it's cost-effective, secure, durable, available all over the world. However, it's slow, or at least it's thought of as slow; it's thought of as an ETL location or an archive. Well, the audacity of us (we have academic papers on our website, take a look), we had the audacity to come up with new computer science, a new architecture, that allows that first read to be high performance. That's something that's really hard to do with existing technologies like Lucene and Parquet. What we did was create a new representation, a new architecture, that allows large-scale search and large-scale relational joins, and does it in a way that's just automated, like a lake would want to be, but in a high-performance way without that tiering, without that cost. And that's how we're uniquely priced; we can talk about pricing later, but when you're dealing with that type of scale, everything's expensive, and we uniquely drop that price because of the innovation, because of that low latency. Chaos LakeDB.

And I'll give you an example of that. Land your data in any of your S3 locations or GCS locations around the world, and it's in your dashboard, anywhere in the world, in a minute. It's amazing what we can do. Now, we are leveraging, and in fact the GM of Amazon S3 gave us a really good quote for the announcement, the attributes: our attributes as a database are based on the attributes of S3. Think about the durability, the security, the availability of those solution sets. Land it, and it's available immediately, so you pick up all those attributes. But literally, land it anywhere in the world and it's in your dashboard in a minute, and queries are in seconds. We don't have provisioned storage, right? It is a service.
Think about it: we have customers go from, let's say on Black Friday, 50 terabytes of ingestion per day up to 500. We don't have to provision that storage. We just have elastic compute that scales up, scales out, or scales down. And the innovation is in making that fast, right? That's where the six patents go. Anyway, that's why we're excited about this.

Yeah, I think that's a good jumping-off point, because customers want to understand what's in it for them, and you have some pretty big customers who have really leaned into LakeDB. One of them in particular is Equifax, who just released a white paper with you, and they don't take that stuff lightly, having been on that side of the fence myself. So what was the story with Equifax, and how are they using it?

Yeah, so I'll jump in, and then you jump in. Listen, if you go across all our customers, the story is very consistent, but I'll talk about Equifax specifically. If you look at Klarna, Cloud Imperium Games, Cisco, Equifax, it's consistent: they look at unified data. They bring things together, and they're able to do a lot of consolidation, but also automation. What Equifax did is similar. They're able to save, in fact, after two years in production (and it is hard to get these things in writing), a 90% hard-dollar savings, 90% savings on what they were doing. And they went across multiple divisions, and they're running across both Amazon and Google. They're isolating their data where they need to, in different geographic areas for different business units, but they're able to aggregate and consolidate all that data onto one platform. And then, with the platform we have, you give stateless compute to different groups to spin up, so they're not conflicting with each other, to use this one unified data lake for both operational and business use.
So they're using it for operational use cases, observability, and they're all using it for the security data lake as well. And then there's always the business side that wants to get at those data sets and go after them. Their case study is actually on the website. I love the quote from Jeff Kinshurf, who ran worldwide SREs (he's gotten a big promotion since), a great executive. He looked at the technology and said, listen, I don't think about where I'm landing data; it's just there to query. If you think about it, think about all the things that leaves out. In the database world, that's normally what happens: the pipeline, the pipeline to that data lake, and that data lake, and oh, I have to replicate it from here to here. That's all taken care of in the platform, largely based on this fabric of object storage. The other thing he says is, I also looked at disparate solutions around analytics, and by bringing it back together I save a lot of time and complexity. So you'll see that pretty consistently. But yeah, Equifax is a great customer, and it's a great customer study, right on the website as well.

And when we started talking to them early on, now this is many years ago, they were moving to the cloud, and they had a conceptual architecture they wanted to build out for their business units. And we were the only company that had that vision of transforming your data lake into this elastic analytical fabric that allows business users to move quickly and easily, and that was the key thing: I want a new stream of data to be available within the day, not the weeks or months of toil that typically goes on.

Yeah, that to me is one of the biggest things: ingestion is always a problem. There are people who've said, hey, we're going to land it in S3, and then we have an ODBC driver that we've gone and open-source built, and now what's the scalability of that? What's the complexity of that?
This is all gone, right, with what you're doing.

Well, not only do you have to make object storage fast, and that's hard enough, but there's the data lake philosophy. We know what lakes are great at: just stream in opaque data and worry about it later. But that takes time, cost and complexity. We had the audacity again to say, what if we auto-discover the data stream, fully index it, and allow you to publish views, virtual views, that can be search or SQL? If you do that, all the complexity of setting up a pipeline and setting your schema goes away. Now, with our technology, you can virtually change the schema after it's been ingested. You don't have to worry about re-indexing, particularly at petabyte scale. Solving that problem of stream it in and it's immediately indexed and made available, that's something no one else does, particularly in this lake warehousing viewpoint, which is more of a warehouse than a database. With this live ingestion, we're more like a database than a lake could ever be.

Right. And if you think about it, the other thing about the pipeline is that it's a lot of time and energy and everyone wants to avoid it. But also, we take data close to raw, as raw as you can give it to us, and we index that fully, which lets you go back later. In a pipeline, you naturally throw out data; you're really trying to put unstructured data into a structured schema, so you're deciding in advance what you're not going to keep. And then, of course, people aren't even keeping the whole lineage of log types, because they just can't afford it. With us you can constructively do it: we take it as raw as you want, nested JSON types of artifacts, and later you can go through it at any level of detail.
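The two ideas above, auto-discovering a schema from raw streamed records and then reshaping that schema after ingest through virtual views, can be sketched with a toy. This is not ChaosSearch's implementation, just a minimal illustration of the concepts; the record fields and view mapping are made up.

```python
from collections import defaultdict

def discover_schema(records):
    """Union the field paths and value types seen across raw JSON-like
    records (a toy stand-in for automatic schema discovery at ingest)."""
    schema = defaultdict(set)
    def walk(obj, prefix=""):
        for key, val in obj.items():
            path = f"{prefix}{key}"
            if isinstance(val, dict):
                walk(val, path + ".")   # flatten nested objects as dotted paths
            else:
                schema[path].add(type(val).__name__)
    for rec in records:
        walk(rec)
    return {field: sorted(types) for field, types in schema.items()}

def virtual_view(records, mapping):
    """Reshape already-ingested records through a field mapping: the
    'change the schema after ingest' idea, with no re-index of the data."""
    return [{new: rec.get(old) for old, new in mapping.items()} for rec in records]

raw = [
    {"ts": "2024-01-01T00:00:00Z", "msg": "login ok", "meta": {"ip": "10.0.0.1"}},
    {"ts": "2024-01-01T00:00:05Z", "msg": "login failed", "code": 401},
]
schema = discover_schema(raw)              # fields appear as data arrives
view = virtual_view(raw, {"ts": "event_time", "msg": "message"})
print(schema)
print(view)
```

Notice the second record adds a `code` field the first record lacked; discovery simply unions it in, which is the contrast with a pipeline where the schema is fixed up front.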
Now, once you unleash things like GenAI on it, having all that data in one unified repository, in its full nature, not something that's been pared down as small as possible because you can't afford it otherwise, that's where the power really comes out. And that's where we see GenAI's rallying cry: you really can use, and you should have, all your data points. You want the schema-on-read flexibility that the lakes promised, but it was slow. Schema-on-read flexibility with schema-on-write performance: that's what we're trying to bring to the market.

And I think, again, having the GenAI on top of it, also, let's put it mildly, we've talked about it, budgets are not getting any bigger. How are you going to understand how to use the data and make the data usable? Having that GenAI as that copilot-ish thing that can help you find and see the insights in your data, because it sees relations beyond what we as humans can think of. It'll make the people who are the DBAs better at what they do, help them come up with better queries. That's where you're going with that, right?

You know, we jumped in really early on this GenAI path, and now it's everywhere, right? The idea is that we don't share data with the LLM, particularly public LLMs, but we use the LLM as a reasoning engine to help us ask intelligent questions of our database. And then you can think about it: you may say, I'm a security analyst, how do I discover failed logins over the last five years? You can have that conversation without necessarily being a security expert, or understanding SQL, or understanding Looker- or Tableau-type tools. Having that conversation to provide those answers, the LLM plus our database, it's a great combo. It's definitely a rallying cry in the industry, and it's worth looking at what these tools can do for you.
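The "LLM as a reasoning engine" pattern just described, where the model sees only metadata and produces a query while the database produces the answer, can be sketched as follows. The prompt format, schema, and stubbed model are all made up for illustration; a real deployment would replace `stub_llm` with an actual model call.

```python
def build_reasoning_prompt(question, schema):
    """Give the model only metadata (field names and types), never rows,
    and ask it to produce a query: the model reasons, the database answers."""
    fields = "\n".join(f"- {name}: {ftype}" for name, ftype in schema.items())
    return (
        "You can see this log schema (no data is included):\n"
        f"{fields}\n"
        f"Write one SQL query that answers: {question}\n"
        "Return only the query."
    )

def stub_llm(prompt):
    """Stand-in for a real LLM call; a deployment would send `prompt`
    to a public or private model here and return its completion."""
    return ("SELECT count(*) FROM logs "
            "WHERE event = 'login_failed' AND ts > now() - interval '5' year")

schema = {"ts": "timestamp", "event": "string", "user": "string"}
rows = [{"ts": "2024-01-01", "event": "login_failed", "user": "alice"}]  # stays local
prompt = build_reasoning_prompt("How many failed logins in the last five years?", schema)
query = stub_llm(prompt)  # the query runs against the local database, not the LLM
print(prompt)
print(query)
```

Because the model only ever emits a query and the database executes it, the answer is grounded in the data rather than generated, which is the "no hallucinations" point made below.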
And we do see it progressing. Our first move was using the public models in a safe way, but now it's about our ability to orchestrate between the application layer and the model innovation others are doing, while getting all this data together cost-effectively. So if you use a public model, we allow you to use it without sharing anything with that public model, and at scale. And as we see people using different private models, integrating and orchestrating that is a natural act for what we do in the backend. And that's why having both search and SQL matters: there are certain questions you need to ask that one type of database really can't solve. So through chain-of-thought and the unique prompting we do, you can ask search and SQL questions to really derive the value you're seeking. And, you know, no hallucinations, because we don't ask the LLM for the answer.

Yeah, no, that makes total sense. So, great announcement today, and things are coming. Where should people go to find out more about this?

Yeah, come to chaossearch.io. A couple of things I'd point out: you'll see a lot of information about LakeDB. Thomas mentioned the white paper; we actually took the original academic white paper about the technology, dusted it off, and put a cover sheet on it, for those that really want to go deep on what we did and how it's architecturally different. You'll also see how it's not a simple tweak of existing architectures. There's a good product white paper that goes through it as well. And then of course the Equifax case study is right on the website.

Yeah, I think that's key.
Having read a lot of these white papers myself, I think you're getting into the nuts and bolts of what's going on, especially for data teams and what we see as data developers, the rise of the data developer who's building applications on top of the data. Being able to have search, SQL and GenAI all in one place across your data is pretty powerful, to put it mildly. So I want to thank you both for coming on and sharing this with us, because again, it's a lot of fun. We're always trying to bring things to organizations that help them get a better feel for how their data can be used, and how they can be more effective and cost-effective in that use as well. So thank you very much for coming by.

Yeah, thank you. Thank you.

All right, thank you. And with the launch of LakeDB, we can't wait to hear more as this news and the technology continue to evolve. I want to thank you all for watching, and stay tuned.