Hi everybody, this is Dave Vellante. Welcome back to SuperCloud 2. Last August at the first SuperCloud event, we invited the broader community to help further define SuperCloud. We assessed its viability and identified the critical elements and deployment models of the concept. The objectives here at SuperCloud 2 are, first of all, to continue to tighten and test the concept. The second is we want to get real-world input from practitioners on the problems that they're facing and the viability of SuperCloud in terms of applying it to their business. So on the program, we've got companies like Walmart, Saks, Western Union, Ionis Pharmaceuticals, NASDAQ and others. And the third thing that we want to do is drill into the intersection of cloud and data to project what the future looks like in the context of SuperCloud. So in this segment, we want to explore the concept of data architectures and what's going to be required for SuperCloud. And I'm pleased to welcome one of our SuperCloud sponsors, ChaosSearch. Ed Walsh is the CEO of the company, with Thomas Hazel, who's the founder, CTO and chief scientist. Guys, good to see you again. Thanks for coming into our Marlboro studio. Always great. Great to be here. Okay, so there's a little debate. I'm going to put you right on the spot. There's a little debate going on in the community. It was started by Bob Muglia, a former CEO of Snowflake, and he was at Microsoft for a long time. And he looked at the SuperCloud definition and said, I think you need to tighten it up a little bit. So here's what he came up with. He said a SuperCloud is a platform that provides a programmatically consistent set of services hosted on heterogeneous cloud providers. So he's calling it a platform, not an architecture, which was kind of interesting. And so presumably the platform owner is going to be responsible for the architecture. But Dr.
Nelu Mihai, who's the computer scientist behind the Cloud of Clouds project, he chimed in and responded with the following. He said, cloud is a programming paradigm supporting the entire lifecycle of applications with data and logic natively distributed. SuperCloud is an open architecture that integrates heterogeneous clouds in an agnostic manner. So, Ed, words matter. Is this an architecture or is it a platform? Put us on the spot. So I'm sure you have your concepts. I would say it's an architecture or design principle. I look at SuperCloud as a megatrend, just like cloud, just like data analytics. And some companies are using the principle, the design principles, to literally get dramatically ahead of everyone else. I mean, things you couldn't possibly do if you didn't use cloud principles, right? So I think it's a SuperCloud effect. You're able to do things you weren't able to do. So I think it's more a design principle, but if you do it right, you get a dramatic effect as far as customer value. So the conversation that we were having with Muglia and Tristan Handy of dbt Labs, I'll set it up as follows, and Thomas, I'd love your thoughts. If you have a CRM, think about applications today. It's all about forms and codifying business processes. You type a bunch of stuff into Salesforce, all the salespeople do it, and this machine generates a forecast. What if you have this new type of data app that pulls data from the transaction system, the e-commerce, the supply chain, the partner ecosystem, et cetera, and then, without humans, actually comes up with a plan? That's their vision. And Muglia was saying, in order to do that, you need to rethink data architectures and database architecture specifically. You need to get down to the level of how the data is stored on the disk. What are your thoughts on that? Well, first of all, I'm going to cop out. I think it's actually both. I do think it's a design principle.
I think it's not open technology, but open APIs, open access. And you can build a platform on that design principle architecture. Now, I'm a database person. I love solving database problems. I wait for your launch. Yeah, so I mean, Snowflake is a database, right? It's a distributed database. And we wanted to crack those codes because multi-region, multi-cloud customers wanted access to their data. And their data is in a variety of forms, all these services that you talked about. And so what I saw as a core principle was cloud object storage. Everyone streams their data to cloud object storage. From there, we said, well, how about we rethink database architecture, rethink file format, so that we can take each one of these services and bring them together, whether distributed or centrally, such that customers can access and get answers, whether it's operational data, whether it's business data, AKA search or SQL, complex distributed joins. But we had to rethink the architecture. I'd like to say we're not a first generation or a second. We're a third-generation distributed database on pure, pure cloud object storage. No caching, no SSDs. Why? Because all of that, the availability, the cost, the time, is a struggle, and cloud object storage, we think, is the answer. When you say no caching, so I think about how companies are solving some pretty hairy problems. Take MySQL HeatWave. Everybody thought Oracle was going to just forget about MySQL. Well, they come out with HeatWave. And the way they solve problems, when you see their benchmarks against Amazon, oh, we crush everybody, is they put it all in memory. So you said no caching. You're not getting performance through caching, is that true, and how are you getting performance? So five, six years ago, when you realized that cloud object storage was going to be everywhere and it was going to be a core foundational, if you will, fabric, what would you do?
Well, a lot of times the second generation says we'll take it out of cloud storage, put it in SSDs or something and put it into cache, and that adds a lot of time, adds a lot of cost. But I said, what if, what if we could actually make the first read hot? The first read for distributed joins and searching. And so what we set out to do was say, we can't cache, because that adds time, that adds cost. We have to make cloud object storage high-performance, so it feels like a cache or an SSD. That's where our patents are, that's where our technology is, and we've spent many years working towards it. So to me, if you can crack that code, a lot of these issues we're talking about, multi-region, multi-cloud, different services... Everybody wants to send their data to the data lake, but then they move it out. We said keep it right there. You nailed it, the data gravity. So Bob's right, the data's coming in and you need to get the data from everywhere, but you need an environment where you can deal with all that different schema, all the different types of technology, but also at scale. Bob's right, you cannot use memory or SSDs to cache that. That doesn't scale, it doesn't scale cost-effectively. But what you did is you made object storage, S3 first, but object storage generally, the only persistence. By doing that, we get performance, and we should talk about it. It's literally queries over hundreds of terabytes done in seconds, done without memory caching. We have concepts of caching, but the only persistence is actually, when we're doing caching, we're just keeping another side track of things on the S3 itself. So we're actually using the object storage to be a database, which is kind of what Bob was saying. We agree, but when you started out, people thought you were crazy. And make it live. Don't think of it as an archival or temporary space. Make it live, real-time streaming, operational data. What we do is make it smart.
We see the data coming in, we uniquely index it such that you can get your use cases, whether that's search, observability, security, or back-end operational, but we don't have to have this, I don't know, static, fixed, siloed type of architecture and technologies that were traditionally built prior to SuperCloud thinking. And you don't have to move everything, essentially. You can do it wherever the data lands, whatever cloud, across the globe. You're able to bring it together. You get the cost-effectiveness because the only persistence is the cheapest persistent storage layer you can buy. But the key thing is you cracked it. We had to crack the code, right? That was the key thing. That's where the patents are. And then, well, once you do that, everything else gets easier: scaling your architecture across regions, across clouds. Now, it's a general-purpose database, as Bob was saying, but we use that database to solve a particular issue, which is around operational data, right? So we agree with Bob. Interesting. This brings me to this concept of data mesh. Zhamak Dehghani is one of our speakers. You know, we talk about data fabric, which is originally a NetApp concept; Gartner kind of co-opted it. But so the basic concept is data lives everywhere, whether it's an S3 bucket or a SQL database or a data lake, it's just a node on the data mesh. So in your view, how does this fit in with SuperCloud? And you've said that you've built essentially an enabler for that, for the data mesh. I think you're an enabler for the SuperCloud-like principles. This is a big, chewy opportunity and it requires, you know, a team approach. There's got to be an ecosystem. It's not going to be one SuperCloud to rule them all. So where does the ecosystem fit into the discussion and where do you fit into the ecosystem? Right. So we agree completely. There's not one SuperCloud effect.
We use SuperCloud principles to build our platform, and then, you know, the ecosystem is going to be built on leveraging what everyone else's secret powers are, right? So our superpower, based upon what we built, is this: if you're having any scale or cost-effective scale issues with data, machine-generated data like business, you know, observability or security data, we are your force multiplier. We will take that in singular. Just land it, simply put it in your object storage wherever it sits, and we give you uniform access to that using open API access, SQL or, you know, the Elasticsearch API. So that's what we do. That's our superpower. So I'll play it into data mesh. It's a perfect fit, we're a node on a data mesh, but I'll play out how we see the ecosystem kind of play out. We talked about it in just the last couple of days, how we see this playing out, possibly short-term. Our superpower is we deal with this data that's coming at these environments. Customers building out observability or security environments, or vendors that are designing their own SuperCloud: I do observability, the Datadogs of the world, dot, dot, dot, the Splunks of the world, dot, dot, dot, and security. So we fit in naturally. What we do is cost-effective scale: just land it anywhere in the world. We deal with ingest, and it's an order of magnitude, or two or three orders of magnitude, more cost-effective. Their customers are asking them to do the impossible. Give me fast monitoring and alerting. I want it snappy, but I want to keep two years of data. And I want it cost-effective. It doesn't work. They're good at the fast monitoring and alerting. We're good at the long-term retention. And yeah, there's some gray area between those two, but one-to-one is actually cheaper. So we would partner.
So the first ecosystem play is whoever wants to have that ability. All the data is in those same environments as the security and observability players, so they can really just, through the API, drag our data into their platform. We can make it seamless for customers. Right now, we make it helpful to customers. If you use Datadog, we make a button, easy to go from Datadog to us for logs, save you money. Same thing with Grafana. But you can also look at the ecosystem. Those same vendors, it used to be, a year ago, it was, you know, all about how can you grow? Like, it's growth at all costs. Now it's about COGS. So literally we can go into an environment, you supply what the customer wants, but we can help with COGS. And one-to-one in a partnership is better than trying to build it on your own. Thomas, you were saying, you make the first read fast. So you think about Snowflake. Everybody wants to talk about Snowflake and Databricks. So Snowflake, great, but you've got to get the data in there. All right, so can you help with that problem? I mean, we want simple in, right? And if you have to have structure, you're not simple. So the idea is that you have simple in, a data lake, a schema-on-read type philosophy, but schema-on-write type performance. And so what I wanted to do, what we have done, is have that simple lake and stream that data in real time, with those access points of search or SQL to go after whatever business case you need: security, observability, warehouse integration. But the key thing is, how do I make that click, click, click answer and do it quickly? And so what we want is that the first read has to be fast. Why? Because then you can do away with all the siloing, layers, complexity.
If your first read is not fast, you're at a disadvantage, particularly in costs. And nobody says I want less data, but everyone has to, whether they say we're going to shorten the window or we're going to use AI to choose. But in a security moment, when you don't have that answer, you're in trouble. And that's why we offer this service, this SuperCloud service, if you will, providing access, well-known search, well-known SQL type access, because if you just have one access point, you're at a disadvantage. You actually talked about Snowflake and BigQuery and all the different platforms, Databricks. That's kind of where we see phase two of the ecosystem. One is easy, the low-hanging fruit is observability and security firms. But the next one is what we do. Our superpower is dealing with this messy data whose schema is changing like night and day. Pipelines are tough and it's changing all the time, but you want these things fast and it's big data around the world. That's the next point: just use us alongside or inside one of their platforms and now you get the best of all the worlds. Our superpower is keeping this messy data as a streaming thing, not a batch thing, and allowing you to do that. So that's the second one. And then to be honest, the third one, which plays to your SuperCloud, and it also plays perfectly to the data mesh, is if you really go to the ultimate thing. What we have done is take object storage, S3, GCS and Blob Storage, and make it a database. Put, get, complex queries with big joins. So back to the original thing, Muglia teed it up perfectly, we've done that. Now imagine if that's an ecosystem. Who wouldn't want that, if it's, again, uniformly available across all the regions, across all the clouds, and it's right next to where you're building a service or a client's trying it? That's where the ecosystem goes. I think people are going to use SuperClouds for their superpowers. We're really good at this; that's the short term.
I think the Snowflakes and the Databricks are the medium term, you know, and then I think eventually it gets to, hey, listen, if you can make object storage fast, so you can just go after it with simple SQL queries or Elastic, who wouldn't want that? Yeah, I think that's where people are going to leverage it. It's not going to be one SuperCloud, it's how do I leverage the SuperClouds? Our opinion is smart object storage can be programmable, and so we agree with Bob, but we're not saying do it here, do it here. It's this core fundamental layer across regions, across clouds, that everyone has. Simple in. Right now it's hard to get data in for access, for analysis. So we said, simple in, we'll automate the entire process, give you API access across regions, across clouds. And again, how do you do a distributed join that's fast? How do you do a distributed join that doesn't cost you an arm and a leg? And how do you do it at scale? And that's where we've been focused. So prior to the cloud, object store was a niche. S3 obviously changed that. How standard is object store, essentially, across the different cloud platforms? Is that a problem for you? Is that an easy thing to solve? Let's talk about it, I mean, fundamentally. Yeah, we were tracking, but fundamentally, cloud object storage is put, get, and list. That's why it's so scalable, because it doesn't have all these other components. That complexity is where we have moved up and provide direct analytical API access. So because of its simplicity and cost and security and reliability, it can scale naturally. I mean, really, distributed object storage is easy. It's put, get, anywhere. Now, what we've done is put a layer of intelligence on it, you know, call it smart object storage, where access is simple. So whether it's multi-region, do a query across; or multi-cloud, do a query across; or hunting, searching. We've had clients doing Amazon and Google. We have some Azure, but we see Amazon and Google more.
And it's a consistent service across all of them. Just literally put your data in the bucket of choice or folder of choice, click a couple of buttons. Simply click that to say that's hot, and after that it's hot. You can see it, but we're not moving data. There's the data gravity issue. That's the other part. It's already natively flowing to these pools of object storage across different regions and clouds. We don't move it. We index it right there. We're spinning up stateless compute, back to the SuperCloud concept. But now that allows us to do all these other things, right? So it's no longer just cheap and deep object storage. It's the same, like, you have an analytic platform regardless of where you're at. You don't have to worry about that. Yeah, we deal with that. We deal with the stateless compute coming up. And make it programmable. Be able to say I want this bucket to provide these answers, right? That's really the hope, the vision, versus the complexity of building the entire stack and then connecting it all together. We said the fabric is cloud storage. We just provide the intelligence on top. Let's bring it back to the customers. And one of the things we're exploring in SuperCloud 2 is: is SuperCloud a solution looking for a problem? Is multi-cloud really a problem? I mean, you hear a lot of the vendor marketing say, oh, it's a disaster because it's all different across the clouds. And I talked to a lot of customers, even as part of SuperCloud 2. They're like, well, I solved that problem by just going mono-cloud. Well, but then you're not able to take advantage of a lot of the capabilities and the primitives, like you like Google's data or you like Microsoft's simplicity or their RPA, whatever it is. So what are customers telling you? What are the near-term problems that they're trying to solve today? And how are they thinking about the future? Listen, it's a real problem.
I think, as I said at the start, this is a megatrend just like cloud, just like data analytics are megatrends. If you're looking at those, if you're not considering using the SuperCloud principles, leveraging what you have, abstracting it out and getting the most out of that, and then building value on top, I think you're not gonna be able to keep up. In fact, there's no way you're gonna keep up with this data volume. It's a geometric challenge and you're trying to do linear things. So clients aren't necessarily asking for a SuperCloud, but they are really saying, I need to have a better mechanism to simplify this and get value across it. And how do you abstract that out to do that? And obviously, in our conversations, they're more amazed at what we're able to do and what they're able to do with our platform. Because if you think about what we've done with S3 or GCS or object storage, they can't imagine the ingest. They can't imagine how easy it is: time to glass, one minute, no matter where it lands in the world, querying this in seconds for hundreds of terabytes. People are amazed. So they're not asking for that, but they are amazed. And then when you start talking about it, if you're an enterprise person, you're building a big cloud data platform or doing data or analytics, if you're not trying to leverage the public clouds, and somehow leverage all of them, and then build on top, then I think you're missing it. So they might not be asking for it, but they're doing it. And they're looking for a lens. You mentioned all these different services. How do I bring those together quickly? You know, our viewpoint, our service, is: I have all these streams of data, so create a lens where they want to go after it via search, go after it via SQL, bring them together instantly, no ETLing out, no define this table, put it into this database.
We said, let's have this service that creates a lens across all these streams and then makes those connections. I want to take my CRM with my Google AdWords and maybe my Salesforce. How do I do analysis? Maybe I want to hunt first. Maybe I want to join. Maybe I want to add another stream to it. So our viewpoint is it's so natural to get into these lake platforms and then provide lenses to get that access. And they don't want it separate. They don't want something different here and different there. They want to be. This is our industry, right? Something new comes out, maybe virtualization came out. Oh my God, this is so great. It's going to solve all these problems. And all of a sudden it just got to be this big, more complex thing. Same thing with cloud, you know? It started out with S3 and EC2 and now, you know, there are hundreds and hundreds of different services. So it's a complex matter for a lot of people. And this creates problems for customers, especially when you've got divisions that are using different clouds. And you're saying that the solution, or a solution for part of the problem, is to really allow the data to stay in place on S3. Use that standard, super simple, but then give it what, Ed, you've called a superpower a couple of times, to make it fast, make it inexpensive and allow you to do that across clouds. I'll give you guys the last word on that. No, I think, listen, we think SuperCloud allows you to do a lot more. And for us, with data, everyone says more data, more problems, more budget issues. Everyone knows more data is better, and we show you how to do it cost-effectively at scale. And we couldn't have done it without the design principles. We're leveraging the SuperCloud to get capabilities. And because we use just the object storage, we're able to get these capabilities of ingest, scale, cost-effectiveness.
And then we built on top of that, in the end, a database, a data platform that allows you to go after everything distributed and to get one platform for analytics, no matter where it lands. That's where we think the SuperCloud concepts are perfect. That's where our clients are seeing it. We're kind of excited about it. Third-generation database, SuperCloud database, however we want to phrase it, make it simple but provide the value and make it instant. Guys, thanks so much for coming into the studio today. And really, thank you for your support of theCUBE and theCUBE community. It allows us to provide events like this and free content. I really appreciate it. Thank you. All right, this is Dave Vellante for John Furrier and theCUBE community. Thanks for being with us today. You're watching SuperCloud 2. Keep it right there for more thought-provoking discussions around the future of cloud and data.