Well, welcome to this special CUBE Conversation. I'm John Furrier with theCUBE, your host. We're here with Dremio and Isha Sharma, Director of Product Management at Dremio. We're going to talk about data, data lakes, the future of data, and how it works with cloud and the new applications. Isha, thanks for joining me.

Thank you for having me, John.

So thanks for coming. You guys are a cutting-edge startup. You've got a lot of good action going on. You're kind of on the new guard, as Andy Jassy at AWS always talks about it: the old guard incumbents versus the new breed. You guys are doing the new stuff around data lakes and also making data accessible for customers. What is that all about? Take us through it. What is Dremio?

So Dremio is a data lake service that essentially allows you to very simply run SQL queries directly on your data lake storage without having to make any of those copies that everybody is going on about all the time. So you're really able to get that fast time to value without the long process of putting in a request to your data team, making all of those copies, finally getting a very reduced scope of your data, and still having to go back to the data team every time you need a change. Dremio brings you that fast time to value with a no-copy data strategy and really provides the flexibility to keep your data in your data lake storage as the single source of truth.

The past ten years we've watched, with CUBE coverage as we've been doing this program and in the community, from the early days of Hadoop to now. We've seen the trials and tribulations of ETL and data warehousing, we've seen the starts and stops, and we've seen that the most successful formula has been store everything. And then the ease of use became a challenge. I don't want to have to hire really high-powered engineers to manage certain kinds of clusters. And now cloud comes into the mix.
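The "run SQL directly on your data lake storage" idea Isha describes can be sketched in miniature. This is not Dremio's engine, just a hedged stand-in: a query is evaluated against a file where it sits, instead of loading the data into a warehouse copy first. The file layout and column names here are invented for the example.

```python
# Toy illustration of the no-copy idea: the query scans the file in place.
import csv
import os
import tempfile

def query_in_place(path, predicate, columns):
    """Scan a CSV file where it sits and yield matching rows (no copy made)."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if predicate(row):
                yield {c: row[c] for c in columns}

# Stand-in for a file already sitting in data lake storage.
lake_dir = tempfile.mkdtemp()
path = os.path.join(lake_dir, "sales.csv")
with open(path, "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["region", "amount"])
    w.writerows([["west", "120"], ["east", "75"], ["west", "40"]])

# Conceptually: SELECT amount FROM sales WHERE region = 'west'
rows = list(query_in_place(path, lambda r: r["region"] == "west", ["amount"]))
print(rows)  # [{'amount': '120'}, {'amount': '40'}]
```

The point of the sketch is only that nothing is duplicated: the source file remains the single copy, and the "query" is work done against it on demand.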
I've got on-premises storage, but the notion of a data lake became hugely popular because the phrase meant store everything. And it meant different things to different people. And since then, teams of people have been hired to be the data teams. So it's kind of new. So I've got to ask you, what is the challenge of these data teams? What do they look like? What's the psychology going on with some of the people on these teams? What problems are they solving? Because they're becoming data-full, and the data keeps coming in; it's not stopping. Take us through what's going on with data teams.

Well, to your point, the volume, the variety of data is just growing exponentially every day. There's really no end to it, right? And companies are looking to get their hands on as much data as they possibly can. So data teams are in a position of: how do I provide access to as many users as easily as possible, that self-service experience for data? And data democratization, as great a concept as it is in theory, comes with its own challenges in terms of all of those copies that end up being created to provide the quote-unquote self-service experience. And then with all of these copies comes the cost to store all of them. You've just added a tremendous amount of complexity and delayed your time to value significantly.

You mentioned self-service. It's one of those things that seems like a moving train. Everyone I talk to says, oh, self-service is the holy grail, we've got to get to self-service. And then you get some self-service and you have to rethink it, because more stuff's changing. So I have to ask you, in that capacity, you've got data architects and you've got analysts, the customers of the data. What's the relationship between those two? Who gives and who gets, who drives it, who leans in? Do the analysts feed the requirements in? Do the architects set up the boundaries?
How does that relationship work? Can you take us through how you view the relationship between the data architect and the data analyst?

Sure. So you have the data architect, the data team, that's actually responsible for providing data access at the end of the day, right? They're the people that have the data democratization requirement on them. And so they've created these copies, a tremendous amount of copies. A lot of the time the data lake storage is the source of truth, but you're copying your data into a data warehouse. And then your end users, your analysts, all want different types of data, different views of this data. So there's a tremendous number of personalized copies that the architects end up creating. And then on top of it, there's performance. We need to get everything back in a timely manner; otherwise, what's the point, right? Real-time analytics. So there are all these performance-related copies, whether those are aggregation tables or, you know, BI extracts, cubes, all of that fun stuff. And the architect is the one responsible for creating all of those. That's what they have to do to provide access to the analysts. And then, like I was saying, when we need an update to a data set, when I discover a new data set that I need to join with an existing one, the analyst goes to the data architect and says, hey, by the way, I need this new data set. Can you make this usable for me, or can you provide me access? And then the data architect has to process that request. So, again, coming back to all these copies that have been created, the data architect goes through a tremendous amount of work, and almost has to do it over and over again, to actually make the data available to the analysts. It's this cycle that goes on between the two.

Yeah, it's a strange dynamic. It's a power dynamic, but you're also trying to get through to the innovation.
I've got to ask you, some people are saying that data copies are the major obstacle to democratization. How do you respond to that? What's your view?

They absolutely are. Data copies are the complete opposite of data democratization. There's no aspect of self-service there, which is exactly what you're looking for with data democratization. Because of those copies, how do you manage them? How do you govern them? Like I was saying, when somebody needs a new data set or an update to one, they have to go back to that data team, and there goes your self-service. Data copies actually create a bottleneck, because it all comes back to that data team, which has to keep working through the requests coming in from their analysts. So data copies and data democratization are complete opposites.

You know, I remember talking to Dave Vellante at a CUBE event years ago. He said infrastructure as code was the big DevOps movement, and we felt that DataOps would be something similar: data as code, where you didn't have to think about it. So you're kind of getting to this idea that copies are bad because they hold back that self-service. This modern era is looking for more programmability with data. What you're teasing out here is that that's the modern architecture. Is that how you see it? How do you see a modern data architecture?

Yeah, so the data architecture has evolved significantly in the last several years, right? We started with traditional data warehouses and the traditional data lake with Hadoop, where storage and compute were tightly coupled. Then we moved on to cloud data warehouses, where there was a separation of compute and storage, which provided a little more flexibility. But with the modern data architecture, with cloud data lakes, you have this aspect of separating not only storage and compute, but also compute and data.
So that creates a separate tier for data altogether. What does that look like? You have your data in your data lake storage, S3, ADLS, whatever it may be, and of course it's in an open format, right? And on top of that, thanks to technologies like Apache Iceberg and Delta Lake, there's this ability to give your files, your data, a table structure. That starts to bring the capabilities a data warehouse was providing to the data lake. Thanks to these, you have the ability to do transactions, record-level mutations, versioning, things that were missing completely from a data lake architecture before. And so introducing that data tier, having that separation of compute and data, really accelerates the ability to get to that time to value, because you're keeping your data in the data lake storage at the end of the day.

It's interesting, you see all the hot companies tend to have that kind of mindset and architecture, and it's creating new opportunities. There's a ton of white space. So I have to ask you, how does Dremio fit into this? Because you guys are playing in this new wave with data. It's growing extremely fast. Again, an edge is developing, more data is coming in at the edge. You've got hybrid, potentially multi-cloud environments on the horizon. Data in real time across multiple clouds is the next area people are focused on. What's the role of Dremio in all this? Take us through that.

Yeah, so Dremio provides, again, like I said, this data lake service, and we're not referring to just storage or Hadoop when we say data lake; we're talking about an entire solution. So you keep your data in your data lake storage. And then on top of that, with the integrations Dremio has with Apache Iceberg and Delta Lake, we provide that data tier I was talking about.
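The table-format capabilities Isha lists, transactions, record-level mutation, and versioning, can be sketched with a toy model. Real formats like Apache Iceberg track immutable data files through metadata manifests; this hedged stand-in just keeps each commit as a new snapshot of rows, which is enough to show why versioned commits give you mutation and time travel without copying the whole table.

```python
# Toy sketch of snapshot-based table semantics (not Iceberg's real design):
# every commit appends a new snapshot; old snapshots stay readable.

class ToyVersionedTable:
    def __init__(self):
        self.snapshots = [[]]          # snapshot 0: empty table

    def append(self, rows):
        new = self.snapshots[-1] + list(rows)
        self.snapshots.append(new)     # commit = new snapshot, old ones kept

    def delete_where(self, predicate):
        new = [r for r in self.snapshots[-1] if not predicate(r)]
        self.snapshots.append(new)     # record-level delete, still versioned

    def scan(self, version=-1):
        return list(self.snapshots[version])  # read latest, or time travel

t = ToyVersionedTable()
t.append([{"id": 1, "status": "new"}, {"id": 2, "status": "new"}])
t.delete_where(lambda r: r["id"] == 1)
print(t.scan())            # latest: [{'id': 2, 'status': 'new'}]
print(t.scan(version=1))   # time travel: both rows still visible
```

These are exactly the behaviors that were "missing completely from a data lake architecture before": plain files in object storage had no commit history to read from.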
And so you've given your data this table structure, and now you can operate on it like you would in a data warehouse. So there's really no need to move your data from the data lake to the data warehouse; again, you keep that data lake as the source of truth. And then when we talk about copies, personalized copies, performance-related copies, like I was saying, you've created so much complexity. With Dremio, you don't do that. When it comes to personalized copies, we've got the semantic layer, and that's a very key aspect of Dremio: you can provide as many views of the data as you want without having to make any copies. So it really accelerates that data democratization story.

So it's the no-copy strategy. Dremio, you guys are on it, but you're about no copies. Keep the semantic layer, have that be horizontal across whatever environment. And can applications tap into this? How do you guys integrate into apps? I'm an app developer, for instance. How does that work?

Of course. So that's one of the most important use cases, in the sense that when there's an application, or even a BI client or some other tool, tapping into the data in S3 or ADLS, a lot of people typically see performance degradation. With Dremio, that's not the case. We've got Arrow Flight integrated into Dremio; it's a key component as well. And that puts so much ease into running your dashboards and your analytics apps off of that, because Arrow Flight can deliver 20 times the performance that ODBC could. So coming back to the no-copy data strategy, there's no need for those local copies you used to have to make.

So one of the things I've got to ask you, because this comes up all the time, especially at last re:Invent, I noticed, again, Amazon was banging on this hard, and Azure as well on their side.
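The semantic-layer idea, many views, zero copies, reduces to this: a view is a stored query definition evaluated against the single source of truth on demand, not a materialized per-user data set. The sketch below is a hedged illustration with invented names; Dremio's actual semantic layer is of course far richer.

```python
# Views as definitions over one source, rather than per-user copies.
source = [
    {"region": "west", "amount": 120},
    {"region": "east", "amount": 75},
    {"region": "west", "amount": 40},
]

views = {}  # name -> function over the source (a definition, not a copy)

def define_view(name, fn):
    views[name] = fn

def run_view(name):
    return views[name](source)   # always evaluated against the source of truth

define_view("west_sales", lambda rows: [r for r in rows if r["region"] == "west"])
define_view("totals", lambda rows: sum(r["amount"] for r in rows))

print(run_view("west_sales"))   # the personalized view, stored nowhere
print(run_view("totals"))       # 235

# When the source changes, every view reflects it immediately -- no copies
# to refresh, no request back to the data team:
source.append({"region": "east", "amount": 65})
print(run_view("totals"))       # 300
```

The last two lines are the self-service argument in code form: an update to the source is instantly visible through every view, which is exactly what materialized copies break.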
Their whole thing is, we want to take the AI environment and make it so normal people can use it and deploy machine learning. The same thing comes down to this layer. What you're talking about with democratization is a huge trend, because you don't have to be a high-powered-math PhD data scientist or an ETL expert or a data wrangler. You just want to actually work with the data, play around with it any way you want. So the question I have is, that's certainly a great trend, and no one debates it. But the reality is people are storing data, almost hoarding it: just throw it in a data lake and we'll deal with it later. How do you guys solve that problem? Because once that starts happening, do you have to hire someone super smart to dig that out or re-architect it? That seems to be the pattern, right? Throw everything into a data lake and we'll deal with it later. They call it a data swamp, and no one knows what's going on.

Of course. So you don't actually want to throw everything into a data lake. There still needs to be a certain amount of structure that all of this lands in. You want it to live in one place, but still with a little bit of structure, so that Dremio and other services are much better able to query it with fantastic performance. So there's still some amount of structure that needs to happen at the data lake level, but with that semantic layer we have in Dremio, you're creating structure for your end user.

How would you advise someone who wants to hedge their future and not take on too much technical debt, who says, hey, you know, I do have to store it? Is there a best practice, some guardrails, around getting going? How do you advise your customers who want to get going?

So how we advise our customers is, again, put your data in that data lake. A lot of them already have S3 or ADLS in place, and getting started with Dremio is really easy.
I would say I did it for the first time and it took a matter of minutes, if not less. And so what you're doing with Dremio is connecting directly to that data source and then creating a semantic layer on top. So you bring together a bunch of data sitting in your data lake, say it's sales data, and we give you a really streamlined way to go back in time and create a view on top of all of that. If you have that structure in folders, that's great. We provide you a way to create one view on top of all of it, as opposed to having a view for every day or whatnot. And so again, that semantic layer really comes in handy when, as the architect, you're trying to provide access to this data lake. And then for the user, who just interacts with the data through the views provided to them, there's a whole lot of transparency there. It's really easy to get up and running with Dremio.

I'm looking forward to it. I've got to finally ask: how do I get started? How do people engage with you guys? Is it freemium? Is it a cloud service? What are the requirements? What are some of the ways people can engage and work with you?

Yeah, so you get started on our website at dremio.com. And speaking of self-service, we've got a virtual lab at dremio.com/labs that you can get started with. It gives you a product tour; it even gives you a getting-started walkthrough that takes you through your first query so you can see how well Dremio works. And in addition to that, we've got a free trial of Dremio available on AWS Marketplace.

Awesome. That marketplace is a good place to download stuff. So okay, I'll ask you a personal question, Isha. You're the Director of Product Management. You get to see inside the kitchen where everyone's making the product. You've also got the customer relationships out there. You're looking at product-market fit. As it evolves, customers' requirements evolve.
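The "one view on top of all of that, as opposed to a view for every day" point can be sketched too. In this hedged stand-in (file names and layout invented for the example), a single definition unions every daily file under a folder at query time, so new days show up without anyone creating another view.

```python
# One logical view over many daily files, resolved when queried.
import csv
import glob
import os
import tempfile

# Stand-in for a data lake folder holding one small file per day.
lake = tempfile.mkdtemp()
for day, amount in [("2021-01-01", 10), ("2021-01-02", 20), ("2021-01-03", 30)]:
    with open(os.path.join(lake, f"sales-{day}.csv"), "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["day", "amount"])
        w.writerow([day, amount])

def sales_view(root):
    """A single view definition that unions every matching daily file."""
    rows = []
    for path in sorted(glob.glob(os.path.join(root, "sales-*.csv"))):
        with open(path, newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows

rows = sales_view(lake)
print(len(rows))                            # 3 -- every day, one view
print(sum(int(r["amount"]) for r in rows))  # 60
```

Dropping a fourth daily file into the folder would be picked up on the next call, which is the folder-structure convenience being described.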
What are some of the cool things you've seen in this space that are just interesting to you, things you either expected or maybe some surprises? What's the coolest thing you've seen come out of this new data environment we're living in?

I think just the way things have evolved, right? It used to be data lake or data warehouse, and you'd pick one. You probably had both, but you weren't leveraging either to its highest potential. Now you've got this coming together of both of them. I think it's been fantastic to see how technologies like Iceberg and Delta Lake are bringing those two things together. You're in your data lake, and it's great in terms of cost and storage and all of that, but now you have so much flexibility in terms of some of those data warehouse capabilities. And on top of that, with technologies like Dremio, and in general this open-format concept, you're never locked in with a particular vendor or a particular format. You're not locking yourself out of a technology that you don't even know exists yet. In the past, you always ended up putting your data in something where it was going to be difficult to get it out. But now you have so much flexibility with the open architecture that's coming.

What's the DNA of the culture at Dremio? You're on the cutting edge, you're in a big hot wave, data, and you're enabling a lot of value. What's it like there at Dremio? What do you strive for? What's the purpose? What's the DNA of the culture?

There's a lot of excitement about getting customers to this flexibility, getting them out of the things they're locked into, and providing them with accessibility to their data, right? Making this data access, data democratization concept actually happen. So that time to value is a key thing.
You want to derive insights from your data, and everybody at Dremio is super excited and charging toward that.

Unlocking that value, that's awesome. Isha, thank you for coming on this CUBE Conversation. Great to see you. Thanks for coming on; appreciate it.

Thank you.

Isha Sharma, Director of Product Management at Dremio, here inside theCUBE. I'm John Furrier, your host. Thanks for watching.