Hey everyone, welcome back to Las Vegas. It's theCUBE live at AWS re:Invent 2022. This is our fourth day of coverage. Lisa Martin here with Paul Gillin. Paul, we started Monday night. We filmed and streamed for about three hours. We've had jam-packed days Tuesday, Wednesday, Thursday. What's your takeaway? We're rounding the final turn as we head into the home stretch.

This show, as it has been since the beginning, has a lot of energy. I'm amazed, for the fourth day of a conference, at how many people are still here.

I am too.

And how active they are, how full the sessions are. Huge crowd for the keynote this morning. You don't see that at most day-four conferences. Everyone's on their way home. So people come here to learn, and they're still learning.

They are still learning, and we're going to help continue that learning path. We have an alumni back with us. Tomer Shiran joins us, the CPO and co-founder of Dremio. Tomer, it's great to have you back on the program.

Yeah, thanks for having me here, and thanks for keeping the best session for the fourth day.

You're right, I like that. That's good mojo to come into this interview with, Tomer. The last time I saw you was a year ago, here in Vegas at re:Invent '21. We talked about the growth of data lakes and data lake houses, and about the need for open data architectures as opposed to data warehouses. The headline of the SiliconANGLE article on the interview we did with you was "Dremio predicts 2022 will be the year open data architectures replace the data warehouse." We're almost done with 2022. Has that prediction come true?

Yeah, I think we're seeing almost every company out there, certainly in the enterprise, adopting data lake and data lake house technology and embracing open source file and table formats. So I think that's definitely happening. Of course, nothing goes away. Data warehouses don't go away in a year, and actually they don't go away ever.
We still have mainframes around. But certainly the trends are all pointing in that direction.

Describe the data lake house for anybody who may not be familiar with it, and what it really means for organizations.

I think you can think of the data lake house as the evolution of the data lake. For the last decade, we've had these two options: data lakes and data warehouses. Warehouses had good SQL support and good performance, but you had to spend a lot of time and effort getting data into them, you got locked in, and they were very, very expensive. That's a big problem now. Data lakes were more open and more scalable, but had all sorts of limitations. What we've done now as an industry with the lake house, and especially with technologies like Apache Iceberg, is unlock all the capabilities of the warehouse directly on object storage like S3. So you can insert, update, and delete individual records. You can do transactions. You can do all the things you could do with a database, directly in open formats, without getting locked in, at a much lower cost.

But you're still dealing with semi-structured data as opposed to structured data, and there's work that has to be done to get that into a usable form. That's where Dremio excels. What has been happening in that area? Is it formats like JSON that are enabling this to happen? How are we advancing the cause of making semi-structured data usable?

Well, first of all, I think that's all changed. That was maybe true for the original data lakes, but now, with the lake house, our bread and butter is actually structured data. It's all tables with schemas: you can create a table, insert records. Really, everything you can do with a data warehouse you can now do in the lake house.
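Tomer's point about record-level inserts, updates, and deletes on immutable object storage is worth unpacking: table formats like Apache Iceberg achieve it by layering immutable snapshots over data files. Below is a minimal pure-Python sketch of that copy-on-write idea; all class and method names here are invented for illustration and are not Iceberg's actual API.

```python
# Illustrative sketch of how snapshot-based table formats (e.g. Apache
# Iceberg) support record-level deletes on immutable object storage.
# Names are invented for illustration -- this is not the Iceberg API.

class SnapshotTable:
    def __init__(self):
        self.files = {}        # file_id -> immutable tuple of rows
        self.snapshots = []    # each snapshot is a list of file_ids
        self.next_file_id = 0

    def _add_file(self, rows):
        fid = self.next_file_id
        self.next_file_id += 1
        self.files[fid] = tuple(rows)   # data files are never mutated
        return fid

    def append(self, rows):
        current = self.snapshots[-1] if self.snapshots else []
        self.snapshots.append(current + [self._add_file(rows)])

    def delete_where(self, predicate):
        # Copy-on-write: rewrite only the files containing matching rows,
        # then commit a new snapshot pointing at the new file list.
        current = self.snapshots[-1] if self.snapshots else []
        new_files = []
        for fid in current:
            rows = self.files[fid]
            kept = [r for r in rows if not predicate(r)]
            if len(kept) == len(rows):
                new_files.append(fid)              # untouched file is reused
            elif kept:
                new_files.append(self._add_file(kept))
        self.snapshots.append(new_files)

    def scan(self, snapshot=-1):
        return [r for fid in self.snapshots[snapshot] for r in self.files[fid]]

table = SnapshotTable()
table.append([{"id": 1, "city": "SF"}, {"id": 2, "city": "NY"}])
table.delete_where(lambda r: r["id"] == 1)
print(table.scan())              # latest snapshot: row 1 is gone
print(table.scan(snapshot=0))    # time travel: the old snapshot still has it
```

Because a delete commits a new snapshot rather than mutating files in place, time travel and transactional isolation come almost for free, which is part of why these formats can offer warehouse-style DML on plain object storage.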
Now, that's not to say that there aren't very advanced capabilities when it comes to JSON and nested, sparse data. We excel at that as well. But we're really seeing the lake house take over the bread-and-butter data warehouse use cases.

You mentioned open a minute ago. Talk about why open is important and the value it can deliver for customers.

Well, if you look back in time and see all the challenges that companies have had with traditional data architectures, a lot of that comes from the problems with data warehouses: the fact that they're very expensive, that you have to ingest data into the warehouse in order to query it. And then it's almost impossible to get off of these systems. It takes an enormous effort and tremendous cost to get off of them, so you're locked in. That's a big problem. You're also dependent on that one data warehouse vendor: you can only do things with that data that the warehouse vendor supports. Contrast that with data lake houses and open architectures, where the data is stored in entirely open formats, things like Parquet files and Apache Iceberg tables. That means you can use any engine on that data. You can use Dremio's SQL query engine. You can use Spark. You can use Flink. There are a dozen different engines you can use on it, both at the same time and in the future. If you ever want to try something new that comes out, some new open source innovation, some new startup, you just point it at the same data. So the data is now at the core, at the center of the architecture, as opposed to some vendor's logo.

Amazon seems to have bought into the lake house concept. It made big announcements on day two about eliminating the ETL stage between RDS and Redshift. Do you see the cloud vendors pushing this concept forward?

Yeah, 100%.
I mean, Amazon's a great partner of ours. We work with probably ten different teams there, everything from the S3 team to the Glue team to the QuickSight team, and everything in between. Their embrace of the lake house architecture, the fact that they adopted Iceberg as their primary table format, I think that's exciting. As an industry, we're all coming together around standard ways to represent data, so that at the end of the day companies have the benefit of having their own data in their own S3 account, in open formats, and being able to use all these different engines without losing any of the functionality they need. All those interactions with data that in the past you would have had to move the data into a database or warehouse to do, you just don't have to do that anymore.

Speaking of functionality, talk about what's new this year with Dremio since we've seen you last.

There's a lot that's new with Dremio. We now have full Apache Iceberg support with DML commands. You can do inserts, updates, deletes, COPY INTO; all of that is now a fully supported, native part of the platform. We now offer two flavors of Dremio. There's Dremio Cloud, our SaaS version, fully hosted: you sign up with your Google or Azure account and you're up and running in a minute. And then there's Dremio Software, which you can self-host, usually in the cloud, but even outside of the cloud. We're also very excited about this new idea of data as code. We've introduced a new product, now in preview, called Dremio Arctic. The idea there is to bring the concepts of Git and GitHub to the world of data: things like being able to create a branch and work in isolation.
If you're a data scientist, you want to experiment on your own without impacting other people. Or if you're a data engineer ingesting data, you want to transform it and test it before you expose it to others. You can do that in a branch. All these ideas that we take for granted now in the world of source code and software development, we're bringing to the world of data with Dremio Arctic.

And when you think about data mesh, a lot of people are talking about data mesh now and wanting to take advantage of those concepts and ideas, like thinking of data as a product. Well, if you think about data as a product, we think you have to manage it like code. That's why we call it data as code. All the reasons we use things like GitHub to build products: if we want to think of data as a product, we need all those capabilities with data as well. The ability to go back in time, the ability to undo mistakes, to see who changed my data and when they changed that table. All of those are part of this new catalog that we've created.

You talk about data as a product; that's intrinsic to the data mesh concept. What's your opinion of data mesh? Is the world ready for that radically different approach to data ownership?

We now have dozens of customers using Dremio to implement enterprise-wide data mesh solutions. And at the end of the day, I think it's just what most people would consider common sense. In a large organization, it's very hard for a single centralized team to understand every piece of data, to manage all the data themselves, to make sure the quality is correct, and to make it accessible. So what data mesh is first and foremost about is being able to federate or distribute the ownership of data, the governance of the data. It still has to happen, right?
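The branch-and-merge workflow Tomer describes for Dremio Arctic can be illustrated with a toy catalog whose table pointers behave like Git refs: a branch is just another pointer, so experiments never touch main until they're merged. This is only a conceptual sketch under invented names, not the actual Arctic API.

```python
# Toy sketch of "data as code": a catalog whose table pointers can be
# branched and merged like Git refs. All names are illustrative; this
# is not the Dremio Arctic API.

class DataCatalog:
    def __init__(self):
        # Each branch maps table name -> an immutable version of its rows.
        self.branches = {"main": {}}
        self.history = []   # (branch, table, version) commit log

    def commit(self, branch, table, rows):
        tables = dict(self.branches[branch])     # copy-on-write of the ref
        tables[table] = tuple(rows)
        self.branches[branch] = tables
        self.history.append((branch, table, tables[table]))

    def create_branch(self, name, source="main"):
        # A branch is just a new pointer to the same table versions.
        self.branches[name] = dict(self.branches[source])

    def merge(self, source, target="main"):
        # Fast-forward merge: the target adopts the source's versions.
        self.branches[target].update(self.branches[source])

    def read(self, table, branch="main"):
        return list(self.branches[branch].get(table, ()))

cat = DataCatalog()
cat.commit("main", "orders", [{"id": 1, "total": 10}])

# An engineer experiments on a branch without touching main...
cat.create_branch("etl-test")
cat.commit("etl-test", "orders",
           [{"id": 1, "total": 10}, {"id": 2, "total": 25}])
assert len(cat.read("orders")) == 1      # main is unaffected

# ...and publishes only once the data checks out.
cat.merge("etl-test")
assert len(cat.read("orders")) == 2
```

The commit log also gives the audit trail Tomer mentions (who changed which table, and when) essentially for free, since every change is a recorded commit rather than an in-place mutation.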
And so that, I think, is at the heart of the data mesh: allowing different teams, different domains, to own their own data and really manage it like a product, with all the best practices we have for that. Super important. So we're doing a lot with data mesh: the way Dremio Cloud has multiple projects, the way Dremio Arctic allows you to have multiple catalogs so different groups can interact and share data with each other. And the fact that we can connect to all these different data sources even outside your data lake, Redshift, Oracle, SQL Server, all the different databases that are out there, and join across different databases in addition to your data lake, that's all stuff companies want with their data mesh.

What are some of your favorite customer stories where you've really helped them accelerate that data mesh and drive business value from it, so that more people in the organization have access to data and can really make the data-driven decisions everybody wants to make?

There are so many of them. One of the largest tech companies in the world is creating a data mesh across all the different departments in the company. They were a big data warehouse user, and they hit the wall: the costs were so high, and the ability for people to use it for experimentation, to try new things out, to collaborate, they couldn't do it, because it was so prohibitively expensive and difficult to use. So they said, we need a platform where different people can collaborate, experiment with the data, and share data with others. At a big organization like that, it's their ability to have a centralized platform but allow different groups to manage their own data. Several of the largest banks in the world are also doing data meshes with Dremio.
One of them has over a dozen different business units using Dremio, and that ability to have thousands of people on a platform, and to collaborate and share with each other, is super important to them.

Can you contrast your approach to the market with Snowflake's? Because they have some of those same concepts.

Snowflake is a very closed system at the end of the day. Closed and very expensive. I remember seeing, a quarter or so ago, in one of their earnings reports, that the average customer spends 70% more every year. Well, that's not sustainable. If you compound that over a decade, your cost increases roughly 200x. Most companies aren't going to be able to swallow that. So first of all, companies need more cost-efficient solutions that are just more approachable. And the second thing is the open data architecture we talked about. I think most companies now realize that if you want to build a platform for the future, you need to have the data in open formats and not be locked into one vendor. So that's another important aspect. Beyond that, there's Dremio's ability to connect to all your data, even outside the lake: your different databases, NoSQL databases, relational databases. And there's Dremio's semantic layer, where we can accelerate queries. What typically happens with data warehouses and other data lake query engines is that, because you can't get the performance you want, you end up creating lots and lots of copies of data. For every use case, you create a pre-joined copy of the data, a pre-aggregated version of the data, and then you have to redirect all your queries to those individual copies. Well, that's expensive. It's also hard to secure, because permissions don't travel with the data. You have all sorts of problems with that.
What we've done, with our semantic layer, is make it easy to expose data in a logical way, and then our query acceleration technology, which we call Reflections, transparently accelerates queries and gives you sub-second response times without data copies, and also without extracts into the BI tools. Because if you start doing BI extracts or imports, again you have lots of copies of data in the organization, all sorts of refresh problems, security problems; it's a nightmare. Collapsing all those copies and having a simple solution, where data is stored in open formats and we can give you fast access to any of it, is very different from what you get with a Snowflake or any of these other companies.

That's a great explanation. I want to ask you: early this year, you announced that the basic Dremio Cloud service would be free forever. How has that offer gone over? What's been the uptake?

Yeah, it's gone well. Thousands of people have signed up, and I think it's a great service. It's very, very simple. People can go on the website and try it out. We now have a test drive as well, if you want to get started with some public sample data sets and a tutorial; we've made that increasingly easy. But yeah, we continue to take that approach of making it easy, democratizing these cloud data platforms, and lowering the barriers to adoption.

How effective has it been in driving sales of the enterprise version?

A lot of the business that we do, when it comes to selling, is with folks who have educated themselves. They started off, they followed some tutorials. I think developers generally prefer the first interaction to be with a product, not with a salesperson. And that's basically the reason we did it.
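The acceleration idea described above, answering queries from a pre-computed summary transparently instead of making users point BI tools at extracts, can be sketched as a planner that serves an aggregate query from a matching summary when one exists and falls back to a scan otherwise. This is a conceptual illustration with invented names, not how Dremio actually implements Reflections.

```python
# Conceptual sketch of transparent query acceleration: if a matching
# pre-aggregated summary ("reflection") exists, answer from it; otherwise
# fall back to scanning the raw rows. Users always query the base table,
# so there is no separate extract to manage. Names are illustrative.

from collections import defaultdict

class AcceleratedTable:
    def __init__(self, rows):
        self.rows = rows
        self.reflections = {}   # (group key, measure) -> {value: sum}

    def build_reflection(self, key, measure):
        # Pre-compute the aggregate once; the system, not the user,
        # owns and refreshes this copy.
        agg = defaultdict(float)
        for r in self.rows:
            agg[r[key]] += r[measure]
        self.reflections[(key, measure)] = dict(agg)

    def sum_by(self, key, measure):
        # The rewrite is transparent: same query, faster plan if possible.
        hit = self.reflections.get((key, measure))
        if hit is not None:
            return hit                      # served from the reflection
        agg = defaultdict(float)            # otherwise, full scan
        for r in self.rows:
            agg[r[key]] += r[measure]
        return dict(agg)

sales = AcceleratedTable([
    {"region": "east", "amount": 10.0},
    {"region": "west", "amount": 5.0},
    {"region": "east", "amount": 7.5},
])
print(sales.sum_by("region", "amount"))   # full scan: {'east': 17.5, 'west': 5.0}
sales.build_reflection("region", "amount")
print(sales.sum_by("region", "amount"))   # same answer, now from the reflection
```

The key design point is that the query interface never changes: whether a reflection exists or not, the caller asks the same question of the same logical table, which is what avoids the copy sprawl and permission drift Tomer describes with manual extracts.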
Before I ask you the last question, can you give us a sneak peek into the product roadmap as we enter 2023? What should we be paying attention to where Dremio is concerned?

Yeah, a couple of days ago, here at the conference, we had a press release with all sorts of new capabilities we just released, and there's a lot more coming this year. We'll shortly be releasing a variety of performance enhancements; in the next quarter or two, we'll probably be twice as fast just in terms of raw query speed. That's in addition to our Reflections and query acceleration. Support for all the major clouds is coming. Just a lot of capabilities that make it easier and easier to use the platform.

Awesome. Tomer, thank you so much for joining us. My last question for you: if you had a billboard in your desired location, and it was going to be a real mic drop about why customers should be looking at Dremio, what would that billboard say?

Well, Dremio is the easy and open data lake house, and open architectures are just a lot better: a lot more future-proof, a lot easier, and just a much safer choice for companies. So I'd urge people to take a look.

Exactly.

Maybe that wasn't the best billboard, but that's okay.

I think it's a great billboard. Awesome. Tomer, thank you so much for joining Paul and me on the program and sharing what's new and some of the exciting things coming down the pipe quite soon. We're going to be keeping our eye on Dremio.

Always happy to be here. Thank you.

All right, for our guest and for Paul Gillin, I'm Lisa Martin. You're watching theCUBE, the leader in live and emerging tech coverage.