Welcome to Las Vegas. It's theCUBE live at AWS re:Invent 22. Lisa Martin here with Dave Vellante. Dave, it is not only great to be back, but this re:Invent seems to be bigger than last year for sure.

Oh, definitely. I'd say it's double last year. I'd say it's comparable to 2019, maybe even a little bigger. I've heard it's the largest re:Invent ever.

And we're going to talk data, one of our favorite topics. We're going to talk data, data products. We have some great guests. One of them is an alumnus who's back with us: Justin Borgman, the CEO of Starburst. And Ashwin Patil also joins us, Principal, AI and Data Engineering at Deloitte. Guys, welcome to the program.

Thank you.

Thank you.

Justin, define data products. Give us the scoop on what's going on with Starburst, but define data products and the value in it for organizations of productizing data.

So data products are curated data sets that are able to span across multiple data sets. And I think that's what makes it particularly unique: you can span across multiple data sources to create federated data products that allow you to really bring together the business value that you're seeking. And I think ultimately what's driving the interest in data products is a desire to facilitate self-service consumption within the enterprise. I think that's the holy grail that we've all been building towards. And data products represent a framework for how you would do that.

So monetization is not necessarily a criterion?

Not necessarily, but it could be. It could be internal data products or external data products. And in either case, it's really intended to facilitate easier discovery and consumption of data.

Ashwin, bringing you into the conversation, talk about some of the revenue drivers that data products can help organizations to unlock.

Sure. Like Justin said, there are internal and external revenue drivers.
So internally, a lot of clients are focused on, hey, how do I make the most out of my modernization platform? So a lot of them are thinking about what AI, what analytics can they run to drive consumption? And when you think about consumption, consumption typically requires data from across the enterprise, right? And data from the enterprise is sometimes fragmented in pieces and places. So we've gone from data being in too many places to now data products helping bring all of that together and really help drive business decisions faster with more data and more accuracy, right? Externally, a lot of that has got to do with how the ecosystems are evolving for data products that use not only company data, but also the ecosystem data that includes customers, that includes suppliers and vendors.

I mean, conceptually, data products, you could say, have been around for a long time. When I think of financial services, I think that's always been a data product in a sense. But suddenly, there's a lot more conversation about it. There's data mesh, there's data fabric. We could talk about that too, but why do you think now it's coming to the fore again?

Yeah, I mean, I think it's because historically, there's always been this disconnect between the people that understand data infrastructure and the people who know the right questions to ask of the data. Generally, these have been two very distinct groups. And so the interest in data mesh, as you mentioned, and data products as a foundational element of it is really centered around: how do we bring these groups together? How do we get the people who know the data the best to participate in the process of creating data to be consumed? Ultimately, again, trying to facilitate greater self-service consumption. And I think that's the real beauty behind it. And I think increasingly in today's world, people are realizing that data will always be decentralized to some degree.
That notion of bringing everything together into one single database has never really been successfully achieved and is probably even further from the truth at this point in time, given you've got data on-prem, in multiple clouds, and in multiple different systems. And so data products and data mesh represent, again, a framework for you to think about data that lives everywhere.

We did a session this summer with Justin and me and some others on the data lies. And that was one of the good ol' lies, right? That there's a single source of truth. Right. And you know, the old adage, we've probably never been further from the single source of truth, but actually you're suggesting that there are maybe multiple truths that the same data can support. Is that the right way to think about it?

Yeah, exactly. And I think ultimately you want a single point of access that gives you, at your fingertips, everything that your organization knows about its business today. And that's really what data products aim to do: curate that for you and provide high-quality data sets that you can trust, that you can now self-service to answer your business questions.

One of the things that, oh, go ahead.

No, I was just going to say, I mean, if you pivot it from the way the usage of data has changed, right? Traditionally, IT has been in the business of providing data to the business users. Today, with more self-service being driven, we want business users to be the drivers of consumption, right? So if you take that backwards one step, it's basically saying: what data do I need to support my business needs, such that IT doesn't always have to get involved in providing that data or providing the reports on top of that data? So the data products concept, I think, supports that thinking of business-led, technology-enabled or IT-enabled really well.

Business-led.
One of the things that Adam Selipsky talked with John Furrier about just a week or so ago in their pre-re:Invent interview was the role of the data analyst going away, that everybody in an organization, regardless of function, will eventually be able to be a data analyst and need to evaluate and analyze data for their roles. Talk about data products as a facilitator of that democratization.

Yeah, we are seeing more and more the concept of citizen data scientists. We are seeing more and more citizen AI. What we are seeing is a general trend as we move towards self-service. There's going to be a need for business users to be able to access data when they want, how they want, and merge data across the enterprise in ways that they haven't done before. Technology today, through products like data products, provides you the access to do that. And that's why we're going to see this movement of people becoming more and more self-service oriented, where you're going to democratize the use of AI and analytics into the business users.

When you talk to a data analyst, by the way, about that, he or she will be like, yeah, maybe, good luck with that. So do you think maybe there's sort of an interim step? Because we've had these highly, Zhamak lays this out very well, we've had these highly centralized, highly specialized teams, the premise being, oh, that's less expensive. Perhaps data analyst-like functions get put into the line of business. Do you see that as a bridge or a stepping stone? Because it feels like it's quite a distance between what a data analyst does today and this nirvana that we talk about. What are your thoughts on that?

Yeah, I mean, I think there's possibly a new role around a data product manager. Much the way you have product managers in the products you actually build to sell, you might need data product managers to help facilitate and curate the high-quality data products that others can consume.
And I think that becomes an interesting and important skill set. Much the way that data scientist was created as an occupation, if you will, maybe 10 years ago, when previously those were statisticians or other names.

Right. You know, a big risk that many clients are seeing around data products is how do you drive governance? And to the point that Justin's making, we are going to see that role evolve, where governance in a world where data products are getting democratized is going to become increasingly important in terms of how are data products being generated? How is the propensity of data products towards a more governed environment being managed? And that's going to continue to play an important role as data products evolve.

Okay, so how do you guys fit? Because, you know, you take Zhamak's four principles, with domain ownership and data as a product, and that creates two problems. Governance, right? How do you automate? Self-service infrastructure and automated governance. Yep. Tell us what role Starburst plays in solving those, you know, all of those, but the latter two in particular.

Yeah, well, we're working on all four of those dimensions to some degree, but I think ultimately where we're focused today is the governance piece, providing fine-grained access controls, which is so important. If you're going to have a single point of access, you better have a way of controlling who has access to what. But secondly, data products allow you to really abstract away or decouple where the data is stored from the business meaning of the data. And I think that's what's so key here: if we're going to ultimately democratize data, as we've talked about, we need to change the conversation from a very storage-centric world, like, oh, that table lives in this system or that system or that system, and make it much more about the data and the value that it represents. And I think that's what data products aim to do.

What about data fabric?
I have to say, I'm confused by data fabric. I read this, I feel like Gartner just threw it in there to muck it up and say, no, no, we get to make up the terms. But I've read data mesh versus data fabric. Is data fabric just more sort of the physical infrastructure and data mesh more of an organizational construct, or how do you see it?

Yeah, I'm happy to take that. So, I mean, to me, it's a little bit of potato, potahto. I think there are some subtle differences. Data fabric is a little bit more about data movement, whereas I think data mesh is a little bit more about accessing the data where it lies. But they're both trying to solve a similar problem, which is that we have data in a wide variety of different data sets, and for us to actually analyze it, we need to have a single view.

Because the Gartner hype cycle says data mesh is DOA, which I think is complete BS. I think it's real. You talk to customers that are doing it, they're doing it on AWS, they're trying to extend it across clouds. I mean, it's a real trend. I mean, anyway, that's how I see it.

See, I feel the word data fabric many a time gets misused, because when you think about the digitization movement that started almost a decade ago, many companies tried to digitize or create digital twins of their systems in the data world. So everything has an underlying data fabric that replicates what's happening transactionally or otherwise in the real world. What data mesh does is create structure that works complementary to the data fabric, which then lends itself to data products. So to me, data products become a medium which drives the connection between data mesh and data fabric into the real world for usage and consumption.

You should write for Gartner. That's the best explanation I've heard. That made sense. That really did. That was excellent.
So when we think about it, any company these days has to be a data company, whether it's your grocery store, a gas station, a car dealer. What can companies do to start productizing their data so that they can actually unlock new revenue streams, new routes to market? What are some steps and recommendations that you have? Justin, we'll start with you.

Sure, I would say the first thing is find data that is ultimately valuable to the consumers within your business and create a product of it. And the way that you do that at Starburst is we allow you to essentially create a view of your data that can span multiple data sources. So again, we're decoupling where the data lives. That might be a table that lives in a traditional data warehouse, a table that lives in an operational system like Mongo, a table that lives in a data lake. And you can actually join those together and represent it as a view and now make it easily consumable. And so the end user doesn't need to know: did that live in a data warehouse, an operational database, or a data lake? I'm just accessing that. And I think that's a great, easy way to start in your journey. Because I think if you absorb all the elements of data mesh at once, it can feel overwhelming. And I think that's a great way to start.

Irrespective of physical location. Yes. Right? Precisely. So you can use your hybrid cloud, you name it.

See, and then you think about the broader landscape, right? Traditionally, companies have only looked at internal data as a way of driving business decisions. More and more, as things evolve into industry clouds or ecosystem data, companies start going beyond their four walls in terms of the data that they manage or the data that they use to make decisions. I think data products are going to play a more and more important part in that construct, where you don't govern all the data; rather, entities within that ecosystem will govern parts of their data.
But that data lives together in the form of data products that are governed somewhat centrally. I mean, kind of like a blockchain system, but not really.

Justin, for our folks here as we kind of wrap this segment: what's the bumper sticker for Starburst and how you're helping organizations to really be able to build data products that add value to their organization?

I would say analytics anywhere. You know, our core ethos is we want to give you the ability to access data wherever it lives and understand your business holistically. And, you know, our query engine allows you to do that from a query perspective, and data products allow you to bring that up a level and make it consumable.

Make it consumable. Ashwin, last question for you. Here we are, day one of re:Invent. Loads of people behind us. Tomorrow all the great keynotes kick off. What are you hoping to take away from re:Invent 22?

Well, I'm hoping to understand how all of these different entities that are represented here connect with each other, right? And to me, Starburst is an important player in terms of how do you drive connectivity? And to me, as we help clients from a Deloitte perspective drive that business value, connectivity across all of the technology players is an extremely important part. So integration across those technology players is what I'm trying to get from re:Invent here.

So you guys are the dot connectors, right?

Exactly.

Excellent. Guys, thank you so much for joining Dave and me on the program tonight. We appreciate your insights, your time, and probably the best explanation of data fabric versus data mesh and data products that we've maybe ever had on the show. We appreciate your time.

Thank you.

Thank you guys.

All right, for our guests and Dave Vellante, I'm Lisa Martin. You're watching theCUBE, the leader in enterprise and emerging tech coverage.
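
Editor's illustration, not part of the conversation: Justin describes a data product as a view that joins tables living in different systems, so the consumer never needs to know where each table is stored. The sketch below only mimics that idea with Python's built-in sqlite3 module. It is not Starburst's API or query engine. The two in-memory databases stand in for a warehouse and a data lake, and every table, column, and view name here is a hypothetical chosen for the example.

```python
import sqlite3


def build_data_product():
    """Create two separate stores and one view that spans both."""
    # Hypothetical "warehouse": the main in-memory database.
    conn = sqlite3.connect(":memory:")
    # Hypothetical second source (a stand-in for a data lake), attached
    # as its own independent in-memory database named "lake".
    conn.execute("ATTACH DATABASE ':memory:' AS lake")

    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO lake.orders VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )

    # The "data product": a view joining both sources. It is a TEMP view
    # because SQLite only lets temporary views reference attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW customer_revenue AS
        SELECT c.name AS customer, SUM(o.amount) AS revenue
        FROM main.customers AS c
        JOIN lake.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        """
    )
    return conn


conn = build_data_product()
# The consumer queries the view by its business name only; nothing in this
# query says which store holds customers and which holds orders.
rows = conn.execute(
    "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
).fetchall()
print(rows)
```

The point of the design is the last query: storage locations appear only inside the view definition, which matches the decoupling of where data lives from what it means that is discussed above.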