From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante.

Data Mesh is a new way of thinking about how to use data to create organizational value. Leading-edge practitioners are beginning to implement Data Mesh in earnest. And importantly, Data Mesh is not a single tool or a rigid reference architecture, if you will. Rather, it's an architectural and organizational model designed to address the shortcomings of decades of data challenges and failures, many of which we've talked about on theCUBE. Just as important, it's a new way to think about how to leverage data at scale across an organization and across ecosystems. Data Mesh, in our view, will become the defining paradigm for the next generation of data excellence. Hello and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this Breaking Analysis, we welcome the founder and creator of Data Mesh: author, thought leader, and technologist Zhamak Dehghani. Zhamak, thank you for joining us today. Good to see you.

Hi, Dave, it's great to be here.

All right, real quick, let's talk about what we're going to cover. I'll introduce, or reintroduce, you to Zhamak. She joined us earlier this year in our CUBE on Cloud program. She's the director of emerging technologies at ThoughtWorks North America and a thought leader, practitioner, software engineer, architect, and a passionate advocate for decentralized technology solutions and data architectures. And Zhamak, since we last had you on as a guest, which was less than a year ago, I think you've written two books in your spare time, one on Data Mesh and another called Software Architecture: The Hard Parts, both published by O'Reilly. So how are you? You've been busy?

I've been busy, yes. It's been a great year, a busy year. I'm looking forward to the end of the year and the end of these two books. But it's great to be back and speaking with you.

Well, you've got to be pleased with the momentum that Data Mesh has. Let's just jump back to the agenda for a bit and get that out of the way. We're going to set the stage by sharing some data from ETR, our data partner, on the spending profile of some of the key data sectors. Then we're going to review the four key principles of Data Mesh; it's always worthwhile to set that framework. We'll talk a little bit about some of the dependencies and the data flows, and we're really going to dig today into principle number three, around the self-serve data platform. To that end, we're going to talk about some of the learnings that Zhamak has captured since she embarked on the Data Mesh journey with her colleagues and her clients, and we specifically want to talk about some of the successful models for building the Data Mesh experience. Then we're going to hit on some practical advice, and we'll wrap with some thought exercises, maybe a little tongue in cheek, from the community questions that we get.

So the first thing I want to do, just to get this out of the way, is introduce the spending climate. We use this XY chart to do this; we do it all the time. It shows the spending profiles in the ETR dataset for some of the more data-related sectors of the ETR taxonomy. ETR dropped their October data last Friday, so I'm using the July survey here, about 1,500 respondents; we'll get into the October survey in future weeks. I don't expect a dramatic change in the October survey. The Y axis is Net Score, or spending momentum. The horizontal axis is Market Share, or presence in the dataset, and there's a red line at 40%: anything over that we consider elevated.
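As a quick aside for readers unfamiliar with the methodology: Net Score, as we understand ETR's calculation, is essentially the percentage of customers spending more on a platform minus the percentage spending less. A minimal sketch in Python, with hypothetical response-category labels, might look like this; the exact ETR weighting may differ:

```python
# Hedged sketch of an ETR-style Net Score calculation, assuming five
# survey response categories; the labels are illustrative assumptions.
from collections import Counter

def net_score(responses):
    """responses: iterable of strings such as 'adopting', 'increasing',
    'flat', 'decreasing', 'replacing' (hypothetical labels)."""
    counts = Counter(responses)
    total = sum(counts.values())
    positive = counts["adopting"] + counts["increasing"]   # spending more
    negative = counts["decreasing"] + counts["replacing"]  # spending less
    return 100 * (positive - negative) / total

sample = ["adopting"] * 20 + ["increasing"] * 35 + ["flat"] * 30 \
       + ["decreasing"] * 10 + ["replacing"] * 5
print(f"Net Score: {net_score(sample):.0f}%")  # -> Net Score: 40%, right at the red line
```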
For the past eight quarters or so, we've seen machine learning/AI, RPA, containers, and cloud as the four areas where CIOs and technology buyers have shown the highest Net Scores. And as we've said, what's so impressive about cloud is that it's both pervasive and shows high velocity from a spending standpoint. We've also plotted three other data-related areas, database/EDW, analytics/BI/big data, and storage. The first two, while under the red line, are still elevated; the storage market continues to plod along; and we've plotted outsourced IT just to balance things out for context. That's an area that's not so hot right now. So I just want to point out that these areas, AI, automation, containers, and cloud, are all relevant to data, and they're fundamental building blocks of data architectures, as are the two that are directly related, database and analytics, and of course storage. That gives you a picture of the spending sector.

Now, I wanted to share this slide, Zhamak, that you presented in your webinar. I love this. It's a taxonomy put together by Matt Turck, a VC, and he calls it the MAD landscape: machine learning, AI, and data. And Zhamak, the key point here is there's no lack of tooling. You've made the Data Mesh concept tools-agnostic. It's not like we need more tools to succeed in Data Mesh, right?

Absolutely agree. I think we have plenty of tools. What's missing is a meta-architecture that defines the landscape in a way that's in step with organizational growth, and then defines that meta-architecture in a way that these tools can actually interoperate and integrate really well. Clients right now have a lot of challenges in picking the right tool. Regardless of the technology path they go down, either they have to go all in on one big data solution and then try to fit the other integrated solutions around it, or, as you see in that landscape, they go to that menu of a large list of applications and spend a lot of time trying to integrate and stitch the tooling together. So I'm hoping that Data Mesh creates that kind of meta-architecture for tools to interoperate and plug in, and I think our conversation today around the self-serve data platform hopefully illuminates that.

Yeah, we'll definitely circle back, because that's one of the questions we get all the time from the community. Okay, let's review the four main principles of Data Mesh, for those who might not be familiar with it; and for those who are, it's worth reviewing. Zhamak, allow me to introduce them, and then we can discuss a bit.

So a big frustration I hear constantly from practitioners is that data teams don't have domain context. The data team is separated from the lines of business, and as a result they have to constantly context-switch, and as such there's a lack of alignment. So principle number one is focused on putting end-to-end data ownership in the hands of the domains, or what I would call the business lines. The second principle is data as a product, which does cause people's brains to hurt sometimes, but it's a key component, and if you start thinking about it and talking to people who have done it, it actually makes a lot of sense.
And this leads to principle number three, a self-serve data infrastructure, which we're going to drill into quite a bit today. And then the question we always get when we introduce Data Mesh is how to enforce governance in a federated model. So let me bring up a more detailed slide, Zhamak, with the dependencies, and ask you to comment, please.

Sure. As you said, really the root cause we're trying to address is the siloing of the data, external to where the action happens: where the data gets produced, where the data needs to be shared, where the data gets used, in the context of the business. So the root cause of the centralization gets addressed by distributing the accountability, end to end, back to the domains. And this distribution of accountability, of technical accountability, to the domains has already happened. In the last decade or so, we saw the transition from one general IT organization addressing all of the needs of the organization, to technology groups within IT, or even outside of IT, aligning themselves to build the applications and services that the different business units need. So what Data Mesh does is extend that model and say, okay, we're aligning business with tech and data now, right? Both the application of the data, ML or insight generation in the domains, related to the domains' needs, and the sharing of the data that the domains are generating with the rest of the organization.

But the moment you do that, you have to solve other problems that may arise, and that gives birth to the second principle, which is data as a product, as a way of preventing data siloing from happening within the domain. It changes the focus of the domains that are now producing data from "I'm just going to create the data I collect for myself, and that satisfies my needs" to, in fact, the responsibility of the domain being to share the data as a product, with all of the wonderful characteristics that a product has. And I think that leads to really interesting architectural and technical implications of what actually constitutes data as a product, and we can have a separate conversation about that.

But once you do that, then that's the point in the conversation where the CIO says, well, how do I even manage the cost of operation if I decentralize the building and sharing of data to my technical teams, to my application teams? Do I need to go and hire another hundred data engineers? And I think that's the role of a self-serve data platform: it enables and empowers the generalist technologists that we already have in the technical domains, the majority population of our developers these days, right? The data platform attempts to mobilize those generalist technologists to become data producers and data consumers, and to really rethink what tools these people need. So the data platform is really about giving autonomy to domain teams, empowering them, and reducing the cost of ownership of the data products.

And finally, as you mentioned, there's the question of how I still assure that these different data products are interoperable, secure, and respectful of privacy, now in a decentralized fashion, while we are respecting the sovereignty, the domain ownership, of each domain. From the operational model, that leads to applying some sort of federation, where the domain owners are accountable for the interoperability of their data products and have incentives that are aligned with the global harmony of the Data Mesh. And from the technology perspective, it means thinking about data as a product with a new lens: all of those policies that need to be respected by these data products, such as privacy, such as confidentiality, can we encode them as computational, executable units and embed them in every data product, so that we get governance through automation? So that's the relationship, the complex relationship, between the four principles.
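To make that last idea concrete, here's a minimal sketch of what a "policy as a computational, executable unit embedded in a data product" could look like. It's purely illustrative; the class and policy names are hypothetical, not part of any Data Mesh specification:

```python
# Illustrative sketch only: policies travel with the data product as
# executable units, so governance happens through automation rather
# than a central team. All names here are hypothetical.
from dataclasses import dataclass, field
from typing import Callable

Policy = Callable[[dict], dict]  # takes a record, returns a compliant record

def mask_pii(record: dict) -> dict:
    """Confidentiality policy: redact fields flagged as PII."""
    return {k: ("***" if k in {"email", "ssn"} else v) for k, v in record.items()}

def drop_minors(record: dict) -> dict:
    """Privacy policy: suppress records for users under 16."""
    return record if record.get("age", 0) >= 16 else {}

@dataclass
class DataProduct:
    domain: str
    name: str
    policies: list[Policy] = field(default_factory=list)

    def serve(self, records: list[dict]) -> list[dict]:
        # Every read path runs the embedded policies automatically.
        out = []
        for r in records:
            for policy in self.policies:
                r = policy(r)
            if r:
                out.append(r)
        return out

profiles = DataProduct("customer", "profiles", policies=[mask_pii, drop_minors])
print(profiles.serve([{"email": "a@b.c", "age": 34, "segment": "gold"}]))
# -> [{'email': '***', 'age': 34, 'segment': 'gold'}]
```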
Yeah, thank you for that. There are so many important points in there, but a couple stand out. First, the idea of the silos, and of data as a product breaking down those silos: if you have a product and you want to sell more of it, you make it discoverable, and as a P&L manager you put it out there; you want to share it, not hide it. Then there's the idea of managing the cost, number three, where people say centralized is more efficient, but that essentially was the failure. And your other, related point is generalist versus specialist. That was one of the failures of Hadoop: these hyper-specialized roles emerged, and so you couldn't scale.

So let's talk about the goals of Data Mesh for a moment. You've said that the objective is to exchange what you call a new unit of value between data producers and data consumers, and that unit of value is a data product. You've also stated that a goal is to lower the cognitive load on our brains, I love this, and to simplify the way in which data are presented to both producers and consumers, doing so in a self-serve manner that eliminates the tapping on shoulders, the emails, the raised tickets to understand how data should be used, et cetera. So please explain why this is so important, and how you've seen organizations reduce the friction across the data flows and the interconnectedness of things like data products across the company.

Yeah, this is important. As you mentioned, initially, when this whole idea of data-driven innovation came to exist and we needed all sorts of technology stacks, we centralized the creation of the data and the usage of the data. And that's okay when you first get started, when the expertise and knowledge is not yet diffused and is the privilege of only a very few people in the organization. But as we move to a data-driven innovation cycle in the organization, as we learn how data can unlock new programs, new models of experience, new products, then it's really, really important, as you mentioned, to let the consumers and producers talk to each other directly, without a broker in the middle. Because even though having that centralized broker can be a cost-effective model, once we include the cost of missed opportunity, something we could have innovated but didn't because of months spent looking for the right data, the cost-benefit formula changes. So to have that data-driven innovation embedded into every domain, every team, we need to enable a model where the consumer can directly, peer to peer, discover the data, understand it, and use it.
So the litmus test is going from a hypothesis, as a data scientist, I think there is a pattern, an insight, in the customer behavior, and if I have access to all of the different information about the customer, all of the different touchpoints, I might be able to discover that pattern and personalize the experience of my customer, to finding all of the different sources, being able to understand them, being able to connect them, and then turning them into training for my machine learning model and releasing the result as an intelligent product.

Got it, thank you. So a lot of what we do here in Breaking Analysis is try to curate and then point people to new resources, and we'll have some additional resources here, because what you and your colleagues in the community are creating is not superficial. I do want to curate some of the other material that you had. If I bring up this next chart, both sides are curated from your observations. The left-hand side describes most of the monolithic data platforms: they're optimized for control, and they serve a centralized team with hyper-specialized roles, as we talked about. The operational stacks are running enterprise software, they're on Kubernetes, and the microservices are isolated from, let's say, the Spark clusters that are managing the analytical data, et cetera. Whereas Data Mesh proposes much greater autonomy, and the management of code, data pipelines, and policy as independent entities rather than a single unit. And you've made the point that we have to enable generalists and borrow from so many other examples in the industry. So it's an architecture based on decentralized thinking that can really be applied to any domain, really domain-agnostic, in a way.

Yes. And if I pick one key point from that comparison, it's that the platform capabilities need to present a continuous experience: from an application developer building an application that generates some data, let's say an e-commerce application; to the data product that presents and shares that data as temporal, immutable facts that can be used for analytics; to the data scientist who uses that data to personalize the experience; to the deployment of that ML model back into the e-commerce application. If you really look at this continuous journey, the walls between the separate platforms we have built need to come down. The platforms underneath, the ones supporting the operational systems versus the data platforms versus the ML platforms, need to play really nicely together, because as a user I'd otherwise fall off a cliff at every stage of this value stream. So the interoperability of our data solutions and operational solutions needs to increase drastically, because so far we've gotten away with running operational systems and applications on one end of the organization, running data and analytics on the other, and building a spaghetti pipeline to connect them. Neither end is happy. I hear data scientists and data analysts pointing a finger at the application developer, saying, you're not developing your database the right way, and the application developer pointing back, saying, my database is for running my application; it wasn't designed for sharing analytical data.
So what Data Mesh tries to do is bring these two worlds closer together, and the platform itself has to come closer and turn into a continuous set of services and capabilities, as opposed to disjointed, big, isolated stacks.

Very powerful observations there. So we want to dig a little bit deeper into the platform, Zhamak, and have you explain your thinking here, because everybody always goes to the platform: what do I do with the infrastructure? You've stressed the importance of interfaces, the entries to and exits from the platform, and you use a particular parlance to describe it. In this chart, you show what you call the planes, not layers, of the platform. It's complicated, with a lot of connection points. So please explain these planes and how they fit together.

Sure. That was a really good point you started with, Dave: when we think about the capabilities that enable the building of our applications, our data products, our analytical solutions, we usually jump too quickly to the deep end, the actual implementation of these technologies. Do I need to go buy a data catalog? Do I need some sort of warehouse storage? What I'm trying to do is elevate us up and out, to force us to think about the interfaces and APIs, the experiences, that the platform needs to provide to run this secure, safe, trustworthy, performant mesh of data products. If you focus on the interfaces, the implementation underneath can swap out, right? You can swap one for the other over time. That's the purpose of those lollipops: emphasizing the interface that provides each capability, like storage, like data product lifecycle management, and so on.

The purpose of the planes, the mesh experience plane and the data product experience plane, is really to give us a language to classify different sets of interfaces and dedicated capabilities that play nicely together to provide that cohesive journey for a data product developer or a data consumer. So the three planes are, at the bottom layer, a lot of utilities, that Matt Turck MAD data-tooling chart again. We have plenty of utilities right now: workflow management, data processing, your Spark and your Flink, your storage, your lake storage, your time-series storage. There's a lot of tooling at that level.

But the layer that we need to imagine and build today, that we can't buy yet as far as I know, is the layer that allows us to exchange that unit of value, to build and manage these data products. The language and the APIs of this data product experience plane are not "I need this storage" or "I need that workflow processing." They are: I have a data product; it needs to deliver certain types of data, so I need to be able to model my data; as part of this data product, I need to write some processing code that keeps the data constantly alive, because it's receiving, say, upstream user interactions with the websites and generating the profiles of my users; I need to serve the data; I need to keep the data alive; and I need to provide a set of SLOs and guarantees for my data, and good documentation, so that someone who comes to the data product knows the cadence of refresh, the retention of the data, and a lot of other SLOs. And finally, I need to be able to enforce and guarantee certain policies, in terms of access control, privacy, encryption, and so on. So as a data product developer, I just work with this unit, a complete, autonomous, self-contained unit, and the platform should give me ways of provisioning this unit, testing this unit, and so on. That's why I emphasize the experience.
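As a thought experiment, here's what a declarative, experience-oriented interface on that data product experience plane might feel like, as opposed to provisioning raw storage and pipelines by hand. This is a self-contained toy, not a real SDK; every name in it is a hypothetical:

```python
# A minimal, runnable sketch (not a real SDK) of a "data product
# experience plane": declare the product as one self-contained unit,
# schema, processing, SLOs, policies, and let a platform provision the
# mechanics. All names are hypothetical.
class DataProductSpec:
    def __init__(self, domain, name):
        self.domain, self.name = domain, name
        self.schema, self.slos, self.policies = {}, {}, {}
        self.transform = None

    def model(self, schema):          # output port: what the product serves
        self.schema = schema

    def processing(self, fn):         # code that keeps the data alive
        self.transform = fn
        return fn

    def slo(self, **guarantees):      # e.g. freshness, retention
        self.slos.update(guarantees)

    def policy(self, **rules):        # e.g. access control, PII handling
        self.policies.update(rules)

    def deploy(self):
        # A real platform would provision storage, compute, a catalog
        # entry, and monitoring here; this sketch just echoes the spec.
        print(f"deploying {self.domain}/{self.name}: "
              f"slos={self.slos} policies={self.policies}")

profiles = DataProductSpec("customer", "user-profiles")
profiles.model({"user_id": "string", "segment": "string"})

@profiles.processing
def build_profile(event):             # consumes upstream user interactions
    return {"user_id": event["user"], "segment": "gold"}

profiles.slo(freshness="1h", retention="90d")
profiles.policy(access="customer-domain-readers")
profiles.deploy()
```

The design point, per Zhamak's description, is that the developer never mentions a warehouse or a workflow engine: the unit of declaration is the data product itself.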
And of course, we're not dealing with one or two data products; we're dealing with a mesh of data products. So at the mesh level, we need a set of capabilities and interfaces to search the mesh for the right data, to explore the knowledge graph that emerges from the interconnection of data products, and to observe the mesh for any anomalies. Did we create one of these giant master data products that all the data goes into and all the data comes out of? Have we created a bottleneck? To deliver those mesh-level capabilities, we need mesh-level APIs and interfaces. And once we decide what constitutes them, what satisfies this mesh experience, then we can step back and say, okay, now what sort of tool do I need to buy to satisfy them? And that is not what the data community, the data part of our organizations, is used to. Traditionally we've been very comfortable with buying a tool and then changing the way we work to serve the tool. This is slightly inverse to the model we might be comfortable with.
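To illustrate the kind of mesh-level observation Zhamak describes, here's a small sketch that scans a mesh's dependency graph for the "giant master data product" anti-pattern: a node with outsized fan-in and fan-out. The graph shape and the threshold are assumptions for illustration only:

```python
# Sketch: observing the mesh for a bottleneck data product, i.e. a node
# that too much data flows into and out of. The mesh is modeled as a
# simple edge list; node names and threshold are illustrative.
from collections import defaultdict

# edges: (upstream data product) -> (downstream data product)
mesh_edges = [
    ("orders", "warehouse-hub"), ("payments", "warehouse-hub"),
    ("web-clicks", "warehouse-hub"), ("warehouse-hub", "churn-model"),
    ("warehouse-hub", "exec-dashboard"), ("warehouse-hub", "personalization"),
    ("orders", "finance-report"),
]

fan_in, fan_out = defaultdict(int), defaultdict(int)
for upstream, downstream in mesh_edges:
    fan_out[upstream] += 1
    fan_in[downstream] += 1

def bottlenecks(max_degree=3):
    """Flag nodes whose combined fan-in and fan-out exceeds a threshold."""
    nodes = set(fan_in) | set(fan_out)
    return [n for n in nodes if fan_in[n] + fan_out[n] > max_degree]

print(bottlenecks())  # -> ['warehouse-hub'], a de facto master data product
```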
Right, and pragmatists, people who've implemented Data Mesh, will tell you they spent a lot of time figuring out data as a product and the definitions there, the organizational pieces, getting domain experts to actually own the data. And they will tell you, look, technology will come and go. So to your point, if you have those lollipops, those interfaces, you'll be able to evolve, because we know one thing for sure in this business: technology is going to change.

So you had some practical advice, and I wanted to discuss that for those who are thinking about Data Mesh. I scraped this slide from a presentation you made, and by the way, we'll put links in there. Your colleague Emily, who I believe is a data scientist, had some really great points there as well that practitioners should dig into. But you made a couple of points that I'd like you to summarize, and to me the big takeaway was: it's not a one-and-done. This is not a 60-day project; it's a journey. I know that's kind of cliche, but it's so very true here.

Yes. These were a few starting points for people who are embarking on building or buying the platform that enables the mesh creation, so it was a bit of a focus on the platform angle. The first one is what we just discussed: instead of thinking about the mechanisms you're building, think about the experiences you're enabling. Identify who the people are. What is the persona of the data scientist? Data scientists have a wide range of personas, and the same is true for data product developers. What is the persona I need to enable and empower today? What skills do they have?

And so, thinking about experiences versus mechanisms: I think we're at a really magical point. How many times in our lifetime do we come across a complete blank, a white space to a degree, to innovate in? So let's take that opportunity and use a bit of creativity while being pragmatic. Of course we need solutions today, or yesterday, but still think about the experiences, not the mechanisms you need to buy. So that was the first step.

And the nice thing is that there's an evolutionary, iterative path to the maturity of your Data Mesh. You start by thinking about, okay, which are the initial use cases I need to enable? What are the data products those use cases depend on that we need to unlock? What is the persona, the general skill set, of my data product developers? What are the interfaces I need to enable? You can start with the simplest possible platform for your first two use cases, and then think about the next set of data developers, who will have a different set of needs. Maybe today I just enable SQL-like querying of the data. Tomorrow I enable the data scientists' file-based access to the data. The day after, I enable the streaming aspect. So you have this evolutionary path ahead of you, and you don't have to start by building out everything. One of the things we've done is take a harvesting approach: you work collaboratively with the technical, cross-functional domains that are building the data products, see how they're using the utilities, and harvest what they're building for themselves back into the platform.

But at the end of the day, we have to think about mobilizing the largest population of technologists we have; we have to think about diffusing the technology and making it available and accessible to generalist technologists. And we've come a long way. We've gone through paradigm shifts in mobile development, in functional programming, in cloud operations. It's not that we struggle with learning something new, but we have to learn something that works nicely with the rest of the tooling in our toolbox right now. So again, put that generalist as one of your central personas, not the only persona; of course we will have specialists, of course we will always have data scientist specialists. But for any problem that can be solved as a general engineering problem, and I think a lot of aspects of Data Mesh can be treated as simple engineering problems, let's approach it that way, and then create the tooling to empower those generalists.

Great, thank you. So listen, I've been around a long time, and as an analyst I've seen many waves, and we often say language matters. I've seen it with the mainframe: its language was different from the PC language, different from the internet, different from cloud, different from big data, et cetera. So we have to evolve our language, and I was going to throw a couple of things out here. I often say data is not the new oil, because data doesn't live by the laws of scarcity; we're not running out of data. I get the analogy, it's powerful, oil powered the industrial economy, but data is bigger than that. What do you think when you hear "data is the new oil"?
Yeah, I don't respond well to "data as the gold or oil" or whatever scarce resource, because, as you said, it evokes a very different emotion. It doesn't evoke the emotion of "I want to use this, I want to utilize it." It feels like something I need to hide, collect, and keep to myself, not share with anyone. It doesn't evoke the emotion of sharing. I really do think that data, with a little asterisk, because I think the definition of data changes, which is why I keep using the language of "data product" or "data quantum", becomes the most important, essential element of the existence of computation. What do I mean by that? A lot of the applications we've written so far are based on imperative logic: if this happens, do that, else do the other. We're moving to a world where those applications generate data, and the data that's generated becomes the source of the patterns we can exploit to build our applications. Curate the weekly playlist for Dave every Monday, based on what he has listened to and what other people with his profile have listened to. So we're moving to a world where it's not so much about applications using data to run their business; data truly is the foundational building block for the applications of the future. And then I think we need to rethink the definition of data, and maybe that's for a different conversation, but I really do think we have to converge the data and the processing together, the substance and the processing together, to have a unit that is composable, reusable, trustworthy. That's the idea behind the data product as an atomic unit of what we build for future solutions.

Got it. Now, something else I heard you say, or read, really struck me, because it's another often-stated phrase: data is our most valuable asset. And you push back a little bit on that, when you hear people call data an asset. People have often said they think data should be, or eventually will be, listed as an asset on the balance sheet. In hearing what you said, I thought about that and said, well, maybe data as a product is an income statement thing: it's generating revenue or it's cutting costs. It's not necessarily an asset, because I don't share my assets with people; I don't make them discoverable. Add some color to this discussion.

It's actually interesting you mention that, because I read about a new policy in China where companies will actually have a line item around the data that they capture. We don't have to go into the political conversation around the authoritarian collection of data, the power that creates, and the society that leads to; that big conversation aside, I think you're right. Data as an asset generates a different behavior. It creates different performance metrics that we would measure. Before the conversation around Data Mesh came to exist, we measured the success of our data teams by the terabytes of data they collected, by the thousands of tables they had stamped as golden data. There's no direct line I can see between that and the value the data actually generated. So I think the asset framing is rather harmful, because it leads to the wrong measures, the wrong metrics of success.
If you invert that to product thinking, something you share to delight the experience of users, your measures are very different: the happiness of the user, the decreased lead time for them to actually use the data and get value out of it, the growth of the user population. So it evokes a very different kind of behavior and set of success metrics. I do say, if I may, that I'll probably come back and regret the choice of the word "product" one day because of its monetization aspect, but maybe there's a better word to use; it's the best I think we can use at this point in time.

Why do you say that, Zhamak? Because it's too directly related to monetization, and that has a negative connotation, or it might not apply in things like healthcare?

I think because, if we want to take a shortcut, and I remember this conversation from years back, people think that the reason to collect or have data is so that we can sell it; it's just the monetization of the data, and then we get this idea of data marketplaces and so on. And I think that is actually the least valuable outcome we can get from thinking about data as a product: the direct sale and exchange of data as a monetary exchange of value. It might redirect our attention from what really matters, which is enabling the use of data to ultimately generate value for people, for the customers, for the organizations, for the partners, as opposed to thinking about it as a unit of exchange for money.

I love data as a product; I think your instinct was right on, and I'm glad you brought that up, because I think people misunderstood it in the last decade as selling data directly. Really what you're talking about is using data as an ingredient to build a product that has value, and that value either generates revenue, cuts cost, or helps with a mission, which could be saving lives, but for a commercial company it's about the bottom line. That's just the way it is. So I love data as a product; I think it's going to stick.

One of the other things that struck me in one of your webinars was a Q&A where one of the questions was, can I finally get rid of my data warehouse? So I want to talk about the data warehouse, and the data lake, JPMC uses that term, the data lake, which some people don't like; I know John Furrier, my business partner, doesn't like it, and the data hub. One of the things I've learned from observing your work is that whether it's a data lake, a data warehouse, a data hub, data whatever, it should be a discoverable node on the mesh. The technology really doesn't matter. What are your thoughts on that?

Yeah, I think the real shift is from the centralized data warehouse to the data warehouse where it fits. If you just cross out that centralized piece, we're all in agreement that data warehousing provides interesting capabilities that are still required, perhaps as an edge node of the mesh that is optimized for certain queries, say financial reporting, and we still want to direct a fair bit of data into a node that exists just for those financial reports and requires the precision and the speed of operation that warehouse technology provides.
So I think that technology definitely has a place. Where it falls apart is when you want one warehouse to rule all of your data and canonically model it, because you have to put so much energy into trying to harness that model, and you create these very complex and fragile snowflake schemas, and that's all you do: you spend energy against the entropy of your organization, trying to get your arms around the model, while the model is constantly out of step with what's happening in reality, because the reality of the business moves faster than our ability to model everything into one canonical representation. That's what I think we need to challenge, not necessarily the application of a data warehouse on a node.

I want to close by coming back to the issue of standards. You've specifically envisioned Data Mesh to be technology-agnostic, as I said before, and of course everyone, myself included, is going to run a vendor's technology platform through a Data Mesh filter. The reality, per the Matt Turck chart we showed earlier, is that there are lots of technologies that can be nodes within the Data Mesh, or facilitate data sharing or governance, et cetera, but there's clearly a lack of standardization. I'm sometimes skeptical that the vendor community will drive this, but maybe, as with Kubernetes, Google or some other internet giant will contribute something to open source that addresses the problem. Talk a little bit more about your thoughts on standardization: what kinds of standards are needed, and where do you think they'll come from?

Sure. You're right that vendors today are not incentivized to create those open standards, because the operating model of many of them, not all, is "bring your data to my platform, and then bring your computation to me, and all will be great." And that will be great for a portion of clients and environments where the complexity we're talking about doesn't exist. So yes, we need other players, perhaps some of the cloud providers, or people who are more incentivized to open up their platforms for data sharing.

As a starting point, I think we need standardization around data sharing. If you look at the spectrum right now, we have a de facto standard, it's not even a standard, for something like SQL. Everybody has bastardized SQL and extended it with so many things that I don't even know what standard SQL is anymore, but we have that for some form of querying. Beyond that, I know, for example, folks at Databricks have started to create some standards around Delta Sharing, sharing the data in different models. So I think we need data sharing as a concept, the same way that APIs were about capability sharing. We need data APIs, analytical data APIs, and data sharing that extends beyond SQL and languages like it.

I also think we need standards around computational policies. This is, again, something that is formulating in the operational world. We have a few standards around how you articulate access control, and how you identify the agents who are trying to access data, with different authentication mechanisms. We need to bring some of those into a data-specific articulation of policies. Something as simple as identity management across different technologies is nonexistent: if you want to secure your data across three different technologies, there is no common way of saying, who is the agent that is acting to access the data, and can I authenticate and authorize them?
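To ground that point: here's a toy sketch of a computational access policy, one declarative rule evaluated the same way regardless of which storage technology holds the data. The agent and rule structures are invented for illustration; real efforts in the operational world (token-based identity, policy engines) are far richer:

```python
# Toy sketch of a computational policy: one declarative access rule,
# evaluated identically no matter which technology stores the data.
# The Agent and AccessRule structures are illustrative inventions.
from dataclasses import dataclass

@dataclass
class Agent:
    identity: str          # who is acting, e.g. "svc-churn-model"
    domain: str            # the domain the agent belongs to
    authenticated: bool    # did an upstream identity provider vouch for it?

@dataclass
class AccessRule:
    data_product: str
    allowed_domains: set[str]
    purpose: str           # e.g. "analytics", recorded for auditability

def authorize(agent: Agent, rule: AccessRule) -> bool:
    """Authenticate, then authorize: the same check should apply whether
    the data product lives in a warehouse, a lake, or a stream."""
    if not agent.authenticated:
        return False
    return agent.domain in rule.allowed_domains

rule = AccessRule("customer/user-profiles", {"customer", "marketing"}, "analytics")
print(authorize(Agent("svc-churn-model", "marketing", True), rule))   # True
print(authorize(Agent("svc-scraper", "unknown", True), rule))         # False
```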
So those are some of the very basic building blocks, and then the gravy on top would be new standards around enriched semantic modeling of the data, so that we have a common language to describe the semantics of the data in different nodes, and the relationships between them. We have prior work with RDF and the folks who focused on linking data across the web, the semantic web work of the past; we need to revisit that and see its practicality in the enterprise context. So a rich language for semantic data modeling, and data connectivity most importantly: those are some of the items on my wish list.

That's good. Well, we'll do our part to try to keep pushing that standards movement. Zhamak, we're going to leave it there. So grateful to have you come on theCUBE; we really appreciate your time. It's always a pleasure. You're such a clear thinker, so thanks again.

Thank you, Dave. It's wonderful to be here.

Now, we're going to post a number of links to some of the great work that Zhamak and her team are doing, and to her books, so you can check that out. Remember, we publish each week on siliconangle.com and wikibon.com, and these episodes are all available as podcasts wherever you listen; just search "Breaking Analysis podcast." Don't forget to check out etr.plus for all the survey data. Do keep in touch: I'm @dvellante, follow Zhamak at @zhamakd, you can email me at david.vellante@siliconangle.com, or comment on the LinkedIn post. This is Dave Vellante for theCUBE Insights, powered by ETR. Be well, and we'll see you next time.