Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Officer for Dataversity. We want to thank you for joining the latest in the monthly webinar series, Data Architecture Strategies with Donna Burbank. Today Donna will discuss data mesh, or Data Mesh: Separating Reality from Hype, sponsored today by Couchbase and Monte Carlo. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A panel. And if you'd like to chat with us and with each other, we certainly encourage you to do so. Just to note, the chat defaults to send to just the panelists, but you may absolutely change that to network with everyone. To open the chat and the Q&A panels, you'll find those icons at the bottom of your screen. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. And now let me turn it over to Jeff for a brief word from our sponsor, Couchbase. Jeff, hello and welcome.
Hi Shannon, thank you so much, and I will take over the share for a second. I want to talk to everybody about one of the core problems that we're going to end up finding in these data mesh architectures and offer some solutions, perhaps, as to how they might be resolved when we consider what's been happening in the market. We see the need for delivering killer experiences to your end-user clientele, your field service workers, or whoever is actually logging into your system. So I'm taking a very application-oriented approach to this particular problem space. We want to be able to develop really efficiently, we want to deploy efficiently and effectively across clouds, and we want to support the much more modern kinds of capabilities that you might find in modern databases.
But the thing that we end up seeing very, very frequently is that there are four real reasons why people come and talk to Couchbase at all. One is that the databases powering your applications are failing in performance. The second is that you need to dramatically improve flexibility; perhaps you want to better personalize your application, and your relational database won't allow you to do that. The third is mobility, the ability to push applications all the way to the edge or beyond, and I'll show you a couple of examples of that in a second. The fourth is to drive down your overall cloud costs, and I'll show you why the ongoing use of multiple databases within an application or within a data architecture is not only creating this, let's say, data mess sprawl in a data mesh, but is also becoming more and more obsolete, that notion of using purpose-built databases for particular tasks. I want to let you know that Couchbase does live in your life every single day. For your credit card transactions, there's a high likelihood that those are being managed by the FICO Falcon system, and therefore Couchbase. If you've cruised on Princess Cruise Lines over the last few years, that cool little medallion they give you that is your room key and your payment device, and that also knows your personalization, what your breakfast preferences are, what your kids like to do, or at Carnival Cruise, that's Couchbase too. Your feed in your LinkedIn profile or in your LinkedIn environment, that's powered by Couchbase, as is your shopping experience when you're looking for office supplies at companies like Staples. So you really are interacting with Couchbase on practically a daily basis. So that's what we're noticing in the market right now, and remember, it's 2023, not 2011.
This data sprawl has happened with messy, messy meshes, as I'm characterizing them, which has created a lot of database management challenges as well as cost challenges. So you're using a cache or NoSQL system to manage your account profiles, a relational database to manage transactions, and a different analytics system to manage your analytics. All of that has created the problem that we're seeing right now. Ideally you would want to deploy a multi-model database that has these core capabilities built inside of it. And remember, Couchbase was built from the merger of memcached and CouchDB; those teams merged together and wrote a brand new database called Couchbase. It's really the original multi-model database, with an integrated cache and key-value access to your data as well as the flexibility of being able to manipulate or change a JSON data store. You get relational capabilities like ACID transactions and joins across document collections. You get text search, you get analytics built in, you get eventing capabilities built in, and you get geographic replication and synchronization also built in for mobility's sake. And one of the things I want to talk to Donna about is the notion of using JSON not only as the vehicle that contains your specific data, but also as a payload for metadata for managing a data mesh or data catalog. Those two things can be combined, and we see areas where this might be really, really interesting looking ahead. Of course, the other advantage that Couchbase offers is that we're super easy. If you know SQL, then you already know how to use Couchbase. Our query language is called SQL++; we used to call it N1QL. It includes joins, subqueries, nested objects, all of the things you would expect, and it reads exactly as SQL does.
So the last thing I want to talk about real quick is the upswing in the need for offline-first or mobile applications for field service, for utility work, for rescue, for your QR-code-based menus, or for mobile pop-ups. This whole notion of supporting data at the edge is a mesh problem that we're going to see perpetuate in the future. So, closing up: with integrated capabilities, as the customer you get better release cycles and less duplication, you can scale more easily, you get easier data catalog management, et cetera, all within the Couchbase environment. So Shannon, I think that concludes my prepared remarks, and I'm looking forward to the rest of the event.
Thank you so much, very interesting, and thanks for kicking us off. If you have questions for Jeff about Couchbase, you may submit them in the Q&A, as he'll be joining us in the Q&A portion of the webinar at the end. And now let me turn it over to Shane for a brief word from our second sponsor, Monte Carlo. Shane, hello and welcome.
Hi everyone. Thanks for having me here today. I want to give a very brief overview of Monte Carlo. Monte Carlo is the creator of the data observability category. We've recently released this book on data quality fundamentals. I'm the field CTO at Monte Carlo, having been here since last year, and prior to that I ran data at the New York Times for nine years, from 2013 to 2021. And we're a Series D startup with hundreds of customers now in just the couple of years that we've been operating. The problem that Monte Carlo is solving is one that we refer to as data downtime. Data downtime is periods of time when your data is missing, erroneous, late, or otherwise in error and not able to be used. And the real problem with data downtime that you see at companies of all sizes is that the data producers who own the source systems can't see downstream.
They don't know, when they're making changes, who those changes can impact or what sort of reports have been built off of the data from their environments. The analysts or the data scientists at the end of this chain, the data consumers, often can't see upstream, so when they find something unexpected, they don't know if it's actually an insight or if it's an error in a pipeline. And then data engineers tend to be caught in the middle. They own the data platform, they own the warehouse and the lakes and much of the data in them, and they can't predict all the ways that data will break. Typically in the past this was solved via manual testing, and we've seen that manual testing barely covers 10% of the space. What you generally find is that your data incidents are found by downstream consumers, and it results in a loss of trust in data. I can talk about the consequences in terms of this chart. The consequences can be as trivial as lost engineering time, though we do see that about 30 to 50% of engineering time is lost to data incidents. But they scale up to revenue losses for the business, loss of reputation, and loss of trust. And this is particularly big in data mesh cultures. A lot of Monte Carlo's customers are at various stages of a data mesh rollout, where you're trying to build trust in data across a decentralized rollout. And so you need a self-service data platform that includes observability and allows you to build trust in data and assign accountability across the organization. So, the way Monte Carlo solves this: we've typically found across these hundreds of customers that data downtime incidents look similar. They include things like: is the data up to date? Are the values suspiciously high? Are there too many nulls or duplicate IDs? And so we wrap it up in these five pillars of observability.
These include freshness, volume, quality, schema, and lineage, to bring it all together and allow you to see upstream and downstream to efficiently resolve issues in data. And so this is essentially what a data observability platform does. It allows you to use those five pillars in the background, collecting metadata, logs, and metrics, to start to detect issues in the data using machine learning, based on what it sees with various data products, to be able to find anomalies in freshness or volume, dimensional changes in the data, or changes in the mean. Then there are the tools to actually resolve those problems in the data through automated lineage, impact assessment, and various tools for root cause analysis. And finally, to be able to prevent data incidents, we have things like circuit breakers and schema change notifications that help in that regard. So, as I mentioned, probably about a third of our customers are involved in some form of data mesh. We have very large customers like Roche Diagnostics, who are kind of the poster child for data mesh, and we're supporting that journey for them. Thank you.
Thank you so much, and thanks. If you have questions for Shane or about Monte Carlo, again, feel free to put them in the Q&A, as he'll be joining us as well for the Q&A at the end of the webinar. Thanks to both Couchbase and Monte Carlo for sponsoring today's webinar and helping make it happen. And now let me introduce the speaker for the monthly series, Donna Burbank. Donna is a recognized industry expert in information management with over 20 years of experience helping organizations enrich their business opportunities through data and information. She is currently the managing director of Global Data Strategy, Ltd., where she assists organizations around the globe in driving value from their data. And with that, let me give the floor to Donna to get her presentation started. Hello and welcome.
Thank you, and thanks to Jeff and Shane as well.
Some really interesting presentations, and it's always a pleasure to join these. If anyone is new to this session and hasn't joined us before, it is a monthly webinar. You can see some of the ones in the past if you've missed them. Dataversity keeps all of them on their website so you can catch the replays, and hopefully some of these other topics are interesting to you if you want to catch some additional ones throughout the year. But the topic for today is data mesh, and you've probably seen the abstract, because you came on to this webinar. There are a lot of terms that go around, and I will talk more about that. There's data mesh, there's data fabric, and I think a lot of these come with the promise that this is the next big thing. So I want to start with what we are talking about when we use these terms, because if we are business-centric data architects, words mean things, and the semantics are important. But more importantly, and some of the previous speakers touched on this, how does the data mesh approach fit in today's modern data architecture? And then also, how does it fit with some of the more fundamental approaches we've been doing for a long time, things like master data, data quality, and data governance, which has been top of mind in some of the criticisms around data mesh? So we'll talk about whether those are founded or not. As with all the other webinars, I try to make it real. What does this mean in the real world if you're trying to make a decision: what do these words mean, should I use them, and does it make sense for my organization? So hopefully this will, maybe not provide the final answer, but give you some ideas or some food for thought. So, back to the core definitions. This is data mesh, and that's what we're talking about today. There are similar words, in some cases with similar functionality, but a bit of a different meaning.
I think data mesh is, more broadly, not only the idea, as some of the previous speakers mentioned, of localized data management and domain-based authority over data, but also more of a cultural and organizational shift and a kind of foundational thinking. So data assets are organized in data domains, and subject matter experts use those patterns, define them, and create data products. Some of that is technological. There we go. Some of it is technology, and some of it is technology and people, so we'll go through both of those. Then there's data fabric, and we won't be covering that on this call. A lot of people do have different definitions, but that really is more of a data integration style, whether that is a knowledge graph or a virtualization mechanism, et cetera. Again, they're related, but we're going to talk about data mesh in its proper form today. And I'm giving credit to the Gartner data dictionary. I use it a lot; they always tend to have some good definitions, so I am a fan. But data mesh, data fabric, don't get me started. We have lakes now and ponds, and data mesh, data fabric, data crochet, data needlepoint, and who knows what's next, but for now these are the two terms we're using. So I do want to put this into the context of what we're talking about today. Many of you may have heard of Zhamak Dehghani, who's been kind of the founder of data mesh. And I'm not doing an ad hominem attack or anything on Zhamak, but we do this in the industry, with the Bill Inmon model and the Kimball model, so in a way it carries her name, and she's launched a lot of this. Also, as there have been so many different definitions, I'm going to use hers, because everybody seems to take a thought and then evolve it, so I tried to go back to the source as I was going through this.
She's a very well-spoken, very intelligent woman. Some of the issues I have are with language, and she's the first one to say so. This is from one of her webinars, where she quoted Fellini, saying language is a different vision of life. And then she went on to say that we need to create a new language, and that many of these concepts exist already but we need new names. And I say, please, no, we do not need new names for the same thing. I think one of the problems we have in the industry is: are we innovating, or are we creating new names for things? Some of us in data architecture spend so much time trying to understand the names of things and the meaning of things for our organization, or for a client if we're consultants, and we want to be really careful about our wording. The other one, and I'm maybe in vent mode today, is that we tend to do this in the industry: hey, there's a new product, and everything that came before it is terrible. Or we just dismiss something as old, or out of date, or not modern, not hip, not new. What do those even mean? So I was on a roll the other evening. I listened to some of these webinars and just started writing down the words that were used for what I think was a loosely defined old school, which is data lakes and data warehouses, though a lot of the conversation mostly focused on the data lake, and then compared them to the words for this new paradigm of data mesh.
So in the old world, data was a byproduct and not a product. That's clever, it's nice, and we should treat data as a product, but I'd be reluctant to say that anyone who's doing a data warehouse or data lake treats data as a byproduct. Maybe you could argue it's being managed inefficiently or that it's not modern, but certainly it's not a byproduct rather than a first-order product. Or the old world is just broken, it's disconnected, there's tension, it's full of friction, and, my very favorite one, everyone is unhappy. That just seems broad, right? And if you look to the right, data mesh is the delightful experience, it's the paradigm shift, with a lower cognitive load and a decentralized sociotechnical approach. Great words, but I'm almost not sure what some of these mean: a convergent or intelligently empowered ecosystem, apply order to value, with polyglot output data ports, focused on the architectural quantum. All great. I mean, again, she's a very, very well-spoken human being, but I'm trying to get my brain around what concrete things these offer. There are concrete things, and I will go through them, but I did find a lot of word salad, and we do that in our industry: are we just creating more buzzwords? So I want to break that down and be realistic about what we are solving and how much of this is new. Are we ingesting data or are we serving data? There may be some technical differences there, or are we just wordsmithing? It's a great idea that we're serving things to our customers and treating data as a product. That is true, and to be fair, I think that's a bit of what she's saying: we need to think of data in a different way, as a first-order product. We put so much care into our products and our customers, and we need to think of data that same way. That I get, and that I agree with. I agree that we should have a delightful experience with a lower cognitive load.
And part of it, again, and I'm not attacking so much, because she's super smart, is that people take these definitions and create their own, so I'm just trying to really understand what we are talking about when we're talking about a new pattern, what it solves, and what some of the risks might be. I might have gone off on my own slight tangent here, but I felt I had to say that, because as we go through this, what we are solving and what words we are using is super, super important. So I do have a rant: let's keep it simple. My nerd joke here is "eschew obfuscation," which basically is a really fancy way of saying, please don't use big words to overcomplicate simple terms. If we can use a simple word that we all understand, please keep doing that. I saw this on a bumper sticker a few years ago near where I live, where I guess there are a lot of nerdy people, and, at least in my mind, it cracked me up. But I feel like we're doing a little bit of this. Are we obfuscating simple terms? So let's eschew obfuscation, please. That said, there are some standard principles, and, full disclosure, I'm using a lot of her pictures and really trying to go back to her core principles, so that I don't mix them with other people's versions. There are other versions of mesh, and some people mix mesh and fabric and all that, so for this presentation I thought, let's go back to the four core principles that she came up with. The main objective is to create a foundation to get value from data at scale, which, as the previous speakers mentioned, is a problem, since we're getting more and more data. The first of the four core principles is domain ownership: folks who are in a particular domain are the best people to manage it, curate it, and treat it like a product.
The second is that data is a product and not a byproduct. Again, I agree with that: it shouldn't be something you do as an afterthought, it should be the thought. The third is providing a self-service data platform so that consumers of the domain's product can get to it easily and understandably, with the metadata and all the right things around it, because this is a nicely produced product, just like you would produce a product for your customer. You wouldn't do that in a sloppy way, and you want them coming back for more. All of that I definitely agree with. And the fourth is federated computational governance, so you have that distributed approach to managing the data, and you can see her graphic on the right, which explains that a little bit more. So we'll go through some of these. Again, this idea of domains is core to a lot of what we're talking about with mesh, and it's part of what a lot of folks are pushing back on a little bit, so are we being fair, or are we overthinking? The primary principle is that data, particularly analytical data, and she does put a big focus on analytical data, is organized and managed in domains. One of the summaries she gave was that a domain is the stuff that runs your business. We can all relate to that, and you can see things on the right like marketing, sales, HR. Some of her examples use something like a podcast company, so podcasts and media and things. Whatever runs your company, in this methodology those are your domains. So a domain could be an organizational unit, a business function, or a logical domain like customer, which is maybe something that's on your conceptual data model. I'll go into that more. I find that a little bit loose, but I'm a big nerd, so take that as it is.
This is kind of a loose definition of a grouping of things where you have an owner. Each domain would have formal data ownership, and these owners are the producers of data, held responsible for publishing their data areas for others in that self-service way, which includes things like metadata. So you don't just put out the data and say good luck to you; you have to have that metadata and publish it, wrapped in a nice bow, for other people as well. And again, I do feel that's a bit of a loosely defined concept. When I think of these things, marketing might be a department and finance might be a department, selling might be a process, and products might be more of a data area, or a noun. So I think what's more important in this methodology is that these are the areas of the business that have an owner around them. The other big area, if you see the quote at the bottom, gets into not only the idea that words mean different things, but the idea that instead of flowing the data into one big, centrally owned data lake that is your data platform, it's more of a service model where the domains host and serve their data sets and publish them out to other people in a consumable way. So don't build the one big monolithic application. Bring the data closer to its source, so that the people who know the data are the ones publishing it, whether that's marketing or selling or distribution or any of the examples that are given. So that's the core of owning the domain and then being, in a way, a product owner of that domain.
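To make the domain-ownership idea concrete, here is a minimal sketch in Python. It is not taken from the talk or from any data mesh tooling: the `DataProduct` and `MeshCatalog` names, the fields, and the sample values are all hypothetical, chosen only to show a domain publishing a data set together with its owner and metadata rather than just putting out the data and saying good luck.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A data set published by a domain, bundled with its metadata (hypothetical shape)."""
    name: str
    domain: str             # e.g. "marketing"; illustrative domain name
    owner: str              # the accountable producer inside that domain
    description: str        # metadata a consumer needs to understand the data
    schema: dict[str, str]  # column name -> type, as plain strings
    rows: list[dict] = field(default_factory=list)


class MeshCatalog:
    """Hypothetical self-serve catalog: domains publish, consumers discover."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        # Refuse data that arrives without an owner or without metadata.
        if not product.owner or not product.description or not product.schema:
            raise ValueError(f"{product.name}: owner, description and schema are required")
        self._products[f"{product.domain}.{product.name}"] = product

    def find(self, domain: str) -> list[str]:
        # List the keys of every product a given domain has published.
        return sorted(key for key, p in self._products.items() if p.domain == domain)

    def get(self, key: str) -> DataProduct:
        return self._products[key]


# The marketing domain hosts and serves its own data set.
catalog = MeshCatalog()
catalog.publish(
    DataProduct(
        name="campaigns",
        domain="marketing",
        owner="marketing-data-team",
        description="One row per marketing campaign",
        schema={"campaign_id": "string", "budget": "decimal"},
        rows=[{"campaign_id": "c-1", "budget": "1000.00"}],
    )
)
print(catalog.find("marketing"))
```

The one design choice worth noting is that `publish` rejects a product with no owner or metadata, which is one simple way to encode the rule that ownership and metadata travel with the data.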
So it is a data product, and, again in data mesh words, it's an architecture... I can't talk today... an architectural quantum: the smallest piece of an architecture that can be independently deployed, in this hub-and-spoke sort of model. In her mind it isn't just the data; it's the metadata, the infrastructure, and some code to access it, plus one of those polyglot output data ports, so maybe the API through which someone can get the data. It's not enough to produce the data; you also need to make it consumable and publishable. But the idea of the data product is that it's the smallest thing you can publish out. One of the nice things about thinking of data as a product, and I've used this analogy myself, is that just like there's a product life cycle, there's a data life cycle. And that's not just when data is created and destroyed, but how we think about it starting from a data request. Why are we doing it? What's the value of this data? How do I manage it? How do I make it consumable and something that people want to use? Again, it's that idea of ownership, of treating it just like you would treat your product: making sure it is high quality, trusted, and meaningful, so that people want to come back for more, hopefully without sounding corny. I think that aspect does make sense. We all say data is an asset, and yes, you manage financial assets. This takes it one step further: even a financial company sells products, financial products. Thinking in terms of product and customer is a nice way to put the onus on the producer and the manager of that data to really publish it out in the right way. The other idea is this idea of self-serve.
And again, it's not enough just to build the data and assume that if you build it, they will come. You want to hide the complexity. If you're doing typical self-service, say with Power BI, you probably understand the idea, but this is a bit different from that: it's hiding the complexity in a consumable way, almost like a digital storefront that people can self-serve from, and whether it's code or infrastructure, the platform needs to provide all of those pieces to make the data consumable. The other idea of data mesh is that it is a federated approach. Again, this isn't one big monolithic data set, although, and we'll get to that, she does say there's a place for that, in that a data lake or data warehouse could be a node on the mesh. So it's not that you never build a warehouse, but the warehouse is a product that may be consumed by other people, just like finance may be a domain that's publishing out finance data. The paradigm here is not just a technical thing; it's also a culture change and a data governance paradigm. So, on the issues that data mesh was trying to solve, and I have a bit of an issue with this because I don't know if the core fundamentals are right: the claim is that traditional data governance is too centralized and that it stifles innovation and change. Again, all those nasty words from the beginning: it's stiff, everyone's unhappy, and nothing gets done.
That's not a fair definition of governance. I think most governance programs take many approaches, whether governance aligns with a business process or with the organization, or there's a federated approach where some things are centralized. I guess that's part of my issue: you're not setting up a correct core assumption about the thing you're criticizing. I do a lot of governance, as do a lot of people at Dataversity, and you don't want your data governance to be stifling and too centralized and too stiff, but that wouldn't be a core definition of what data governance is. That said, I don't want to be unfair to data mesh either. Again, I tried to go back to the source, and most things I've published here are from Zhamak, because I've heard a lot of criticism of mesh and I want to make sure I'm saying what she said. I don't think she said no global governance and no standards, because that's not going to work either. It's putting the ownership in the domain, where it makes sense, and then there are these ideas of global governance, open standards, and interoperability. You can see some of the quotes here on the right: there is some effective correlation across domains, and there are some other things that go across these polyglot domain data sets. My favorite word is this idea of identifying polysemes across different domains. For us, that's a word with different meanings: what do we mean by customer, or by region? We've been doing that for a long time, but it is a great word.
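The polyseme idea can be sketched as a small cross-domain check. This is an illustrative Python example only, not anything from the talk or from a governance product: the `glossaries` dictionary, its domain names, and its definitions are made up, and `find_polysemes` is a hypothetical helper that flags any term defined differently by different domains.

```python
from collections import defaultdict

# Hypothetical per-domain business glossaries; every definition here is invented.
glossaries: dict[str, dict[str, str]] = {
    "sales": {
        "customer": "an account with at least one closed order",
        "region": "a sales territory",
    },
    "marketing": {
        "customer": "anyone who has opted in to a campaign",
        "region": "a sales territory",
    },
    "support": {
        "customer": "a contact with an active support contract",
    },
}


def find_polysemes(glossaries: dict[str, dict[str, str]]) -> dict[str, dict[str, list[str]]]:
    """Return terms with more than one distinct definition across domains.

    The result maps term -> definition -> list of domains using that definition.
    """
    definitions: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for domain, terms in glossaries.items():
        for term, definition in terms.items():
            # Normalize case and whitespace so trivially different spellings match.
            definitions[term.lower()][definition.strip().lower()].append(domain)
    return {term: dict(by_def) for term, by_def in definitions.items() if len(by_def) > 1}


polysemes = find_polysemes(glossaries)
for term, by_definition in polysemes.items():
    print(term)
    for definition, domains in by_definition.items():
        print(f"  {', '.join(domains)}: {definition}")
```

In this made-up data, "customer" is flagged because three domains define it three ways, while "region" is not, since sales and marketing agree. Deciding which flagged terms need one agreed definition is the cross-functional governance work described above; the code only surfaces the candidates.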
But that is some of the core of what governance does: how do we understand the same term? And on some of the criticisms of mesh, and I'm happy to discuss this, she doesn't say do none of it. She does say there are things that are managed across domains, but then you get into a bit of a circle, because that is that global governance. So let's not wordsmith: some things are domain owned, and some things fall under more of a global governance. And I did call out a quote there, and these are all quotes from her writing: where does the data lake or your data warehouse fit? It's a node on the mesh. So it's not that you never, ever build one; it just isn't the only solution, and I'd certainly agree with that. There are so many more patterns out there as well. But I do have some issues with data mesh as I have defined it, or from some of the writings that I've seen from her. One is the criticism that centralized data governance, or just data governance, is where everything is rigid and everything goes through a central committee, or, as some of the things I heard put it, everything is centralized and that slows everything down. That's just not what governance is. Certain things need to be agreed centrally. How do we define total sales for the company, the number we send out in the annual report? We've kind of got to agree on that; not everyone gets to define it on their own, because that's a cross-functional thing. Or how do we define patient data that goes across all the domains of a hospital? And if you say patient is a domain, then isn't that a kind of centralized governance? So starting out by saying data governance is rigid and everything goes through a central committee just isn't, I think, a fair assumption about governance. There are many models.
Some things are global and some things are local, and you have to make that decision in governance about where each makes sense. You don't want to under-govern, and you don't want to over-govern. I think there's a fear with data mesh of under-governing some things. Are we creating more silos? I'm not the only one saying this, and we could be being unfair, but are we creating more silos with these domain-based approaches, and where does the collaboration go? Because that is some of the hard stuff you need to do to get the data right. And by just saying everyone manages their own thing, is that the other extreme? That's the question there. The next issue is this idea of data domains. Maybe criticize me, I'm a bit of a purist here, but I'm a purist for a reason. When I say a data domain, what do we mean? Is it a data domain like customer or product, something you'd see in the data model? Is it a process, is it an organization, is it a business capability? I think it matters, because it matters to governance and to how it fits in with the organization, and to what it means if I am the data owner of HR. If I'm the data owner of sales, who owns customer? What about marketing, or what if they're the domain owner of sales? Maybe those are some of these areas that she called, I almost forgot the word, polysemes, that go across domains. But again, that's cross-functional data governance, so that kind of breaks some of the idea of owning a product.
So some things make sense. If I'm trying to report, I don't know, total marketing campaigns per customer, maybe that's certainly marketing. If it's total sales by customer, that's certainly sales. So again, it's a both/and, I guess. I think folks get attention by taking an extreme on either side, and maybe I'm doing that myself. But I like to be very clear on what we mean by a data domain, because that also feeds into your governance model. Am I owning a data area? If I'm owning customer, I'd better get cross-functional input from sales, marketing, support, all the areas that touch a customer. But if I'm org-centric, then I guess I would own everything around finance or HR, and that has a different set of rules around it. So it's not that I object to domains or to that concept of localized ownership. I just like to be a little clearer about which of these I mean, because it has an effect on how you govern and on what kind of data we're even talking about.
The next one I have an issue with is this idea of an architectural quantum, because to me that's the lowest-level architectural item. I would argue that if data is the product, data itself is the architectural quantum. I realize I'm being totally nerdy with these words, but I mean your actual data objects, a customer or a product, or even the critical data elements within a data object, like customer email or product number, and managing those. So if we need to change a product number, who makes that decision? Is it only the product management team? Does sales have a say? Does the shipping company have a say? So to me, a domain is certainly not the architectural quantum.
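One way to picture the point about critical data elements is a small stewardship registry in which each element, rather than a whole domain, carries its own owner and the other teams that get a say in changes. What follows is only a minimal Python sketch: the `DataElement` class, the `REGISTRY` contents, and the team names are hypothetical illustrations, not something taken from the talk or from any data mesh tooling.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """A critical data element treated as its own unit of governance."""
    name: str                            # e.g. "product.product_number"
    owner: str                           # team accountable for the definition
    consulted: frozenset = frozenset()   # teams that get a say in changes


# Hypothetical registry: element names and teams are invented for illustration.
REGISTRY = {
    element.name: element
    for element in [
        DataElement("product.product_number", "product_management",
                    frozenset({"sales", "shipping"})),
        DataElement("customer.email", "marketing",
                    frozenset({"sales", "support"})),
        DataElement("finance.cost_center", "finance"),
    ]
}


def approvers_for_change(element_name: str) -> list[str]:
    """Return the owner first, then the consulted teams in alphabetical order."""
    element = REGISTRY[element_name]
    return [element.owner, *sorted(element.consulted)]


def is_cross_functional(element_name: str) -> bool:
    """True when teams other than the owner must be involved in a change."""
    return bool(REGISTRY[element_name].consulted)
```

In this sketch, asking who approves a change to `product.product_number` returns product management plus sales and shipping, while `finance.cost_center` stays purely local to finance. The design choice being illustrated is that ownership is recorded per data element, so a mix of local and cross-functional governance falls out of the data rather than from the domain boundary.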
And I think that's where some of these issues come in, where these cross-functional data areas kind of break down, because this is some of the hard stuff you really need to focus on. And again, a lot of the mesh work has an analytical focus, and a lot of the use cases are about publishing out for analytics and that sort of thing. Especially when we're talking about data governance, true ownership and accountability, I think, come into the operational data: who's creating it, who's managing it, and who is accountable for the data quality. You want to be really clear that those are different things, though maybe related, and focus as much on the operational data. That's why, when I look at a data domain, I also look at process. Who owns the process around entering patient data, and who touches the data at each stage of the process? Or product data? I think having that view across is important. And yeah, that's kind of an enterprise architecture focus, which maybe folks consider old-fashioned. I consider it fundamental, because it really gets down to understanding the business. And yes, absolutely, you don't just have one big committee that manages everything. Beyond that, there's a whole lot of nuance in how you get that right. If it is federated, federated takes some thought. How does the company run? Is it by business process, and we need to have owners for that? Is it really about business capability? Is it by data domain, where someone really can own product data and publish it out, or pieces of product data? Thinking that through is complex, and I just thought there was some loose thinking around some of these concepts.
The other area is the socio-technical approach, to use the mesh words, and you can tell I'm bitter about this. It wasn't just one webinar I was listening to or one paper that she'd written; I was really trying to be sure of this.
I've heard a lot of these words: enterprise data management is full of friction, and everyone is unhappy. I'm not unhappy, and I do enterprise data management. But here's where I push back on that: friction is a good thing. It takes friction to light a fire, to use my own corny line. If you're not having these hard cross-functional conversations to resolve these polysemes or architectural problems, you're simply avoiding the hard issues. And I'm quoting another DATAVERSITY speaker, Karen Lopez: data is a representation of the real world; if you want your data to be simple, make the world simple and get back to me. Is it too easy? I'm open to discussion, that's why we're here, and I will leave time for Q&A. But you have to ask, are we recreating the silos? It would be really nice if finance owned finance and sales owned sales and marketing owned marketing, but then where does customer fit? How do we define a customer? And yes, we sound crazy, and it takes a long time to define that, but that's because it's a core quantum of our business. That friction is there for a reason, because people are using the data differently, and if you don't have those conversations, you're just going to find these problems later. So yeah, friction is a good thing. If you're in a company and no one ever questions anyone in a meeting, does that mean the company is working really well, or is it really hiding things? It's like a couple that never fights or never disagrees in a relationship. Is that healthy, or are things being avoided? Maybe everyone really does agree on everything, but I think that's pretty rare. So I think friction is a good thing, and data is hard. You don't want to hide that; you want to promote it and call it out. That doesn't mean one committee makes every decision, but at the same time you don't want to over-federate to avoid friction either. That's the art and science of data governance.
So we in our practice spend a lot of time on data governance. I'm on my own little soapbox here, but there are a lot of ways to manage data governance, if we're talking about the governance aspect of mesh. Sometimes it's a process owner, who's looking at data from the order-to-cash perspective, or customer onboarding versus customer payment versus customer support. All these different areas touch data in a different way. A great example we just had was a big manufacturing company that had a lot of issues with their data. Payment terms, for example, would be changed and they weren't sure where, so the same customer had 60-day payment terms in one place and 30-day payment terms in another, and you can imagine trying to forecast anything. That was a huge problem. We mapped out the business process, and the people in the different swim lanes, and maybe those are your domains, said: well, you change it in this part of the process, I change it downstream here, I didn't know you were doing it. It really was just the process steps being in conflict, so we fixed a data problem by looking at process. Sometimes it really is system-centric, the second one there. I'm not a fan of that, I don't think it should be technology-centric, but sometimes something like PeopleSoft is so embedded in your organization that you need it, or these might be your technical data stores. Sometimes it is by who owns customer or supplier, or maybe something smaller like location or regions or, I don't know, sales hierarchies; maybe that is owned by sales. Sometimes it is that data-domain-centric approach. Or it is by org: who's looking at sales, who's looking at EMEA sales versus US sales versus non-EMEA sales. That might be an org-centric approach.
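The payment-terms example above lends itself to a small illustration of how that kind of conflict can be surfaced once the process steps are mapped. This is a rough Python sketch only: the record layout (`customer_id`, `process_step`, `payment_terms_days`) and the sample values are assumptions made up for the example, not the manufacturer's actual data.

```python
from collections import defaultdict


def find_conflicting_terms(records):
    """Return customers whose payment terms differ between process steps.

    The result maps each conflicting customer to a dict of
    process step -> payment terms in days.
    """
    terms_by_customer = defaultdict(dict)
    for rec in records:
        terms_by_customer[rec["customer_id"]][rec["process_step"]] = rec["payment_terms_days"]

    # Keep only customers that carry more than one distinct value.
    return {
        customer: steps
        for customer, steps in terms_by_customer.items()
        if len(set(steps.values())) > 1
    }


# Hypothetical sample: the same customer edited in two swim lanes.
records = [
    {"customer_id": "C001", "process_step": "order_entry", "payment_terms_days": 30},
    {"customer_id": "C001", "process_step": "invoicing", "payment_terms_days": 60},
    {"customer_id": "C002", "process_step": "order_entry", "payment_terms_days": 30},
    {"customer_id": "C002", "process_step": "invoicing", "payment_terms_days": 30},
]

conflicts = find_conflicting_terms(records)
print(conflicts)
```

Here only `C001` is reported, because its terms differ between order entry and invoicing. The check itself is trivial; the hard part, as described above, is knowing which process steps touch the value in the first place, and that comes from mapping the process rather than from the code.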
And then I also like to look at capabilities, or business capabilities: what are the core capabilities that run our company, whether supply chain, finance, or sales, which is different from the names of the orgs or how they might be organized. In real life it's generally a combination of all of these: who looks at customer across supply chain, across order to cash, all of that. That's what makes it tricky, and that is part of the friction that makes it hard. And I'm just throwing this out there; this is not the only way either. But this idea of having domains of data that aren't centralized, I'm going to show a more foundational, traditional model that can also solve that. It doesn't need to be a quote "mesh"; it can be the good old-fashioned way. You don't have to hate the enterprise data warehouse. You can have this centralized piece, and maybe this is one of the nodes on the mesh. Say I'm a university. What's the total number of enrolled students? We kind of need that, because it gets published to U.S. News & World Report and whatever, so not everyone can decide that for themselves. That is maybe where your centralized committee comes in. But you don't tell graduate studies, in their mart, how to look at the total number of graduate students. Or maybe for faculty research you want to know the number of publications per student by faculty. Or maybe you don't even want it in a relational database; maybe it's more of a time-series data set for institutional research on what success factors help students over time. So not everything is a relational database or even dimensional. It could be flattened out, it could be a graph, it could be a lot of things.
This idea that we have some centralized analysis and some localized analysis is a thing that's been around. But I think what's also often missed in mesh is this idea, towards the bottom here, of these core quanta. I would say these are your core architectural quantum things: your master data, or some of them at least, and your reference data. You have student data, you have location data, you have faculty, and you manage those as first-order things. They're your reusable components, whether it's in your enterprise data warehouse, your data mart, or the source systems. So somebody, or a team, or a domain of people, is looking at student, and can push back to the source systems so that when I'm in my registration system it's the same student across the board. That's hard. There's a whole lot of friction in doing that, but that really gets to data quality, and there's data governance and accountability across all of that and within all of that. So of course the faculty research team is going to look at their faculty research. They own that; they're the domain owners of that. They're not the domain owners of the total number of enrolled students for U.S. News & World Report. And no one can tell institutional research how to format their data for longitudinal analysis; that's their decision. So I think by nature, yes, there is localized ownership, there's some centralized ownership, but there's also a core architectural quantum that always gets left out of the conversation, which is the data itself. How do we manage student? What do we call a location? Is that a region? Is that a campus? All of those are friction words, and we need that friction to agree on what we mean by location.
The facilities person thinks a location is a building, the enterprise people who want to know how many foreign students there are think it's a country, and graduate studies thinks it's a campus, and we're all just using the same word for different things. That's what you want to avoid. So this idea of federated isn't new, and this may well be a federated approach, with localized knowledge and centralized knowledge living together. That I'm fine with, but you still have certain things that you need to manage as that core. Thinking of data as a product, that I agree with, but I think it's at a lower level. It's either at the measure level, who owns the measure, or who owns the core data itself, et cetera, or it could be streaming data and other things. Another example is public open data sets. This is a great example of data as a product, with people owning it, publishing it, and treating it as a product. The Department of Agriculture in the US has the Feed Grains Database. I love the pun: what's the grain of the grain database? Barley. All right, nerd joke. Anyway, who am I to say how many grains were produced each year, by corn, by barley? They own it, they publish it, they produce it, and they provide a seamless product for people to consume in different ways. It could be a node on one of your hubs; external data can be one too. So again, this idea is not a bad idea. There are other ways to look at it as well. I think treating data as a product is a great idea. It just doesn't always need to be mesh. All right, I'm on my final rant before I go. I do get sort of bitter about folks saying, well, data warehousing and enterprise data management are hard and they've failed so many times over the last 20 years, and then this new thing has none of these failures.
Well, Michael Phelps, the US gold medalist, has lost more swimming races than I have. I, Donna Burbank, have never lost a swimming race. I've also never competed in one. So the fact that data warehousing and enterprise data management have been around for 20 years says something. Yes, there have been many failures, and there have been many successes. 90% of all businesses that start up fail. Does that mean we don't ever start businesses, or that starting a business is hard? I don't think we necessarily want to get away from the idea of starting a business. So yes, I'm coming full circle and venting a little bit. But it's not just mesh; a lot of folks do that. Well, we don't need a warehouse anymore, we have real-time data streaming. That's a completely different use case. Real-time data streaming is great for the right use case; it doesn't mean you have to smash what was there before. With that final rant, I do want to stress that yes, data mesh does have some core principles: domain ownership, data as a product, this idea of self-service, and not only federated computational governance but also federated ownership and governance, which is a big part of mesh. Again, it's a socio-technical approach; it is more than a technology. The technology side is maybe more of a data fabric. It is this idea of thinking differently about data, promoting data as a first-order thing, and getting these data domains on the mesh to work together. So there are definitely some good pieces in that. I do think there's some misalignment with some of the core data management principles, like data governance best practices. Let's at least have a common idea of what we mean by governance, and it isn't one big thing.
Things like master data management, I think, should be part of the conversation on data as truly a first-order product: the product is product data, or customer data, or region data. And I do think you need a certain amount of enterprise focus to create that cross-functional friction. Not everything should be friction. If it's truly your data, you should be able to own it, totally agreed. But don't forget that some of that friction is good, because those are the hard parts of the cross-functional data integration that we really need to run the business. So with that, I did leave time for questions. I want to open it up to Shannon. While we do that, just a little call-out to next month's webinar, and a reminder that if you need help, we do this for a living. So over to you, Shannon.
Donna, thank you so much, and I'll invite Shane and Jeff to join us in this Q&A. Just diving right in here, lots of great questions coming in around this. So isn't this similar to the quote-unquote Spotify model of agile, where organization is done using domain, squad, product area, et cetera, and that domain, squad, or product area is responsible for the data end to end?
I think that's a fair assumption. In fact, one of the examples she uses is a Spotify-type company. And that may make sense: you are the owner of this domain, you're the product owner of this domain. She does use some agile-type terminology like product owner. So yeah, in my mind that's a good analogy, but I'm open to what Jeff and Shane think as well.
Yeah, I'll just add, this is Shane. I think the Spotify organization model of aligning cross-functional teams against domains and objectives is almost a necessary operating model to support something like a data mesh, because you need to have the full-stack data talent embedded in the different business domains.
And so I think an org model like that is definitely supportive of this approach and probably a necessary enabler.
I'm going to say yes.
Is it true that data mesh hasn't any guidelines or techniques for achieving semantic interoperability?
Yes. And let me go back, because I always mess up that word. Again, I don't want to misstate her either. There is a bit of both. There's metadata, which gets to some of the JSON conversation that Jeff wanted to have, so there's a format for how we can have common platforms. And there are these polysemes, there you go, there's a great word for your next party. There are some common terms across domains in there, and she does have a place for this global governance and standards to enable interoperability. I don't see that being drilled into as much as I'd like, because I think that's the friction. But it's not avoided; it's not like it isn't there at all. So there is a place for it, and this is how I saw it described, but I'm open to Jeff and Shane.
Okay, well, I was going to say I think we're still a long way out. But the suggestion I've been talking about with many of the analyst peers, whom you know as well, is JSON as a vehicle to be this arbiter of the data payload. You could say this customer record in this system is called customer ID, but in this other system it's called ID. It's that whole notion of having something flexible enough that you can keep modifying it to keep up with whatever new systems you're adding, or whatever new analyses or other tools you're trying to talk to. I think there's some room there for something like JSON as a structure for it, but I still haven't seen folks tackle this to that big a degree yet.
Yeah. Shane, your thoughts?
Just that I'm seeing two flavors of data mesh across customers.
One of them tends to be where they're making the core data product unit essentially a table or a set of tables in the data warehouse, and they're working through that semantic interoperability there. Data products have to be approved in order to be part of their mesh framework, so they're putting them through a set of standards around where particular fields need to be interoperable. And I've seen another end of the spectrum where people are defining more holistic data products, more like the microservices movement, where they're essentially building data product containers and making those containers portable and interoperable. But that feels like the harder path to take right now, and one where there hasn't been as much of both thought leadership and technical progress.
Perfect. So what if an organization over the past years has established their data management capabilities, and even their organization, according to the DAMA DMBOK? Does data mesh fit into DAMA thinking, or would it mean a paradigm shift?
I'm going to channel Zhamak, and I'm sure she would say paradigm shift, because she loves that phrase. I'm teasing; I don't even know her. It's a paradigm shift, it's socio-technical; she has a lot of great words. So I think part of it is a paradigm shift of thinking of data as a product. Again, I'm a bit skeptical, but you can say master data management, I guess, by some of the definitions, is a data product, or tries to be. So for folks on the DAMA DMBOK model, I think there's some mapping to the concepts. I think even Zhamak said that we're renaming some of the things we've been doing for a while; it's just more of a new way of thinking. So maybe if you do have too rigid a data governance model, a more federated data governance approach is something to look at. So yeah, I think a lot of this is kind of...
Here we go, meshing together things that we've used before. But I do think, yeah, there are some new approaches that still need to be tested, I would say, back to my Michael Phelps analogy. So I'm interested in what Jeff and Shane have to say.
I think you're right. It is a paradigm shift, and we're still looking at a variety of different paradigms, I think.
Yeah, I'm not familiar with them, but it sounds like a paradigm shift. I'll agree. Plus one.
Correct. So, do you have some examples of technologies used to achieve a successful implementation of data mesh?
I'm going to pass to the product guys first on this one, if you guys want to take the first go at it.
Let Shane go first, because my observation is it's still an amalgam of all kinds of different stuff right now, even when I'm looking at my partners in the cloud and things like that. There are still so many moving parts in this whole thing.
Yeah, I'd say typically when I'm seeing customers go down this path, the foundation is the cloud data warehouse: Snowflake, BigQuery, Databricks, et cetera. On top of that, obviously, you have the transformation, ingestion, and scheduling tools like Fivetran, dbt, and Airflow. But typically the layer they're adding to enable data mesh is, first, the catalog, so something like Alation or Atlan, which are some of the common ones, or one called data.world; these newer catalogs that enable data discovery across a wider organization. And then, I'd say, observability tooling. Given that the principles around a data product are discovery, trust, and interoperability, it's often the catalog and the data observability tooling that they're using to build out that standard for data products.
Great. No, I mean, I think that's it for me. I think some of the nebulousness is what makes it hard; I think we need to agree on what some of these core terms mean.
I mean, we didn't talk about data fabric, but I think data virtualization and those kinds of technologies are also a component of this, where the data isn't actually even stored in one place but you can tie into these different products to get the full view.
All right. Well, we've got four minutes left. I think we have time for at least one more question. So how granular should a data product be? For example, for cross-domain address products, should those be considered products in a supply chain?
Cross-domain address. Well, part of my issue, the little mini rant I had there, is that I think the concept of domain is too loosely defined. In many of the examples, a domain was an entire business area, like finance was a domain. And in some, customer is a domain, which to me feels like master data. That's one of my issues with that architectural quantum. I'm sorry, I have problems with that word. The architectural quantum is too coarse-grained. If finance is the whole product, I can't even imagine what that means. That's the domain area. The single thing that's a product is maybe an order, or a customer, or a vendor. To me, that's how I think about it. Again, I'm not a data mesh person, I'm just channeling it here, but that's one of my issues. I think it's way too loose in terms of what that domain means. Opening it up.
I think that's correct. There may be a quote-unquote owner of your order or of your customer, but I think the granularity of where that domain exists is lower than the finance department. There's an order; finance would own an offshoot of that, like the invoice. But yeah, I think the terminology still needs some more refinement around what scope you're applying it to.
Yeah, I'd just say, unless you've got a very simplistic business, most people are defining some source-aligned domains, so that might be commerce or advertising or something like that, and then they're defining more cross-domain products and consumer-aligned domains. Even in my time at the Times, subscription data was a cross-domain product that was managed centrally and fed into a lot of downstream products. So I do see most teams still managing some central data products for the organization, even as they're pushing into these mesh-like frameworks. Very few times have I seen full decentralization of data product development.
And I think even in some of the early mesh examples, it was something very simple like a podcast company, where someone owns web clicks and someone owns podcast publication. It is hard to find a use case where there isn't a cross-functional element. They do exist. Maybe it isn't web traffic clicks, I don't know. But I think as it scales, it's a little harder to define something that's not cross-functional. All right, Shannon, do we have one more, or are we done? Can you tell us?
I think that's it. That's actually perfect timing. I know there are so many great questions coming in, but that is all the time we have for this webinar. Thank you so much to Couchbase and to Monte Carlo for sponsoring today's webinar and helping make these webinars happen. And thanks to all of our attendees who have been so engaged in everything, as always. I love all the chat and the questions that have gone on throughout the day. And just a reminder, I will send a follow-up email by end of day Monday for this webinar with links to the slides and the recording. Thanks, y'all. Hope you all have a very good day.
Great. Thank you.
Yep. Thanks, Shannon. Thanks, Donna.