Hello and welcome. My name is Mark Horseman and I am a Data Evangelist with Dataversity. We would like to thank you for joining today's Dataversity webinar, the Rules of Data Stewards. It is the latest installment in a monthly webinar series called Data Ed Online with Dr. Peter Aiken.
Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A section. If you would like to chat with us or with each other, we certainly encourage you to do so. Just note that the Zoom chat defaults to sending to just the panelists, but you may absolutely switch that to network with everyone. To open the Q&A or the chat panel, you may find icons for those features in the bottom middle of your screen. To answer the most commonly asked questions: as always, we will send a follow-up email to all registrants within two business days containing links to the slides, and yes, we are recording. We will also send a link to the recording of this session, as well as any additional information requested throughout the webinar.
Now, let me introduce to you our speaker for today, Dr. Peter Aiken. Dr. Peter Aiken is an acknowledged data management authority, an associate professor at Virginia Commonwealth University, president of DAMA International, and associate director of the MIT International Society of Chief Data Officers. For more than 40 years, Peter has learned from working with hundreds of data management practices in more than 30 countries. Among his 12 books, all of which I've read, are the first on making the case for data leadership, the first focusing on data monetization and modern strategic data thinking, and the first to objectively specify what it means to be data literate. International recognition has resulted from these and an intensive worldwide events schedule. Peter also hosts the longest running data management webinar series on dataversity.net.
Before Google, before data was big, and before data science, Peter founded several organizations that have helped more than 200 businesses leverage data. Specific savings have been measured at more than $1.5 billion. His latest venture is Anything Awesome. And with that, let me turn it over to Peter to get today's webinar started. Hello and welcome, my friend.
Thank you, Mark. It's always good to chat with you on this, and welcome, everybody. It's so good to be here with you on this wonderful sunny day on the east coast of the United States. It's always good to have good weather and, hopefully, a really fun topic for us. So let's jump in and talk about three areas today. One is the rationale, the motivation: why we need data stewardship as a role. For those of you that are graybeards like myself, you know it was not a role we had in the past. We subsumed all of these topics under something we called data administration and management. But it has clearly become important enough that we have a specialization: books written on the topic, people debating it, frameworks, maturity frameworks, and all the rest. We're going to keep things pretty basic today. We're going to talk about why we need stewards, what they are supposed to do, and then how to go through and assign stewards, and the goal should be to focus on tangible improvements in the organization.
Now, normally when I start these out, and particularly around events that Mark is associated with, I would typically ask the question: how many of you are starting versus restarting? The answer seems to be about 50/50. Data governance programs have historically had a hard time getting started and keeping focused, and I think I know the main reason for this. If you aren't familiar with me at all, I'm a musician as well, and one of the things we do at these events is have lots of fun musically on the side. The main reason is getting us all on the same sheet of music.
Now, I'm going to show you what I consider to be not a good way of describing a data steward. Certainly, Mark describes himself as an evangelist, and I fall into that category as well. But if I go in and say, "the one person at work you can't live without," you're going to get some pushback. The reason is that we have been working without data stewards for a very long time, so people just do not understand when you say this role is so important, or why. So we have to have some justification around it.
Let's start out with some definitions. When you look up the word steward, you find: someone who is employed to manage another's property, such as a large house or estate. That's a good starting point. Stewarding, then, would be the process of managing or looking after something. And therefore, for a data steward, we can see some good definitions here: manage data assets on behalf of stakeholders and in the best interest of the organization. That's a great definition by our colleague Danette. I'm not sure everybody will understand what that means, and that's why we need to dive in a little bit further: represent the interests of stakeholders, take an enterprise perspective, and have enough dedicated time to be accountable and responsible for it.
I'd like to emphasize two other qualities here, though. The first one is trust: it is a position of trust that we are putting somebody in, and that's because it involves a fiduciary relationship, that is, a relationship of trust between a trustee and a beneficiary. So let's look at the word steward again, just a little bit further. "One who actively directs" is what it says if you go to Merriam-Webster, which is one of the main dictionaries. We could expand that to say a data steward is one who actively directs the use of organizational data assets in support of specific mission objectives. This is really crucial.
Why do they show up, and what are we trying to do with them? Well, they are taking the data assets and helping to focus them in a way that helps the organization achieve its mission overall. Another idea around this is a component that comes out of one of IBM's training packages. You can see these are the new data and analytics roles, and sure enough we do have a data analytics steward and a citizen data steward listed. But notice that they are really not in that must-have category. So even the folks that provide guidance in this area are not yet agreed on what we're attempting to do with respect to all of these aspects.
It's really clear that there's a bunch of confusion. I'm going to put on my professor hat here and say that we're going to blame it on the professors of the world, the academics of the world. For years and years, some people have been trained to believe that data is a business problem, and the attitude from IT, stereotypically, has been, "If they can connect to the server, my job is done." Both Mark and I have heard this phrase many, many, many times over the years, so it's much more than just an imaginary thing. On the other hand, business looks at data, sees somebody with the title chief information officer, and thinks, "Who else would be taking care of my data?" As a result, data has fallen into this enormous chasm between business and IT, and it's our job as data leaders to help repair the damage and put things back the way they should be.
There is one other really important aspect of this, and I'd just like to get it out of the way first because it is so critical. I label this slide "just say no," and the reason I say no is that many organizations will start out by saying that data stewards, then, are data owners. Now, I quite frankly will not work with any organization that allows somebody to own data.
It is simply not correct to say that this data belongs to a department or to an individual or even to a process. It is absolutely not the way we should do things, and if organizations do it, they're introducing more problems than they are solving. It is disturbing, though, because at the same time we also want people to in fact be process owners. So, instead of being owners of data, if an organization insists on using the word ownership, I will try to restrict it to what is proper in this case: you can be the owner of the data requirements from the time the data hits your well-documented process until the time it leaves your well-documented process. After all, that's just a normal control that would be there, and it's a very reasonable thing to do. But in order to do that, you have to understand, and have a map that says, what the process you're governing actually consists of. Having no ability to govern the data within that process does not bode well for your process's success.
I was talking to Don Soulsby, a very good friend, this morning, and he reminded me of something that he has been advocating for a long time, and I'll include it here for you all. It concerns an organization that is going to be performing at a level three maturity, for example, which is something that many organizations aspire towards and, quite frankly, not many are achieving at this point. By the way, I should qualify my remarks here and say that at these events we're talking about people who are interested enough to attend a talk on data stewardship and attend a conference around data governance, such as we have coming up in the next couple of months.
That's really a subset of the organizations; many organizations have not even become aware of this. So this is a way in which you can achieve sustained competitive performance.
Back to the "just say no" piece, though. Instead, I like to introduce the term fiduciary, as I did on the last slide, and most importantly to speak of owning the data requirements, because the data requirements are the things that you can control. They are the most objective form of requirements, the most testable form of requirements that we can have. And if you still need more information around that, just ask the question: what data would the accounting function in your organization own? The answer is that they get their data from a lot of other places around the organization, so saying that they own that data would turn things completely on their head.
The result of the confusion that we have had over the last number of years, again not on purpose but certainly unconsciously so, is what we call data debt. Data debt is not very easy to visualize, but you can see that it slows progress, decreases quality, increases the local costs, and presents greater risks to the organization.
So I mentioned fiduciary here. Let's look specifically at what the fiduciary relationship is. A fiduciary is a person who is legally obligated to act in the best interest of a group, individual, or company, and this includes a whole number of people: lawyers, trustees, doctors, accountants, etc. Again, in my conversation with Don this morning, he reminded me that each of those categories up there also has certain letters that follow their names, and insurance and other things, that make them specifically qualified to be fiduciaries.
There are some duties that a fiduciary has. If you are engaging with a lawyer, the lawyer has a duty to act in good faith, a duty to take care of you as the client, and a duty of loyalty. There are financial, legal, and ethical standards that they must adhere to. More on this ethical component will come from a talk that my friend Karen Lopez and I will be giving in Orlando at Enterprise Data World, where we hope to see Mark and a lot of you at the end of this month.
One of the most important things to understand as a data governance professional, though, is that stewards are trying to prevent this situation. Now, obviously, it's labeled "The Princess and the Pea," and if you're familiar with the story, there's a deformity down here at the bottom, where I've circled the pea, and the princess as a result is sleepless. Well, this analogy translates to us. If there is a problem with the data or, importantly, with the data structures in your organization, you'll have problems that will carry on forever and ever, because organizations don't go back and fix the flaws that are in the data model. They instead program around them.
So failure to understand the role of data stewards locks in these imperfections for life, and we have to calculate how much it is going to cost us to fix them. It restricts the potential investments and leverage that you can get, and it accounts for 20 to 40% of all IT budgets: migrating, converting, and improving the data. As I said before, it means that bad data practices cause everything else to take longer, cost more, deliver less, and present greater data risk.
Let me give you a very specific example that came from our time during the pandemic, just a couple of years ago. This is an article in Forbes magazine that I'm referencing here. (I'm so sorry, I went one further on that.) The idea is that American Airlines and United Airlines were both valued by the marketplace at the respective values that you can see here, American Airlines at $6 billion and United at $9 billion, but the data in their frequent flyer programs, the AAdvantage program and the MileagePlus program, was valued at more than the company. Now, you had better believe that this matters to the CEOs of both of these airlines, and in fact of all airlines and most organizations: they should be trying to raise that value. From the American Airlines example, one might ask the question: why doesn't somebody just buy American Airlines for $6 billion, then sell it again for $6 billion and keep the data? It would seem to me that's a very profitable transaction.
We're going to talk about a couple of specific subtopics that stewards are related to. The first one is the role of strategy. I mentioned before that the goal of a data steward is to help the organization take these data assets and more easily or quickly achieve the mission objectives. That is the goal of strategy. And yet, when you look at how strategy is done, it tends to start out with an organizational strategy, then an IT strategy, and then a data strategy as a component of and complement to that.
I've sampled Morgan Freeman a lot because he says it best: "This is wrong." That is correct. The way it should be is like this: an organizational strategy should drive both the IT strategy and the data strategy. But I contend that in today's environment, the data strategy actually has more influence on the IT strategy than the IT strategy should have on the data.
Let's talk about strategy very briefly. We didn't tend to use the word strategy much before 1950, when the management consulting industry figured out that they could take a term that had previously been well understood only in the military and make a lot of money on it. I'm being a little bit facetious, I think, but nonetheless it is true. Many organizations spend lots of money and end up with thousands of PowerPoints and hundreds of pages for a master plan in order to do this. The problem with that is that management has learned over the years that strategy is a thing: it's the document that we have. The military term, which is where the term originated, was actually very different. The definition they use in the military is a pattern in a stream of decisions. And that means that strategy is more of a process than a thing.
Let me give you three brief examples. First of all, Walmart previously had a strategy that said "every day low price." Very simple, very well executed. If you're on the airplane headed to Bentonville, Arkansas, somebody will say that on the plane. Customers understand this. Vendors, suppliers, everybody, and of course all the employees, understand that every day low price is critical. Most importantly, when an employee at Walmart is making a decision, if they can defend their decision by saying it supports the strategy of providing every day low price, they're very rarely brought up on any sort of disciplinary action. For a second instance of strategy, we'll go to hockey.
So, any Canadians on? This one is for Wayne Gretzky, right? His strategy is that he skates to where he thinks the puck will be. There's a great follow-up on this: if you go to the Wikipedia article, he talks about conversations with his father. But think about it for just a minute. If I am chasing a very dense piece of plastic around the ice that travels much faster than I can skate, chasing it is not a good idea. So you need to go to where you think the puck will be, and that's how you become the greatest scorer in the world. That's this example here.
A third one is a little bit more involved: Napoleon facing a larger army at Waterloo. Some of you may have heard of this. The question comes up: how do I defeat the competition when their forces are bigger than mine? Now, let me illustrate. Napoleon's troops are the French corps in blue, shown against the arrayed British in red and Prussians in black. The answer, of course, is divide and conquer, and the strategy is still taught, at least in the U.S. Strategic Command, as an example of good strategy.
Now let's look at this a little bit further and see what divide and conquer means. One of the things that Napoleon observed, and the reason this is taught as a beautiful example of strategy, is that the British were supplied out of Ostend, whereas the Prussians were supplied out of Liège. You can see these two towns are on opposite sides of Belgium. The thought was that if the French troops could divide them, if they could hit them at exactly that spot, boom, they would fall back. The reason they would fall back is that when you are in fear, most people will tend to flee towards food and safety, and that is what Ostend and Liège were understood to be by the troops.
So if Napoleon could hit them at exactly the right spot, the troops would fall back. That is the divide. Then, of course, conquer, which means that now the French can all turn to the right and defeat the Prussian troops. Once we've gotten rid of all the Prussian troops, we can turn to the left and defeat all the British troops. Again, a great example of strategy. The only problem, if you know the answer to this, is that it didn't work.
Now, why perhaps didn't it work? Well, first of all, hit both of the armies in just the right spot. So, bam, got to hit them there, hit them hard, and then turn to the right and defeat the Prussians, and turn to the left and defeat the British. Oh, and by the way, do this while we are being shot at in a large live-ammunition situation. Is everybody okay with that being a complex strategy? And yes, that was the reason it was tough. Do you see now why adding lots and lots of pages and PowerPoints to your strategy is probably not a good idea?
Let me give you a very specific data governance environment. This was created for one of the states, and they were coming up with what is actually a fairly traditional approach. They've got a group that advises the governor, advises the CDO, oversees the council, and has lots of other duties. Once again, do you agree that this is a complex data governance environment? From a sustainability perspective, this one was not able to be sustained over the long run. We have 50 wonderful experiments here, where different states are looking at different approaches to data governance and different approaches to stewardship. Again, some states are doing better than others, and I would urge you to look at what your own state is doing, or your own government if you're not in the US. But let's keep in mind that strategy is a pattern in a stream of activities and, most importantly, it guides workgroup activities.
The next component that we want to dive into is governance and architecture. Corporate governance is something that everybody understands, whether you're in the private sector or the public sector. We do have governance that is important, and people have been looking at this topic. In fact, in a very interesting piece from just five years ago, Jamie Dimon said maximizing shareholder value can no longer be considered a company's main purpose, and there are 100 or more CEOs that have signed on to this strategy. So governance is clearly critical.
Then, of course, we went to IT governance and said, well, aligning IT strategy with business strategy is probably a really good idea. We should have some sort of measurable results around this, and we should figure out what the key questions are that we're trying to look at and then align around them. This is the way it's been done in IT: strategic alignment, value delivery, resource management, risk management, and performance measures.
So now I'm going to present you with seven definitions of data governance, and I don't want you to read them. I'm certainly not going to read them to you. I'm going to critique these, because I want you to imagine the standard elevator pitch. If you have never had an elevator pitch, the idea is that I get on the elevator to go up, and the boss comes in, looks over, reads my badge, and says, "Oh, Peter, you work for me. Can you tell me what this data governance is all about?" I contend that the CEO is not going to understand most of those definitions that you have over there. Data governance needs to be sold in a very explicit fashion. My definition of data governance is managing data with guidance. Now, that's a very helpful definition. I've had a lot of groups over the years adopt that particular definition.
I say to them that that's a good definition, but the higher you go up in the management chain, the more you need to alter it, just slightly. First of all, if data governance is managing data with guidance, that means that if we don't have data governance, we are not managing data with guidance. And that's probably not a good thing in most organizations' minds. But you also need to manage data decisions with guidance, and the idea of bringing on a new system without fully understanding the data implications can be very problematic for organizations.
Let's take a look at the idea of architecture and guidance as stewards look at it. There are going to be four types of individuals making up this piece: leadership, participants, experts (subject matter experts and others, which are just not really important here), and stewards. You can see some of the various assigned duties. Notice that it all sits on a basis, or framework, of IT and systems development, and that's absolutely crucial. Many organizations will draw a line around the left two boxes on this and say, "This is our data governance organization." Again, there are no right answers. There's only an answer that works for your organization.
But let's look at the roles given this type of framework. Leadership should be responsible for making sure resources are available and for incorporating data and feedback into their decision-making process. The results of decisions are then entrusted to the stewards to implement on behalf of leadership, in support of the data for the organization. There should be some action taken as a result. There should be some changes resulting from the stewards' actions. Again, more data, more feedback, and more ideas come in, and guidance around all of that will give everybody a better approach to what's happening.
Because architecture really is about three things. And again, the first thing of the three things is a thing. Architecture is about things.
It's about what those things do, and it's about how those things interact. If we understand those pieces, what we end up with is a common vocabulary expressing integrated requirements that make sure the data assets support the organizational strategy, and a primary form of collaboration among data governance professionals, and between governance professionals and technical and business professionals. We want to avoid the Tower of Babel syndrome and instead make sure that we have understanding, which is also a prerequisite to interoperability, so that our data and our data requirements are documented and articulated as this digital blueprint, sharing information, in this case, between business users, technical personnel, and systems people. Once again, this is absolutely impossible to do unless you have a trusted catalog, whether you call it a business glossary or a data dictionary or whatever you call it. I had one group call it a data bank.
From an architectural perspective, if you are in an organization, you already have a data architecture. Architectures are here; that's the way in which organizations function. And of course data architectures have to be here as well. They have to be well understood in order to be useful to organizations.
So now we have an idea of what data stewards' roles are. We can now talk about what they're supposed to do. There's a big gap between data and information. This was drawn on the board at many organizations that I've been to: it says we are way too dependent on our human beings, our wet layer, for transferring data into information. That's a big gap in organizations, and what we want is to have the stewards focused on a single, united purpose. So it's generally helpful to do this within the context of a data governance framework; I have some on my website.
There are many examples of data governance frameworks that we've used over the years, but a framework is a means of guiding analysis, organizing data around it, discerning priorities, assessing progress, and all sorts of other activities. Thank you, John Zachman, for of course the original work in this area. For example, if you're building a building, don't put up the walls until a foundation inspection has been passed. And then, after you get that, put the roof on as quickly as possible so that we can avoid inclement weather.
In doing research for this webinar, I found a really nice article on stewardship in general, which is, I think, very useful for what we're doing. It shows personal mastery, vision, and mentoring components; valuing diversity of opinions; coming up with a shared vision; understanding how much risk-taking and experimentation are to be tolerated by the organization; what sort of vulnerabilities we have; awareness raising; and finally delivering results. That gives us this framework for stewardship, and I find it very, very useful. For many organizations, the next phase after they form their stewards is to start doing training in these various areas.
If we take that for data stewardship, then I want to change it just a little bit and say that there are going to be some organizational data challenges. Whether they're anecdotal or whether you've done measuring to come up with them, it's still important to keep them in mind. Looking at those, you then have a decision to make: are these things strategic in nature, and should we be addressing them with the limited resources that we have now? Some of them will be put into a bucket, and we will address them later. But the ones that we decide we need to focus on now should be entered into the stewardship engine, and there are basically two flavors of this right now. One is regulation.
And certainly, if you're in a regulated environment, that's an area where you don't want to make a mistake. The other is general activities that will help us improve the way in which data can help us achieve our strategic objectives. Some of those activities will be reactive: something bad will happen and we'll have to fix it. The example that I use in other talks is that Nike had a little oopsie a couple of years ago, where Zion Williamson's sneakers burst on the first day he was trying to use them. Probably not the best advertisement for a Zion Williamson branded Nike shoe. That was certainly a reactive activity, but we should also do some proactive activities to start to address the mountains of data debt that exist in many organizations.
Then we can look at this in terms of monetary value and non-monetary value. If you're in the public sector in particular, that non-monetary value would be talked about in terms of mission instead of money. Over time, the value that you have will continue to increase, because this should be seen as the place of expertise. Where else in the organization are we going to find people who are going to be focusing on these kinds of challenges? The answer is, of course, that it's been happening everywhere in the organization, and that's probably not a good idea. It's like asking everybody to be aware of forest fires. Well, certainly we want to be aware of them, but most of our work is something we actually specialize in. So I like to say that the data stewards in an organization are the MacGyvers. If you don't remember this old TV show, you can find it on YouTube. MacGyver is the guy who can fix anything with data and duct tape. We also know that in the fire station we spend part of our time fighting fires, certainly, but we also spend time preventing fires by doing fire awareness and things like that.
Again, it used to be the case that in this country, believe it or not, we had to make some commercials, public sector commercials, to say, "Don't go to sleep in your bed while you're smoking a cigarette." I know that sounds like advice that you shouldn't need, but apparently we did need to get it at one point in time.
Let me also address full time versus part time, because oftentimes I'm offered 10 people who each get to give me one tenth of their time. Instead I say, "No, I want one individual full time, which is exactly the same as what you're offering me." There are two reasons for this. One, there's generally low data literacy among knowledge workers and management. Two, compared with asking 10 people, or even three people for a third of their time, you're almost always going to have better results from having an individual dedicated to this than from a person who is doing it off the side of their desk.
The reason for this illiteracy is quite obvious. Randy Bean and Tom Davenport have done a great job here. If you just click on that thing right there, you'll see the rest of their surveys; this is just a summary of them. The most recent results, from 2023, show that for the first time we now have the majority of organizations believing that they're driving innovation with data, but that less than half are competing on data and analytics or managing data as a business asset, less than a quarter are data-driven organizations, and less than a fifth are forging a data-driven culture.
But this is not the most important result from their repeated surveys. In 2018 the question was asked: which are your primary problems in data governance? The answer, as you can see here, was 80/20. People and process were a much larger component than technology problems, and yet we continue to invest in technology to the exclusion of people and process. Here are the numbers for 2019, 2020, 2021, 2022. Have I convinced you at this point that these organizations need to put more time and effort into this type of 
an activity? 80% of the challenges are people and process based, and where else in the organization can people go to find out more about solving people and process challenges? The answer is that data governance is the only resource that is dedicated to addressing these challenges, because poor data manifests itself in multifaceted organizational challenges. So root cause analysis is part of data stewardship. The challenge around data is that, because data is a central shared resource, it is almost always experienced by our business users and our customers through some sort of filter: a business process, an IT system, or a combination of both. Only the root cause analysis that we're looking at here can surface the idea that a single data problem at the center of this may in fact be the root cause, and that we need to really focus on consistent analysis and fighting it. Imagine if every workplace team tried to address these individually. Having a data team that is focused on this, a group of data stewards that gets better at fixing these problems, is definitely the preferred way to go, as opposed to, like I said, having it done off the side of your desk.
Let's take another look, this time at how these things evolve over time. Typically, organizations start with this sort of nascent feedback loop, and there's some data leadership that eventually says, "Okay, what do I do as a data leader? I need to start some data governance." They understand that this will fix problems, but it's kind of like being at the bottom of Niagara Falls and saying, "I'm going to fix the water quality here." It can get better over time, but boy, it's hard to do at the base of Niagara Falls if you're trying to fix water quality problems. So we also then discover that we need to do some things faster, and this is data that improves as a result of focus.
That's what we call active data governance: improving the feedback loop and starting to implement things, maybe with just one full-time data steward focusing on one aspect of the business, eventually growing out to the data community participants, as I illustrated earlier, and into the organization generally. One of the things that we have been good at in data, but not as good as we should be, is this: we really are good at saying data things happen, but we have not been as good at saying that when a data thing happens, something happens in the organization. That's something leadership can focus on to help make the return on investment much, much higher. Keep in mind also that ROI also means "risk of incarceration" for many organizations. So practicing putting dollar values on these things is what's going to happen. The purview around all of this is the idea of data governance, and the scope is very, very broad: looking at all of these activities and trying to figure out how we can make them move. Finally, assigning data stewards. We really need to look very carefully here. My friend and colleague David Plotkin has written a very good book on this. I encourage you to get it, but I also encourage you not to use it as a starting point. It's hard enough to explain to others outside what we need this role for. I have seen organizations go in wanting a five-year plan, wanting to know how many technical data stewards they are going to have versus project data stewards and all the rest of these things. I would not take those words out, but I would not share this with anybody else, because as soon as you end up with this many, you also need a steward auditor and a data steward manager to try and move these things through. This is way too much complexity. To go from zero to 100
in an instant is absolutely not helpful to organizations. This is why fully half of the data steward operations that I've experienced over the past 20 or so years have been problematic and challenged in a couple of different ways. Data decisions vary greatly, depending on where you are. This is just one hospital. If you wanted clinical domain-specific definitions, there was a department clinician that would do this. If you wanted the master file of IDs, there was an IT master data management steward or an SME. If you wanted to understand the lab codes, you had to go to the lab director. If you wanted the pharmacy codes, you had to go to the pharmacy director. If you wanted the order catalog, you had to go to a member of the IT staff. If you wanted diagnostic and procedure codes, you went to the director of health information systems. If you wanted to know about charging things, you went to the director of finance. You can see here why we are so frustrated with at least the US system: it is so many different pieces that are chopped up. Our goal is to unify across these things, to come up with places where we don't have to go ask a person but in fact have this information accessible, so that (a) all of these people can share that same information, and (b) all of the rest of the organizational associates can also gain access to it should they become interested. As soon as they get to a place where they can go and ask questions and get answers, they're going to come back and do it again. There's lots and lots of guidance around all of these things. What do stewards do within the organization? Well, again, our general role is to improve the value of our organizational data assets.
It is also to improve data's use in achieving organizational objectives. Too often organizations have simply improved the data and not really improved the way in which people use the data, and that creates less than desirable outcomes. I would love to get some evangelist activities in here. Again, you heard Mark introduce himself as an evangelist, and I certainly like to introduce myself that way as well. And finally, to ensure effective data management practices, but quite frankly these have not been well implemented by most organizations. Let's take another look at the knowledge, skills and abilities for a data steward. By the way, the drawing is courtesy of AI; I thought that was actually quite a nice drawing from one of the generators. And here's a whole bunch of ideas that we should have on this. This was from a project that we participated in, creating a series of governance activities for the state transportation system. The states have different membership organizations they are part of, and they created a grant to create some governance for them. This was, I think, a quite useful activity with very, very good results. If you Google that term you'll certainly find it out there; I believe it's in the public domain. If we look at organizations as data machines, which is a very good perspective to take in many cases, that means all of the inputs that go into the organization are data and all the outputs are data as well. And if we're not careful, we will have too little data governance. On the right-hand side, too little misses opportunities, and most of your organizations are probably in that category, where if you had more resources to devote to this, you'd be able to come up with better results. But if you have too much, it becomes expensive and it becomes bureaucratic, so part of the job of stewards is to figure out where the sweet spot is.
And luckily there's something called interoperability, which we can use as the primary determinant of value to help guide us from a strategic perspective. So we're going to start out with some of our data being understood, known, and some of it being not understood, unknown. Over time, we'd like that to change. Many organizations will get to a half-and-half sort of place. Again, I know that sounds kind of crazy, but we have a lot of neglect, a lot of data debt, that we need to eliminate in order to actually achieve results. When you get to this state you're actually pretty good at it and can pretty much discard some of the bottom parts of that data. It helps a lot if your data stewards understand what we call systems thinking. Systems thinking is best defined by saying that the only way to really understand a system is to understand each part in the context of the overall system. In other words, the forest and the trees at the same time. But what we end up with in many, many instances is organizations not really getting it. So let's just be very, very clear. I love having these diagrams. I think it's really important for organizations to actually have these as diagrams that they can use to explain this to others. Organizational strategy and data strategy are related: data strategy supports organizational strategy. It says, whatever you're trying to do strategically, you need some data in order to do it. And our job as data stewards is to help that process become faster, better, cheaper and less risky. A specific example of this, from a couple of years ago, was with an army, and the army was deploying people. That's kind of an important thing: we're taking soldiers and sending them to the battle. We found out that data was slowing down that process, and by improving data via some of the steward activities that we're talking about here, they were able to double deployment rates. And does management pay attention to that? Yes.
And when you come back and tell them data governance did this, that's where you get your brownie points, so that everybody understands that lack of data governance caused deployment rates to be half of what they should have been, or could have been, given that scenario. Then we take the data strategy, and data strategy guides data governance activities. Okay, now that we've determined how we can do more with our existing data in support of strategy, we can tell the data governance group: given the resources that you have, how can you best help create sustainable practices? In other words, what the data assets can do to better support strategy should be the focus of data governance, and the feedback into the data strategy box there is the question of how well that data strategy is working. And we also look at what the stewards do. What you see here is that I have limited resources; none of us have all the resources that we'd like to have. So what are the most effective uses of our investments in data stewards? Once again, full time instead of part time if it's at all possible, because you will have a multiplier effect, whereas you will have a degraded effect if you have them as fractional components. The best way to support strategy is to articulate it in support of specific business goals. You may use something like the SMART framework or other well-known strategic frameworks, but the point is specific business goals: increasing sales, decreasing returns. There are all kinds of things that go into this, and quite frankly you can see them in day-to-day life. I can tell you for certain that Amazon has made a new addition to the metadata that it maintains about all of its products: they've now started to flag items that have a high return rate.
And when there is a high return rate, they're putting a warning label up on the Amazon website that says, "Hey, a lot of people buy this and return it, and you know why, and it would be a whole lot easier if you just read a little more carefully," which is usually what the message is in that particular instance. Similarly, the reason that data governance efforts have gotten confused most often is that people are speaking in different languages. Having a common language, with metadata being the language of data governance, is a much better approach, and we want to take both of those and focus them on data stewards, to make sure the data stewards have specific, tangible business goals that they're trying to achieve. That way, when they go back into the business and try to influence the business to do better with its data, they'll have a much easier task, because they're speaking the language of the business. Similarly, everyone speaking the same language, again metadata, is a critically important piece. Just because it happened this week, I can't resist telling you the story. You may remember that our Supreme Court in the US issued a decision in Anderson versus Colorado. I think Anderson versus Colorado was the Supreme Court case, and they were in a hurry to get that decision out before the Super Tuesday primary that we had that week. So it came out on a Monday so that everybody could get the benefit of that decision. Usually a decision is issued as a PDF file. In this case they issued it as a Word doc, and the Word doc had enough metadata in it that people were able to see that it was not in fact a nine-to-nothing unanimous Supreme Court decision. In fact, it was a five-to-four decision, based on the metadata that came out of there. But they had glossed it over and agreed to support it as a nine-to-nothing decision.
We know the truth of this because we can see that there was actually a dissent written by Justice Sotomayor in there. Just Google "Supreme Court metadata Anderson Colorado" and you'll see the rest of the story. One last point on this chart, a very, very critical point, which I've repeated throughout the whole talk: the idea of a trusted catalog. For some reason, we have gotten out of the practice of even teaching students that CASE tools and other things that support this exist. Certainly there's lots and lots of good technology out there that you can use to support this, but if you do not have a trusted data catalog that everybody can gain access to, it is going to be a problem, and you're going to spend far more time and effort than you should to make this work going forward. Let's also talk about the limited resources that all organizations run into. Most people look at what they're doing and say, "How should we manage this data? How should we govern this data?" That really is not the right question. Instead the question should be, "Should we include this data item within the scope of our current stewardship practices?" We can't do everything at once; we cannot go through and fix everything. So we need a decision that says this will be more valuable and more useful than that. Again, it's a trade-off; architecture is always about trade-offs. It's very important that we don't simply say we're going to govern everything. This is one of the things that we see in organizations: if you've got five stewards and you need 5,000 of them, don't take the five and give them the entire organization chopped up into five bits. Put all five of them into one area and make some very significant changes and improvements there, so that somebody else says, "Wow, I really like what you did on that effort over there.
How can you do more?" And the answer is, "Give me more resources." If I invest 100 and it pays back 1,000, it should be a relatively easy decision for everybody in the organization to agree to. Either way, though, document why you decided to include that data item in the scope of your data governance activities or not, because all of these things depend on lots and lots and lots of data flying around. Here are a couple of examples from just one particular organization. You can imagine a query that runs 30 billion times a day. You might ask, "Well, okay, 3029 838 million 518,078 times a day this query runs, and you want to go and examine that query and optimize it?" And the answer is yes. If I take a quarter of a second and repeat that quarter-of-a-second saving 30 billion times a day, it adds up very quickly and becomes tangible in the minds of the leaders who have to make these decisions. All of data processing, which is what we used to call it and what we still should call it, really involves what I call a data sandwich. There are three components. There is a data literacy component: most people are not data literate, and certainly most managers are not data literate. I have other talks we can do on that, or a book if you're interested in really getting into the subject. There is an uneven data supply. And there's an uneven use of data standards within the organization. Only when we smooth these things out and put them in a place where they can work well together can we in fact make a well-oiled machine, because the leverage point is high-performance automation. That particular query ran 30 billion times a day, and shaving a quarter of a second off it is a really good thing, but we don't know how good until we add it all up. What I wanted to get to is a picture I took of a tea farm in India.
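As an editorial aside, the "add it all up" arithmetic from the query example can be sketched in a few lines of Python. The function name and the rounded figure of 30 billion runs per day are illustrative assumptions, not numbers taken from the organization's systems; the quarter-second saving is the one mentioned in the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_savings(runs_per_day: int, seconds_saved_per_run: float) -> dict:
    """Total time saved per day when a small per-run saving is repeated at scale."""
    total_seconds = runs_per_day * seconds_saved_per_run
    return {
        "seconds": total_seconds,
        "hours": total_seconds / 3600,
        # How many 24-hour days of processing the saving is worth, every day.
        "machine_days": total_seconds / SECONDS_PER_DAY,
    }


# Assumed round figures: 30 billion runs per day, 0.25 seconds saved per run.
savings = daily_savings(runs_per_day=30_000_000_000, seconds_saved_per_run=0.25)
print(f"{savings['seconds']:,.0f} seconds saved per day")
print(f"{savings['machine_days']:,.0f} machine-days of processing saved per day")
```

Under those assumptions the saving is 7.5 billion seconds per day, which is the kind of figure that makes the case tangible to the leaders who have to approve the work.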
I was there on one of my wonderful trips to go visit folks, and the farm had this sign, which you see in the bottom right-hand corner here, over the cash register: quality engineering and architecture work products do not happen accidentally. In fact, they cannot happen accidentally, because "accidentally" implies no knowledge of strategy. Of course this applies to data as well. So quality data engineering and architecture work products cannot happen accidentally. And the reason they have not happened is that data has been treated as a project, for the most part: "Oh, yes, that's when the data people get involved." Well, no, the data people should be leading these activities, because data is a durable asset, an asset that has a usable life of more than one year. There are reasonable deliverables for a development project that may come in two-week or 90-day increments, but data evolution is measured in years, and we just have to get used to it. This is why most organizations have shied away from it: it spans a much greater period than the tenure of the leadership of the organization. Data evolves; it is typically not created. One of the things about being 40 years in this field is that I can go back to organizations that I worked with 20 or 30 years ago and point out to them something they already know: they're still using the same data that they were when I worked with them 30 years ago. It is a much more stable component in all of our systems, and it's quite frankly a prerequisite to agile systems development. If you are working in an agile environment, let's be very clear: agile is the best way we have managed to come up with to create higher-quality software faster.
But it is not a data topic. You do not do data things from an agile perspective if you're looking at it from a stewardship perspective. Yes, you can do agile data warehousing; don't start yelling at me about that stuff, I recognize all of that. But the overall evolution of data is going to take years, and if you are in the middle of an agile sprint and you discover that a data requirement is incorrect, the only alternative, if you continue with the sprint, is that you're going to create more data silos. Now don't worry. That means all of us are guaranteed employment for the rest of our lives. Because data is not a project, if you know somebody that has a PMP, they will certainly be familiar with the material on this slide. I'm not going to review it for you, but there's a great difference between a program and a project. The message from the slide is that your data steward program must last at least as long as your HR program. Okay? "Oh, yeah, I think we're not going to need HR anymore; everybody's going to behave and we're not going to need lawyers and policies and procedures." No, HR is a permanent part of our organizations. You have to make management understand that your steward program is also going to be an ongoing part of your organization, because the most frequent question we get asked is, "When is it going to be done?" Never. The easy way to say it is that it'll be done when the HR program is done. Another thing to think about is that stewards should be the ones that help promote this data-centric thinking. Certainly all organizations start out with a plan, and guess what, a plan is data. Then they transform that plan into a series of activities that we want everybody down at the lower parts of the organization working on. So your organization is literally all about data, until it's not just data. Some organizations make things.
Right, but some organizations, and increasingly more and more organizations, are not actually making physical things; they are making data products. So what business is your organization in? Let me give you a very specific set of examples around this. There's so much focus on analytics these days, and I'm not at all putting it down. But first of all, nobody really knows what the term analytics means. It means data analysis; just in case you have a question, it is just a fancier word for data analysis. But that data analysis always depends on the existing data management practices, which are typically some sort of black-box operation for the organization. Everybody understands warehouses and marts and data products and all sorts of things, which gets us into the various marts and dashboards and other things that we have in order to produce these things now. I had a group that we were working with during the pandemic that said some of the pandemic numbers were going up and down on two different charts, and they said, "Well, that's not a problem for us; these dashboards are absolutely correct." And I said, "Yeah, but when the State Health Commissioner gets the message that COVID is both increasing and decreasing at the same time, the State Health Commissioner is not going to appreciate that." Most learning and feedback tends to be a closed loop within the analytics practices, in the right-hand diagram, whereas, as you can clearly see, if we go back and improve the data management practices, it will feed better data into the organization all the way around. Of course we know this as garbage in, garbage out. Another piece of guidance in terms of stewardship activities is a typical four-quadrant diagram, where we're going to talk a little bit about strategy, and there really are only two strategies that exist in the world.
We've proven this conclusively over the years: either you improve your existing operations or you innovate. If you're in quadrant one, as I've labeled it here, organizations typically don't have stewards, and we hopefully all agree that's not a good way to invest. For quadrant two, we'll pick on Walmart again. We understand them to be a largely very efficient and effective organization in many of their operations, and they do a great job of it; that should be a focus area. However, if you're an innovative company, and Apple claims that it's an innovative company, you use data to create strategic opportunities. Right. Well, let's just think for a minute. Remember Jony Ive, the erudite British guy who came on to introduce the iPhones during the first half of their life so far and tell us how wonderful they would be? I want you to imagine telling him to be cheap. It's just not going to work. Similarly, I want you to imagine the folks at Walmart, who are really good at squeezing all their vendors and suppliers to try and gain every penny (remember, their goal is everyday low prices, and that is an absolute focus), and asking them to be innovative. That's not their bag. So, pick one. Don't try and do both of these at the same time; that is doomed to failure. Instead, use the money that you get from increasing effectiveness and efficiency to fund your innovation activities, and you will find a much better success rate on all of these things. Only one in 10 organizations has started on the data stewardship journey. That makes it very problematic, but it also represents a strategic opportunity for all of your organizations to get better at this quickly, because the competition is probably not doing it. So we've spent the last hour looking at why we need a data stewardship role. I've given you some definitions of stewardship, data stewards and, importantly, data debt.
We've talked about the role of strategy in that, and that data architecture should be the strategic focus implemented by stewards. What are they supposed to do? Resolve prerequisite challenges, largely stemming from existing data debt, within some type of stewardship framework that shows that we can't do everything at once; we have to prioritize things. It's kind of like a fire station model. Sometimes they're out fighting fires. Sometimes they're trying to prevent fires. And the stewardship role in the context of data governance is very, very clear. We need to have data governance efforts focused on a few things, including business terminology, business goals and the metadata around them. Then we can align the stewards so that they can start simply. There's a different cadence to stewardship. Stewards are typically brought in late, and we want them brought in earlier. There are some foundational prerequisites that we have, and there's a need, especially at first, for simplicity, agility and practice. So, one last little bit as we head to our Q&A section. Unfortunately, everybody else is trying to figure this out too. We have data, data management, data governance, data stewards, and you know what most people hear? So don't try to over-explain this to the people that aren't interested. Simply call it a data program, because stewards own the results. They control the remediation process. There's a need for professional stewardship because the data is increasing and there's a lack of practice improvement. It's a relatively new discipline. It has to conform to organizational constraints. There is no one best way; there is a best way for your organization, and you who are listening to me today are the best qualified to determine that. Stewards have to be driven by strategy.
They have to have direct management application, speaking the language of metadata and focusing on specific process improvements. We've got some upcoming events as we get ready to go to our Q&A session. Also, as I said before, I hope to see many of you at Enterprise Data World at the end of this month in Orlando; we're certainly looking forward to it. You can certainly see Mark and me and others of the folks that are involved in the Dataversity webinars there. Mark, we're back over to you for a couple of quick questions as we dive in. Thank you so much, and there are a lot of great questions. So thank you, everybody, for the questions here. Our top question: if there is no data owner, who takes up the ultimate accountability for that data? Where does the accountability lie, thinking in the context of a RACI matrix? Absolutely great question. The answer is that we absolutely have accountability, but it goes into that fiduciary relationship I mentioned before. This is a term that most people are not really familiar with in a business context, and certainly not in an IT context, which is where most of the questions come from. But imagine if you could talk to your lawyer and then that lawyer could go blog on the internet about what you just had with them as a private conversation. You'd be pretty unhappy with that particular arrangement. So these are trusted professionals. They have specific training in these areas, and one of the things they are trained in, as you can see, is a duty to act in good faith. So if you have a steward that is untrained, they're violating their duty, not on purpose of course, but you can point that out to management and say, "Look, they have a responsibility to do this, and their responsibility is to see that data is used in support of the organization." The question specifically was around ownership. Again, the problem with ownership is that people think it gives them control.
It doesn't; it gives them fiduciary responsibility. This is probably the problem that I see happen most often: somebody will sit down and say, "All right, how many stewards can we afford? Five stewards? Good. You take this piece of the pie, you take this piece of the pie, you take this piece of the pie and you take this piece of the pie, and good luck." My suggestion would be, again, take the five of them, put them on one project in one area, and say, "We're just going to look at how this works for this area." We will absolutely define requirements for that data, and the stewards will own those requirements; that's part of their fiduciary responsibility for this particular area. But we're not going to try to figure it out for the entirety of Walmart or the Department of Defense or some other large organization, or even a small one. Five data stewards is not a lot for many organizations, and our need for stewardship, by the way, will also increase and decrease depending on how much data debt we run into. So I'm kind of dodging the RACI matrix part of the question. You're certainly going to have some people who are going to be responsible and some people who are going to be consulted, but you don't want to sit down and plot this out for everybody. What you say is, "We're going to do this in the context of a project." A data stewardship project or data improvement project is what many organizations call it. The Army has labeled it DIP, just so that you know, and everybody, at least in data in the Army, understands what a data improvement project is. For that context we're then going to specify the RACI matrix for the people who are involved, and you may want to take a look at this quadrant diagram that I showed here as well, to see how the responsibilities would come to bear on that.
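As an editorial aside, a RACI matrix scoped to a single data improvement project can be sketched as a small Python structure. Everything here is hypothetical and for illustration only: the activity names, the group names and the assignments are invented, and the check for exactly one Accountable party per activity is a common RACI convention rather than something stated in the talk.

```python
# R = Responsible, A = Accountable, C = Consulted, I = Informed
RACI_CODES = {"R": "Responsible", "A": "Accountable", "C": "Consulted", "I": "Informed"}

# Hypothetical matrix for ONE data improvement project, not the whole organization.
dip_raci = {
    "Define data requirements": {
        "stewards": "R", "leadership": "A", "participants": "C", "others": "I",
    },
    "Remediate data quality issues": {
        "stewards": "R", "leadership": "A", "participants": "C", "others": "I",
    },
    "Report results and ROI": {
        "stewards": "R", "leadership": "A", "participants": "I", "others": "I",
    },
}


def check_raci(matrix: dict) -> list:
    """Return a list of problems; an empty list means the matrix is consistent."""
    problems = []
    for activity, assignments in matrix.items():
        codes = list(assignments.values())
        unknown = sorted(set(codes) - set(RACI_CODES))
        if unknown:
            problems.append(f"{activity}: unknown code(s) {unknown}")
        # Assumed convention: exactly one Accountable party per activity.
        if codes.count("A") != 1:
            problems.append(f"{activity}: expected one Accountable, found {codes.count('A')}")
        if "R" not in codes:
            problems.append(f"{activity}: nobody is Responsible")
    return problems


print(check_raci(dip_raci) or "RACI matrix looks consistent")
```

Keeping the matrix at project scope mirrors the advice above: it stays small enough to agree on, and it can be redrawn for the next project instead of being plotted out for everybody at once.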
This is the diagram that I showed on this page, where you've got leadership, participants, others and of course the stewards. You can put a RACI matrix over the top of this for a particular project as a way of getting started and getting practiced within that area. I hope that answers your question; it certainly is a good one. I love your answer, Peter, because I've been on the side of trying to boil the ocean early on in my career, and you just get stuck in this documentation loop that leads nowhere, and you provide no value to anybody. So breaking it out makes a lot of sense, and doing things in manageable chunks makes a lot of sense. Absolutely. Another question, but different: we seem to struggle with the differences between data owner, data steward and data custodian. We also struggle with the difference between data custodians and IT folks like product owners and service owners. Do you have a simple way to describe those types of roles? Actually, the previous question supplies a tremendous framework here. If you do define those roles, owner, steward and custodian, then those are fine roles. But the RACI matrix is a great way of bringing clarity to that, and if you're not familiar with RACI, there's a great Wikipedia article that describes it. Go to anybody in the organization and say, "Look, if we're making decisions, some people are going to be responsible, some people are going to be accountable, some people are going to be consulted and some people are going to be informed." Once you understand what those roles are, it's absolutely important to stick with those roles and make sure that there are no HiPPOs, the highest-paid important person at the table, who can override the roles that we've gotten comfortable with. Again, great question. Thank you for it. Somebody asked my favorite question in the Q&A, so I'm going to skip a bit here. "Our company..." Man. I know, I know.
I just love the way this is worded; it brings back so many memories. "Our company has spent a significant amount of money putting data governance in place. However, they made it a facet of IT. As a result, the business isn't a champion of the work and is too displaced from it, and the data governance area is simply focused on data quality, and even then it only pertains to data warehouse initiatives. How do you shift the culture, correct the alignment and move the needle in a positive direction?" What a great question. It is somewhat typical of the way it has worked. I'll put on my professor cap again here and say, if we've only taught people about data... let me rephrase that. The only thing we have taught most people who have paid us to teach them about IT careers is how to build a new database. Should we be at all surprised that the main problem we have out there in the world today is too many databases? I think it's a perfectly logical outcome, and I think we have to ask some very deep questions. By the way, the academic world is not responding fast to what we're doing now. When the big data movement came on, what happened in the academic world is that most of them turned around, took their old decision science and statistics groups and said, "Hey, we could relabel that data science and make a lot more money." And it has produced some results in that area. So the answer to your question in particular is: absolutely celebrate what they have managed to accomplish. That is first and foremost. If you are seeing significant improvements in quality and in aspects of your data warehousing that are provided by IT, fantastic. Celebrate them. Show what those returns on investment are, because pretty soon somebody is going to ask you, "Okay, that's great. How can we do more?"
And when they ask you to do more, you can go back to IT and say, hey, do you guys want to continue to do this role? I have seen a number of successful initiatives that are performed by data stewards who are largely focused in, and largely working out of, the IT office. No problem at all with that. But if you want to get beyond that, you need other people from other parts of the organization looking in from the outside and saying, hey, we'd like to do that. And of course IT is going to say, well, we've got a lot going on right now, because I'm pretty sure they haven't fixed all of the data warehouse quality problems that are in existence at the moment. So that's where you can start to say, hey, look, there's demand. Does IT want to run this initiative? It absolutely can. But in the past, IT has not been the area where most organizations have achieved most of their success. I have a very small book, less than 100 pages, back on the last slide where I flash up the bookstore, that has really got, I think, the best case for why it is not traditionally successful in an IT organization. And let me just dive into a primary reason. IT for years and years has been critiqued from the outside for things taking too long, not delivering enough, costing too much, and presenting risk around those areas. And IT has in response developed high expertise in projects, and that's good. IT should be run as a largely project organization, but data does not conform to these project sizes. Again, I had a slide on it, that data is not a project. That's really where things are going to come out. IT is going to say, oh, okay, we can do this, give us a budget, let's make a project for it, and pretty soon you'll realize that outside of the data warehouse, data quality is much, much bigger than any specific individual project.
Again, you may need a consultant or two to come in and yell at management and give the message that you've clearly been trying to give internally, but it should be in this case phrased as: wow, great job, IT, I'm so glad you moved in that direction. Let's see what else can be done. And I'm pretty sure it will reach some limits pretty quickly, which will help you to branch out outside of IT and realize that the data steward organization should really be reporting up to a data leader, and that data leader should probably not be reporting up to a CIO that is largely project focused. Again, one third of data leadership right now does report to IT, and we don't have good data on whether they are achieving success or not. But it is still possible to do this, so don't despair, but at the same time try and say, hey, you're doing great, let's see if we can do even more together.
I love that answer so much, Peter, and I have a little bit of lived experience here as a data governance manager nestled within IT a couple of times in my history. I had that epiphany that it can so easily be viewed as an IT problem or an IT endeavor: oh, IT wants us to do this, IT wants us to follow these processes. But I took it upon myself to go and be a champion of data governance with the leaders outside of IT. So drinking a lot of coffee with a lot of directors and senior directors and AVPs and, at academic institutions, deans and vice deans, to really have that conversation about data, but mostly around listening as to what their struggles are with data, and enabling them and exciting them about being in these high-level data steward roles.
Exactly what you're highlighting, yeah, and I agree with you 100% on this. Again, I'll just switch back to one slide in particular that has the knowledge, skills, and abilities for a data steward.
Again, you emphasized it. We used to call these soft skills; we've made a new word for them. They're now called transferable skills, which is a much better word for it, but these are skills that will help us in multiple different areas. And so certainly understanding database design principles is going to be important, but it does little good if you can't communicate technical information effectively and efficiently. And you've probably all seen it: there's somebody in IT that wants to come around and explain things to you, and you just can't stand the conversations because they immediately dive into areas that are way beyond where most mortal folks dwell. And having lived it, Mark, it really is a badge of success, and that's something I would look for. If I were hiring data stewards, the lead on that should be somebody like Mark who has been there and learned the lessons. Be careful, Mark, somebody will hire you away from your current role. So they'll be getting mad at us at that point. Again, great question. Thank you.
So many fantastic questions. What is the role of data domains and subdomains in your stewardship model?
So, with domains, most people use two different definitions. One is that domains are the allowable values in a database; that's not what we're talking about. Most people talk about domains as being subject areas, and once again, you may not even correctly identify them in the first place. That goes back to, I think, a comment that you made, Mark, where somebody comes in and says, we're going to boil the ocean here, right? We're going to fix all of our data quality problems, we're going to fix all of our data problems. Instead, pick one domain, pick a small subset, and say, look, we're going to get much better results from fewer people working more full time on a concentrated effort than we are going to get otherwise. And again, I'm going to go back to the model that I showed; give me just a quick second to pull that slide up. Just imagine trying to explain this to, again, a governor of a state or some other interested and involved party. Where's my strategy stuff here? There we go. This is a very complex data governance environment, and what they had done in this case was they had said to all of the various agencies in here, you need to develop expertise and people that can come in and do all of this work. And the next governor that came in looked at this and said, huh. And it was gone. You know, just boom, vanished. So absolutely use the domains as a way of scoping, confining, and focusing efforts; definitely a good way to do it. Again, strategy is going to come down to one of four things: faster, better, cheaper, or less risky. And pick one, right? You can't do all of them simultaneously.
Awesome. Can an IT strategy and data strategy really deliver breakthrough results if the organization has not first developed robust data, application, and technology architectures?
That's a hypothetical question. I have seen it done, but it's rarer than we would like it to be. So, again, the question here is referring to my...
What am I trying to say? The role of Morgan Freeman here, right. This is wrong. Give me a quick second here. Oh, I've got a technical problem here. All right, hang on. I'm going to jump in there. Give me a quick second. There you go. You can all see my desktop. There we go. IT strategy. Yes. So, yeah, many organizations still believe in this. This is another reason to make sure that data comes out of IT, because IT strategy should be, and typically is, correctly focused around projects. But if we look at it from this perspective, you'll see a little bit better. These two should interact, and I've seen very many groups take a data strategy (another topic for another webinar, Mark, that we can do at some point) and the data strategy actually helps create savings in IT, not just the 20 to 40% spent munging the data that's in there, but also we can eliminate entire classes of operation when we understand the data better. Because when you look at it from a systems perspective, remember, IT systems take data in and put data back out. And it turns out we could follow these things through in a number of different flavors. I've got one organization right now that's looking at a reconciliation process, and they have 20 Access databases that are involved. Well, nothing wrong with 20 Access databases, aside from the fact that Microsoft has sunsetted the product, which does not bode well for using it in production. I'm fairly sure that at the end of this exercise we'll find out that some of those Access databases were redundant. They did the same thing twice. In fact, you may be processing the data multiple times, just because people feel it's safer to run it through this system to make sure it passes certain quality control checks. So again, great question. Thank you for allowing me to go back and reemphasize this particular piece. Data strategy should support the organizational strategy, not the IT strategy.
The IT strategy should also support the organizational strategy, and clearly some coordination needs to occur in order to optimize that process. But having the data strategy subordinate to IT strategy is not a winning solution.
We've got another favorite question of mine showing up in the Q&A panel. Our company has data that is stored across many different data stores and systems. Let's say we have 15 contributed data sets that we use in our products. How should we approach assigning data custodians if not every engineer has access to or knowledge of all the systems the data goes through? Should we have data custodians for each system?
I'm going to make an assumption that you're using the word custodian synonymously with steward. If not, I'm not sure what you mean, and you're welcome to drop it in the chat; Mark will pick it up and can correct me. But yes. So, if you're looking and saying, okay, we're going to have a data steward responsible for each of these systems, and you have 15 of them, how many combinations of things have to occur? It's N times N minus 1, divided by 2, right? So it's a large number of interactions that need to occur. On the other hand, the idea that one individual can be cognizant of the data that's in all those databases is probably equally unrealistic. So some approach in between should be taken, and I'll give you an example from a friend of mine who recently reported this out to my class, which was a wonderful set of results. This organization had tens of millions of parts, literally tens of millions of parts. So you can imagine there were lots of databases involved in it. And how can you be good at doing pricing correctly for tens of millions of parts? Let's just be realistic about it. It's not going to happen well. So they took a step back and did a holistic approach, and the data stewards were able to come back and recommend some general things.
One of them was a reduction in the number of parts because, of course, as you might imagine, there was a lot of duplication in those tens of millions of part numbers. But also, deeply important, they were able to show that they could increase pricing on this in a very nice way. Now, I'm not saying we should do this to raise prices. We're all worried about inflation and things like that, but increasing revenue is generally something that management does not complain about, so you can go through and look at optimizing these things in a vastly different fashion. So rather than saying 15 stewards or one steward, I think the answer is probably somewhere in between. Perhaps using some of that domain dividing we were talking about before, or maybe chopping it up into replacement parts versus new parts, or some other division around that, will probably be much more effective than trying to get 15 data stewards to talk to each other about the contents of their 15 different databases all the way around. Again, a great question. It's a really practical sort of thing because it is the reality that we're faced with in our existing environments. And while we didn't build them, we certainly have inherited them.
I just love thinking about multiple different silos for the same type of information, especially when you're talking about customers in multiple different silos. Your answer specifically, Peter, reminds me of a case study from Lego, actually, when they were having financial difficulties in the early 2000s. They had so many different product lines that didn't share very many brick types, and they were having issues with being able to efficiently produce different kits because there weren't a lot of shared bricks. So there were a lot of unique bricks, and so they weren't making a lot of money per kit sold, which is interesting. You've got entirely different product groups that could collaborate and have similar bricks, but that's just not true anymore.
Mark, if you have a reference to that case study, I'd love to learn more about it. This is one of the reasons...
I'll have to dig it up. Yeah, that was a real fun one. I'll have to see if I can find it.
Well, Lego is well known for their production capability. I have a whole series of slides where I talk about how effective Lego has to be in its production in order to do this. And you're saying that at a higher level than manufacturing, which is product planning, they were still learning lessons, and that's good. All of us can learn these lessons still.
If I can find it, it was right around the time that they tried to launch a video game, when the massively multiplayer online games were super popular in the early 2000s.
You mean they aren't? Last time I looked, the major source of revenue for a couple of those companies was MMOs.
Yeah, well, it's true now, but it definitely wasn't then. We have a couple more minutes. Here's a good question. My organization does not do a good job of proactively planning for data needs. I spend about 70% of my time managing what we call data integrity projects, designed to retrofit the data lifecycle to better meet business needs. Am I doing a disservice to our governance program by calling these projects?
No. Good, good question. So by the time you've broken it down into actions, that absolutely can become useful, and particularly if that's the language that your organization speaks. But the overall approach to data cannot be taken on as a project, because people understand that one of the defining characteristics of a project is that it has a beginning and an end. And if you're going to tell me that you have improved data quality to the point where you don't need to have any stewardship involvement or anything else in here, you know, I've got a bridge in Brooklyn that I want to sell you too. It's an ongoing process. It may require more or less investment during cyclical times.
I can tell you once again, many organizations shut down their data quality improvement projects around October of each year because they don't want to break anything going into the holiday selling season. So there are good reasons for that, but let's be real careful and keep in mind the overall goal, which is how are we going to take these data assets and help the organization achieve its strategy. And if we can keep focused on that, the rest of it follows.
Well, what would you call the role that that person is playing, managing those data integrity projects?
Well, first of all, assuming there's nobody else in that position, and thank you for letting me walk into this one, Mark, because you know it's one of my favorites. I urge data leaders such as the questioner to make a sign conforming with organizational policies that says, I am responsible for all the data in this organization, and put it on your desk and start signing all your emails that way. And sooner or later somebody's going to come around and say, hey Mark, who told you you could do that? In which case you take the sign off your desk and you hand it to the individual and say, okay, if not me, then whom? Right, and by the way, it gets into a here's-your-sign joke, so you can go out and Google that one in order to read the rest of it, but it's absolutely true. So you are probably doing some aspects of data leadership. And is it possible that your organization is now mature enough that you have multiple levels of data leadership in there? Fantastic. That's wonderful. We've seen assistant CDOs, we've seen people called...
We don't really like the term chief data officer, not so much because it doesn't describe what we're doing, but because there are so many data leaders and so many different types of leadership out there that it is a big challenge for everybody else to understand. I just like to say the top data job, and you may have a top data job, but you're probably going to grow out of having just one person attached to that top data job to having a data leadership team, and then we can start to really do things.
Yes, I love that question too, and a great answer. Again, thanks for the question, great one. And this will be the last question we answer, and I apologize for not getting to every question here because there were so many good ones. You can't appoint a data steward without established roles and responsibilities and a strong data governance framework in place. Is it correct to say that data governance is responsible for strategy and the data steward for operationalizing it?
That's certainly part of their jobs, and a large primary focus, if you will, the North Star toward which things are going. But oftentimes when we go into these organizations and work with them, and Mark, you've probably seen this as well, the strategy sucks. You know, I shouldn't use vernacular like that, but some of the strategy is, you know, oh, let's do this for the good of whatever right now. No, no, it's got to be specific. Again, I like the SMART framework. You can Google that real easily to see the components of it. But it is very important to understand that while it's tough to get things started, you already have data governance; it's just not performing well. I get this question, like, can you come here and help us develop a data architecture? The only time you develop a data architecture is when you're adding new capabilities to a system or creating a new system from scratch.
Otherwise you're evolving an existing data architecture, and somebody is in charge or clearly seen as the leader of the data group, or somebody has gotten a title of some sort that's in there. So keep those things in mind and work within those existing frameworks so that you can get to really good results in terms of helping out.
Right. Well, thank you, Peter, for this great presentation and question and answer period, but I'm afraid that's all we have time for today. Just to remind everyone, we will be posting the recorded webinar and slides to dataversity.net within a couple of business days. And we will send out a follow-up email to let you know the links and other requested information. Thank you again for attending today's webinar, and I hope everyone has a wonderful day. Thank you.
Don't forget we want to see everybody at EDW too.
Yes, yeah, and if you're coming to EDW, stop by and say hi. We'll both be meandering around.
Bye, everybody. Thanks so much, Mark, good to talk to you as always.
See you later.