Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager for DATAVERSITY. We'd like to thank you for joining today's DATAVERSITY webinar, Getting Started with Data Stewardship. It is the latest installment in a monthly series called Data Ed Online with Dr. Peter Aiken, brought to you in partnership with Data Blueprint. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DataEd. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom middle of your screen for that feature. And to continue the conversation and networking after the webinar, just go to community.dataversity.net. And to answer the most commonly asked questions, as always, we will send a follow-up email to all registrants within two business days, containing links to the slides. And yes, we are recording and will likewise send a link to the recording of the session, as well as any additional information requested throughout the webinar. Now let me introduce to you our speaker for today, Dr. Peter Aiken. Peter is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide. We were just talking about the Data Architecture Summit in Chicago coming up. He has more than 30 years of experience and has received many awards for his outstanding contributions to the profession. Peter is also the founding director of Data Blueprint. He has written dozens of articles and 11 books, and the most recent is Your Data Strategy.
Peter has experience with more than 500 data management practices in 20 countries and is consistently named as a top data management expert. Some of the most important and largest organizations in the world have sought out his and Data Blueprint's expertise. Peter has spent multi-year immersions with groups as diverse as the U.S. Department of Defense, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia, and Walmart. And with that, let me turn everything over to Peter to get today's webinar started.
Hello and welcome. And welcome to you, Shannon. Thank you again for hosting us here. And good to talk with everybody. So the topic today is stewardship. And while there's been a fair amount done with it, we haven't really codified the knowledge around that. So we're going to sort of approach that topic today. We'll talk about why stewardship is needed. And to do that, you need to understand some contextual definitions and particularly how the role of architecture works in organizations. We'll talk about the confusion that abounds around here because data has not quite found a home for itself yet and kind of needs to do that. Then a little bit around educational focus, not to beat up on anybody in education. We all try our best in education. I'm a tenured professor, so I don't want to bash us, but we are doing some things that could be improved in that area. Then a little bit on the role of strategy. Then we get to how one does data stewardship. And the real question is, you know, what is your role within the definitions of data governance in your organization? And that can vary among organizations, but it's important to define it. So I'd like to start out with sort of a fire station model and then talk about taking reactive or proactive approaches, and maybe looking more to shifting from a reactive to a proactive approach. And then the question is, when do you get involved? Well, there are different cadences, different rhythms, between data and IT.
And so we do have to figure out a different way of working within them, in that we need a different structural approach in there. We need simplicity, and there are some foundational prerequisites that we have to get to. That should be about an hour. And then we'll get to the top of the hour and do a little Q&A, which is the real fun part of that. So let's get started with the why. I found this out there, and it was kind of an interesting thing. I don't know how many of you are data stewards now, but it's kind of nice when somebody says you're the one person at work they can't possibly live without. I don't know that I, you know, necessarily subscribe to that, but it certainly is a nice thought around the process. What a steward should be doing is helping facilitate somebody's work. Now, we work in work groups, and we'll approach those work groups in just a minute or so. But let's look at, you know, first of all, a little bit of complexity around here. Here's a group that's formalized the role of stewards in their organization, and done a very good job of it in terms of defining the roles and what types of things they're responsible for, although they give examples. And you want to be careful with examples, because then somebody says, oh, I thought it was mine, or I thought it was mine, and of course it belongs to everybody. So we don't want to say the word mine or own or anything like that around all of these topics in here. So let's actually go to the basics and start off with, you know, definitions of stewards. So a steward is a person who looks after passengers and brings them meals. That sounds great. And I remember when the word was first coming out, we had colleagues in Europe that would say, yes, I expect you guys to show up with a martini tray and a napkin over your arm looking proper, right? It's an official. It's somebody employed to manage something for somebody else, the steward of an estate, for example. And so stewarding is managing or looking after.
And then a data steward is somebody managing data sets on behalf of others. The key is to represent a balance between stakeholders, as well as the enterprise perspective, and frankly to have dedicated time. Now, I get into a lot of arguments with organizations who like to say, well, really what we'd like you guys to do is have everybody do 10%. If you get 10 people to do 10%, well, I'd rather have one full-time person than 10 people part-time. It's not that they can't and haven't done very good jobs, but it'll be far easier to develop the organizational capability if I have somebody who's full-time dedicated to that, because they will build the trust, the belief in the organization that somebody actually understands this, and we will talk a little bit about the fiduciary aspects of it as well. Now, one of the interesting topics: a colleague of mine, David Plotkin, has written a really nice book out there. And he describes all of these definitions here, which are really good: a business data steward, a technical, a project, a domain, an operational, a metadata, a legacy data steward, all of which are appropriate roles. But what I'd urge you, for starters, is don't overcomplicate it. I've seen too many organizations go down a rabbit hole where they sit down and try to pre-determine how things are going to be going forward. And of course, if you have this many steward types in your organization, you need an auditor, because we can't have everybody just running around doing things without somebody putting some controls in place, and then we need a manager to manage them on top of this. And as I said before, it's a very good book. I highly recommend it. David did a great job here. The real key is this should be aspirational, something that we head towards as opposed to starting off here, because this is complicated for us, and we do this. And just think how it appears to others who may not necessarily have an interest in it. So let's take it a little further.
A steward is somebody who uses the organizational data assets in support of organizational missions. The key is to take those data assets and use them in support of strategy in a way that really works. So what do the stewards do in our organizations? They try to generally improve the organization's data assets and their value, evangelize for them, try to change hearts and minds around the process, and put a little bit of control into the process. Now, they don't have very many resources in some organizations, so you have to prove yourself. And this means, outside of regulatory or other mandated types of environments, that you're going to have to be kind of entrepreneurial about doing that particular process. Excuse me. One of Gwen Thomas's wonderful cartoons here. Might somebody need to be diplomatic to be a data steward? Right. Well, especially if you're the first one. It can be an invigorating process. Here's our DMBOK wheel. Hopefully you're not seeing it for the first time. Of course, it's version one; version two looks like this. The difference is we have added a wedge on the side here for data integration and interoperability. And really, data governance and integration are areas that data stewards can play very strongly in. Now let's do a high-level definition of data governance. Once again, there are lots of definitions, but my favorite is this one: managing data with guidance. The nice way this definition works is that you can ask the opposite question. Would you want your sole non-depletable, non-degrading, durable asset managed without guidance? And most people say, nah, I don't think so. I'd like it managed with guidance. So let's look at the governance and architecture aspects of this. First of all, we have corporate governance. And again, this is sort of general motherhood and apple pie, or at least it was until August.
And all of a sudden a number of prominent CEOs came out and said, you know, it's not just all about maximizing shareholder wealth. We have yet to see action on this, but the fact that it's even being discussed is a really wonderful thing and has a bearing on corporate governance, which can have a bearing directly, of course, on the types of data that we're required to maintain. You probably also heard the controversy recently about the SAT dropping its plans for an adversity score on the SATs. Again, very interesting data problems there. Well, these fall under corporate governance. And under corporate governance we then also, of course, have IT governance. And what is IT focused on? Well, they'd like to be aligned with the business strategy and provide measurable results and key questions. There are five areas that they recommend: strategic alignment, value delivery, resource management, risk management, and performance measures. So great stuff. We're very pleased with all of this. Let's shift our topic a little bit and talk about architecture. Most people haven't done a lot with architecture, but it's really an excellent craft for data people to be able to work with. And most of you are actually quite good at it already; you just may not have done it. We're talking about abstracting things to the level of things, the function of those things, and how those things interact. I'll turn the volume down there. That's a little bit too loud. And then, why is that important? Again: things, the function of those things, and how those things interact. Why is that important? Because all organizations have architectures. The question is, has it been documented? If it hasn't been documented, it's hard for it to be widely understood and therefore useful. So once we can document and understand this, then we can start to become useful in employing our data assets in support of organizational strategy.
And of course, we're really talking about data all the way around, so I'll just insert the word data in all of that, and the same thing is still true. Let's take a quick look at some organizational architectures just to give you some practice with it. These are sort of funny, but sort of true. Amazon, traditional structure. Google, [unclear]. Things have changed, and this is old, obviously. Facebook, do we really have a structure? Microsoft, eliminating their own products. Apple, everything revolves around one individual. And Oracle buys another company and subsumes it into the beast. So these are ways in which you use architectures, and all of them have sort of interesting pieces, but organizations manage architectures in general: processes, systems, business, security, technical, and of course, data and information. Now, if they consider you to be just doing technical committees and they don't understand what the technical committees do, this is a problem, leading, of course, to a lack of understanding, because what we're trying to create here with our efforts as stewards is better clarity around data. Data measurements, data utility, data sources, all things data. And so stewards are going to become experts in these areas. Now, let me take a quick definition of information architecture and you'll see the role stewards play. We have some very good definitions that are out there. My favorite, of course, is mine, which is common vocabulary. Imagine you're trying to explain to an executive on the way up in an elevator what an enterprise architecture is useful for in that individual's organization. If you describe models and diagrams and things, they say, yes, okay. If you say, I provide a common vocabulary, they go, yeah, I have a lot of problems with people not speaking the same language in my area, and it puts sand in the gears and slows things down. Of course, there's always been confusion, because IT has always thought that data is a business problem. If they can connect to the server,
my job is done: that's a quote we've all heard from IT many times. On the other hand, business clearly thinks IT is managing the data. Where else would it be taken care of? So data's fallen into this enormous chasm, and we need to repair that partnership between IT and the business. And that's really also the role of data stewards. My goodness, you didn't know what you were getting into here. A quick Dilbert just to divert us. The committee decided that the file naming convention will start with the date in order of month, year and day. Of course, those of you that are following carefully know that we've already screwed up by not putting in a standard way of describing date and time, which means it will be difficult to access in the future. Next panel: and then a space, and the temperature at the airport, and the hat size of the nearest squirrel. To be perfectly honest, it was a long meeting and we probably didn't do our best work toward the end of the meeting. Knowledge workers, the people who make up our work groups, the people with whom we all work, are not taught about data. Again, my definition of a knowledge worker is somebody who works with data. 100% of them should be. Worse still, we have for decades taught people one thing about data as part of their IT curriculum, which is how to build a new database. If there is a skill we do not need any more of on planet Earth, it is how to build a new database. We could use some skills around how to work with existing databases, so that would be great. So the question is what impression that leaves, not just on the people who learned this, but also on the people who went on to become managers of other people. They get the idea that we don't need data people on these projects because we're not building a new database. I'm doing ERP, that's not a new database, is it? Or migrating things, that's not a new database.
So I don't need these data people in here, and literally they have fallen in terms of what they used to do for the organization and what they're currently asked to do. The result of all this is a bunch of really smart people who just are not knowledgeable about data, and they make bad data decisions, which result in poor treatment of data assets and poor quality data, as well as poor organizational outcomes. So how are we gonna fix this? And most importantly, it's not so much how are we gonna fix this, but what approach are we gonna take to this? So let's talk about the role of strategy, because everybody wants to do strategy now. You may be saying, why are data stewards involved in strategy? Well, you're a key player in this role. Data strategy provides you focus. Over time, we've used the word strategy more and more, but that's mainly because the business groups have picked up on it and started to use it. There are lots of long and complicated definitions of strategy, and parts of a document and templates that you can download off the internet and things. I prefer things simple. So Henry Mintzberg has a definition I like, which is a pattern in a stream of decisions. Why is that important? Again, remember a few minutes back, I was describing to you all the way in which you need to present information to people in ways they can understand. We also need to do this for the rest of our data community, which means we need to make things simple for them. Let's give three examples of this. The third example is a complex strategy. So the first example is Walmart's former business strategy. You guys can see where this is going. Every day low price. Well, I'm not telling you anything you didn't know, and you know why? Because Walmart wants you to understand that. Every person who did business with them, who worked for them, or who had a family member that was associated with them understood it, and more importantly, this became the default mode of operating.
It became part of the DNA, the corporate culture of the organization. If you make a decision and you make it in favor of the customer receiving a low price, you made the right decision. That's comforting for people who work with them. Walmart uses culture very, very well to its advantage in the marketplace here. Second definition of strategy, excuse me, second example of strategy. Wayne Gretzky, great Canadian hockey player. He skates to where he thinks the puck will be. You're chasing a puck around, which is a hard piece of plastic that weighs a little bit, and you're hitting it with large sticks. It's gonna move very fast. And if you chase it, you'll always play catch-up. So he doesn't play catch-up. He says, I'm going to change the game by going to where I think it will be. And if I am where I think it will be, and it comes to that place, I will be in position to become the greatest scorer in history at the time. Third example of strategy, Napoleon at Waterloo. He's facing two armies. The army in red is the British, the army in black is the Prussians, and they're bigger in combination than Napoleon's French corps. So the question is, how do you defeat the competition when their forces are bigger than yours? The answer is to divide and conquer. Let's look specifically at this. First thing: this strategy is still taught in military history books today because it is a brilliant strategy. It didn't work, but that's a different issue. Lines of supply were one of the things that you look at when you look at an army, because if you push the army back, they're more likely to run towards their food than away from it. I don't mean that just statistically; there's obviously a biological function that works there as well. Same thing here. So the Prussians were provisioned out of Liège and the British were provisioned out of Ostend. You can see they are miles apart in opposite directions. So Napoleon's theory was divide, and let's do that. I just keep, oh, I'm sorry, I went too fast.
Let's divide, first of all. I'll just go back to that slide. And the key with divide is that you take your forces and mass them very strongly at one particular point and you hit really hard, okay? Then, point two, you conquer, and the conquer has to be sequential. First we're gonna turn to the right and defeat the Prussians, and then we're gonna turn to the left. So let's go over the strategy. I said before it's complex. Hit both armies really hard at just the right spot, because if I'm too far to the left or too far to the right, my strategy of dividing them will not work. Second component of the strategy: turn to the right and defeat the Prussians. If half of us turn to the right and half of us turn to the left, we will not survive. And then when we've defeated the Prussians, turn to the left and defeat the British. Oh, and by the way, do this while somebody is shooting at you. That's a hard thing to get a lot of people to do, even if they're trained to do this, much less your organization that knows nothing about data. So keep things at this level. A pattern in a stream of decisions is a great way to think about this. There's another whole talk we do later on this year on strategy, but just sort of a word to wrap it up. On strategy itself, if you've got a strategy and somebody says, I've got to go find it and read it, it's probably not gonna get used. So the famous Eisenhower quote on this is that, well, he thought that the plans were nothing. The planning process was, in fact, everything. And that's a very good sort of guidance to take with us, because data strategy looks at the problem of saying I've got all this organizational data and I've got this limited number of people, full-time, part-time, whatever they happen to be. That's going to be a component of my solution. We're also going to use some technologies, absolutely appropriate in this case. I'm showing the fulcrum and the lever here.
Those two pieces are both necessary. While you could move the organizational data without the purple thing on the left, the fulcrum, it would be more efficient to do it this way than not, but it can be done. These are engineering calculations. We get the right people and understanding, we do a process, and that'll work out as well. We've also got to work on an aspect of this which is called data ROT. And data ROT is something that's problematic, but if you've got data that's redundant, obsolete, or trivial, reducing the amount of it will make it easier for you to focus on the remainder and do good work on that particular piece. So let's talk a little bit. The terminology is not widely known. This is something we're going to be educating ourselves and our colleagues about, because we haven't been doing it for 6,000 years the way some of the accounting practices have actually been worked on. We don't have agreed-upon definitions. And there's no point in arguing about them, because time will tell, more so than prognosticating about what will likely happen, what actually does. It's clear that data governance is personal, and it's also clear therefore that data stewardship is going to be similarly personal. It has, however, become a de facto standard to have people in charge of the data, and that seemed to be an argument that was pretty easy to make. Stewards work very effectively with architectural components. They help us do a number of different things, and we'll talk about this in just a second. The strategy focuses the steward leveraging activities. While there are lots of things we could do, which are the things that, if we do them correctly and now, will actually result in what we need to have? So that's sort of our first section on why. Let's now talk about how stewards actually work.
First of all, most of you understand that there's a version of this somewhere in your office which says, implicitly or explicitly, that our process for returning data and information is overly dependent on human beings. Wetware, that's the stuff between our ears. Knowledge workers, informal communications: it's described as the weakest link in some organizations. And the simple fact of it is that organizations don't know what data they have. They don't know where it's at, and they don't know what the knowledge workers do with it, which presents additional sources of risk. There are also efficiency components to this. Work groups are where work gets done. And data stewardship currently happens at the work group level pretty well. There is some issue, though: if everybody's learning on their own, you're pretty sure they're not using a standard, and even if one standard is better than another, there's no opportunity to gain efficiencies by doing things the same way. But nevertheless, we can move towards that, and that's certainly a thing to think about in there, because right now people are doing informal practices. Only one in 10 people that are using Excel know that there's a capability for doing what you're doing in Excel perfectly every time, automating that process, called macros. One in 10 people. That just gives you an idea of the opportunity we have for improving efficiencies in that area. But the real value of all this comes from making cross-work-group connections work more smoothly, because when we're starting to go back and forth across things, that's where you really realize that the data chaff becomes sand, and it prevents smooth interoperation and exchange, or it slows them down. Most organizations experience kind of a death by a thousand cuts, but it's been difficult for them to account for them. They just know it's not working the way it should. And that's because organizations and individuals lack the knowledge and the skills in those areas.
I've already mentioned this for starters. First thing, the general thing that stewards are doing is helping people to understand that better organized data increases in value. If you have trouble illustrating that point to people, take a book and remove the spine, so it's just a bunch of pages, and then mix the pages up and hand them to somebody and ask them to enjoy their read. They will not, even though what you're giving them actually has page numbers all the way across it, which is kind of cool, but not as good as it could be if you hadn't taken the spine off and messed up the paper. Yes, better organized data increases in value. If nothing else, because your knowledge workers will be able to find what they're trying to find faster. And if they're finding it faster, that means your organization can spend time doing the things it's good at rather than on routine description time. Tom Redman calls them hidden data factories. It's a good term. Poor data management practices are costing organizations a lot of time and money. And the only argument I get with this statistic is that it's not 80% for our organization. It's 90%. So what is ROT? I've already mentioned it once: redundant, obsolete, or trivial data. And if the data is redundant, obsolete, or trivial, why would you want to maintain it at all? My wife corrects me from time to time. She says it's actually RIOT, because it's redundant, incomplete, trivial, obsolete, and she's correct. But I already put ROT out there, so we'll have to worry about that. The question is, just like in advertising dollars, which 80% can I safely eliminate? By the way, I'm not advocating going out and eliminating 80% of your data without proper stewardship, but it is a good high goal to take a look at.
So yes, redundant, obsolete, trivial data gets in the way, and removing it makes it easier to focus our efforts as stewards on improving the remainder by making it more understandable and also making the descriptions of how it's used in the organization understandable. There are lots of other things that go into it as well, but greater quality data gives us more engineering leverage as we're trying to do whatever it is our organization does. Whether you're a nonprofit, whether you're in the government, or whether you're in the private sector, there's still a mission, and that is important to take into consideration, because integration, the process of our disparate working groups working together, is not possible without these information architecture components. And maintenance of these components also then promotes greater reuse, which means that data sharing becomes exemplified by the ability to use information as a strategic asset. There are lots of examples of that that are really good. We won't go into a bunch of them here, but you can certainly get lots of webcasts that do this and describe it in that sense. Data is the most powerful, underutilized, poorly managed organizational asset. It's the only resource that you have that is not depletable, not degrading, durable over time at the strategic level. Data assets really do win when you compare them to other assets. And yet, people talk about them incorrectly. Data is not the new oil. Don't let people think about it that way. I don't mean correct people in public, but you don't think about oil as a reusable commodity. You instead think about it as something that you use once and don't use again. And that's not the way to think about data. It's not a production function. It's a cultivation function that we have to think of it as. A good way to think of it is just to change one letter in that previous statement and make it the new soil.
The idea is that you plant things in it, and you don't just throw seeds on the ground but you plant them in a prepared plot. And then you don't plant things on Monday and expect to eat them on Friday. It doesn't work; it takes time. On the other hand, data is usually sold as bacon, which is really fast and silly and great, but data does deserve its own strategy. It deserves attention that's on par with similar organizational assets, and it deserves professional administration to make up for past neglect. Now let's look at data strategy in context here. Organizational strategy is developed. Our purpose in data is to support organizational strategy. Governance is about that process, and strategy is an essential component of data governance. It says what the data assets need to do right now, not forever, but for a defined period: let's say, for the next 30 days our focus is going to be this. This is the type of thing you need to do to address the challenges that you have in data. Again, as you've already seen, it starts off generally with lots and lots of information out there that you have to winnow down to support and get more of, but it's still what the data assets do to support the strategy, and then the feedback from the governance group is, how well is data supporting the strategy? Seems reasonable. In Peter's world, you also have dominion over IT projects and how that supports strategy, but that's a different argument. Let's not get into it right here, but we can fill out the picture with a couple more feedback loops. I would never show most people that big diagram; I'd keep it kind of simple here in particular. When most people think about data strategy, they really need to be thinking about specific business goals.
So these are ways that we as the data stewards of the organization can harness the resources around us, including ourselves, to support the attainment of business objectives by better using the data that's in there. And our language in governance will be metadata, so the stewards will talk in metadata, and that metadata will continue in the conversation as stewards talk back and forth while they're working through things. Now let me give you a little bit of guidance around the process of starting a stewardship organization. First thing to do is to understand the role of frameworks, which are just a system of guiding ideas, a way that you can organize the project: priorities, decision making, a way of assessing progress. These are all good things. Simple things that are kind of obvious in this example: don't put up the walls until the foundation inspection is passed, right? Or put on a roof as quickly as possible, because winter is approaching and folks that are working on building a structure, whatever it is, will be cold. Make it all dependent on continued funding, so there's an obvious feedback loop and people pay attention to what's going on here. I say framework because here's a steward framework. Now, this is not a data steward framework; I've got to say it's a stewardship framework, not a data stewardship framework. And they talk about interesting qualities: personal mastery, a vision, mentoring, promoting and valuing diversity, coming up with a shared vision for the organization, risk taking and experimentation, entrepreneurialism, vulnerability and maturity (again, that thin-skin thing is probably not a good attribute), raising awareness, delivering results.
Not bad for a stewardship description. We won't see too many of them in the data area, but I'll give you another version of the same kind of thing. The stewards need to be thinking about specific data challenges where solving them will help us use organizational data assets to better support the mission and strategy of the organization. And as we understand them, we will get them into some sort of strategic consideration, because we can't do all of them. Some of them we will address some other time. Sorry, we've just got to put them on the bucket list and get to them later. We can't do everything at once, we can't spread ourselves too thin; we've got to be concentrated and do a couple of things well. So the stewardship engine, as I call it, has a component of regulation and policy, but there are also stewardship activities as a subset of that, which I've already said can be reactive and proactive, which means that we're going to try to provide value. The value can be monetary in some cases; it can be non-monetary in other cases. But we've got to be able to show it, because otherwise a new group will come in and say, oh, I don't think we're gonna do that anymore, and the initiative will lose all of its momentum. And of course over time, we'd like that value to be seen as greater and greater, which means part of that stewardship piece needs to be building good, articulate business cases so that people know what good work you're doing and understand how it fits in with the rest of the organization. Let me give you a little bit more context and talk about your data community in general. Yours may or may not look exactly like this one, but I think the general components are recognizable to most groups. First of all, there is IT, and they provide the bedrock, the foundation. I'll put it in a standard four by four matrix. On the left-hand side of the red line, the vertical one, the domain expertise will be less.
They won't know as much; they'll need to learn more. On the right-hand side, the domain expertise is greater. The rules are more formally defined on the left and less formally defined on the right. Both of those things are true. Then on the horizontal axis, below the horizontal line, they're gonna encounter data that is governed more directly, whereas on top, their encounters with it are gonna be less direct. Hit the wrong button and started it over. Sorry guys, give me a second, I'll catch right up. And our last set of encoding here is that below the line, more time is dedicated to this process, and above the line, less time is dedicated to it. So what have we got? Well, we've got leadership components. These are people that make decisions about data, and this is really their role. The problem is that most of the time when they make these decisions, they don't realize that they are decisions about data. So part of our job is to educate them about that. Then there are the stewards, the group we're directly addressing today, these trustees of the data. And again, let's not try to divide up all of our data amongst the three stewards that we have, or the 30 part-time people that we have. Let's instead focus on a couple of areas, get good with those areas, and show that investing a full-time person in these three areas is well worth it from the organizational perspective. That'll give the incentive for more. Because if we try to do everything and spread ourselves too thin, we will accomplish, of course, nothing. Working with us a lot are the participants, the experts, the data subject matter experts that are part of the fabric of how data is used: the gentleman with the SQL Server under his desk who doesn't want IT to find out that this is how he's doing the logistics work, or somebody with an unauthorized Amazon account that is doing things. And then there are others.
They're all data makers or consumers in one form or another. So that's our universe. And let's now talk about, in this case, what some can define as a leadership group. This may not be again for you, but some groups have said leaders and stewards will be part of the enterprise data group. Others include stewards and participants, that matter. But the role of this group, whatever you define it as, is to provide resources because we need to have some sort of programmatic sustainment. This cannot be done project by project. It's got to be programmatic. These data makers and consumers will provide data and feedback. Some of it will be directly to the leaders. Some of it will come through other folks. The leaders will make decisions and the stewards will figure out what action needs to be taken by everybody else. What changes need to be made such that we can better use our data and support a strategy for this particular example. Again, more feedback comes through those ideas. But the stewards should get some ideas and provide guidance to the leadership in there. A little bit of a brief tour over it, but hopefully that helps you understand. And really, probably the best thing to do is take a perspective. Maybe you're a data leader listening to this and trying to figure out how to stand up a steward group. Hopefully this gives you an idea of the way to do it. But let's talk about another idea that also is working in the data steward community. Yes, that's right. It's your local fire station. Now, everybody knows the typical role, which is the Dalmatian on the hood of the fire truck and rushing to save people's lives and put out fires and absolutely fantastic stuff. And then there's the downtime as well, but they don't always sit at restaurants or buy food or play pool or whatever. What they also do is a lot of education and it's a great model to watch. How the firefighters spend their time is a good way to think about it. 
I'm not suggesting that you go on 48-hour nonstop shifts. That's not what I'm suggesting. What I mean is the mode that they use to operate, which is part reactive and part proactive. You already have some pictures in your mind. Yes, they are the ones telling me to replace the batteries in the fire alarms, the ones telling me to move things around so that I'm not storing paint by the gas furnace in the garage. These are all good educational activities. So sometimes there are things to do and we need to do them. And sometimes there are fewer things to do, and we need to be working instead on a general education piece and balance between the two. So our goal is to start to transform the tribal-knowledge-based processes into asset-leveraging components, and to understand that stewards transform governance into strategy-focused action. We're trying to do this. We can't be everything. Apply a framework to your tasks and get good at both reactive and proactive activities, so that you can start to influence your own direct leadership but also leadership around the organization, such that they understand that you are a resource. Because remember where we started: your best friend at work is what the stewards should be. Know that you can't do everything. It's just impossible. There's been too much poor treatment of data over the years and too much redundancy to make it possible for anybody. And it's scary when you talk to people and say the time to correct these may take years and in some cases decades, but it's still worth the effort. So the key is to get the focus correct. The wrong question is: how should we manage this data, all of it, and get it all right? The right question is: should we include this data item within the scope of our steward practices for the next round that we're trying to do? Because if you just do stuff and don't have a repeatable process, you can't measure things and improve things.
Some of you will see where I'm going when you get to that. Let's keep going. Another wonderful way to think of it is the MacGyver model. If you haven't seen MacGyver, I think it's on YouTube now; he could fix anything with a piece of Scotch tape and a paper clip, and there's gonna be a bunch of that. And the best thing that you can do as stewards is become part of the steward community. There's a fantastic organization called the Data Governance Professionals Organization, and the DATAVERSITY Community is there too. So that's some of the how. Let's dive a little further into what I call the when. And it's not what most people think about. So for example, in organizations without stewards, again, you've got two strategies. Your stewards can improve organizational operations or they can innovate, and there's no problem with either. But let's make sure that we don't try to do both at once, because that's not a good thing. Stewards do need to be actively involved in the focus of what they're trying to accomplish and improve. And let's be absolutely frank and say that Walmart is expert at increasing organizational operational efficiencies. Absolutely fantastic. And Apple is expert at using creative opportunities, okay? Apple's the innovation. We'll pretend that Apple will allegedly innovate and we'll pretend Walmart is really good at efficiency, and both of them are. The key is, don't try to do both. Sorry, that popped up there: don't try to do both. But I want you to imagine taking the people who are in the V3 area. Oh boy, I really messed that one up. Get back. Imagine taking the people who are in that V3 area and telling them to be cheap, right? Those are the innovators. Telling them to be cheap is not a good idea. And telling the people that are in the operational efficiency area to be innovative is also not a good idea. They are not the same skill sets. So you need to be sure that you sequence the type of thing you're working on.
Don't let people try to get involved in doing too much too fast. Focus in on the increased efficiency and effectiveness, and then use the savings from that to fund some of the innovation opportunities. I apologize for making that a really bad thing, but you can edit out my idiocy there and catch the rest of it. Let's look at another example. When you look at how data is done in most organizations, there's this data management piece that most people really don't understand. It's not their fault; we haven't educated them about it. "Something happens" is about all they get, and what happens is that data usually gets replicated or duplicated, then warehoused or put into the cloud, and then you may put a bunch of marts together so that you can focus on specific subject areas. If this looks Inmon-like or Data Vault-like, it absolutely should. But the problem is that the learning, the feedback loop that comes back from looking at the data, tends to go back only to the ETL layer. And that's unfortunate, because it should go all the way back to the data area. So again, as stewards, you need to have feet in both of these camps: the BI world of data usage and analytics, and the data management practices end of it as well. Because what you learn from how analytics is done, which is kind of the wild, wild west in many organizations, can give you good feedback that you can use to provide better data feeds for them and reduce the amount of time and effort they spend munging the data, because you've got better leverage around all of that. Because, as I said before, data is not a project, it's a durable asset. That's got a very specific definition in accounting terms: it's an asset with a useful life of more than one year. Project deadlines, in a project sense, may reasonably be 90-day increments or two-week sprints or whatever you're doing, but data evolution is measured in different terms.
It just isn't possible to change that much back to good from the current state. Data evolves. It is generally not created. It is significantly more stable across the years for organizations than are the process controls that they use. There are a bunch of different ways of approaching this, but the best way to think about it is to interface with agile development by providing a programmatic level of support that says: only when the data requirements are met, and the data elements are fully understood, vetted, and validated, does it make sense to use those data elements in any agile sprint. And if you're in an agile sprint and discover that the data requirements are poor or not understood or misinterpreted, it's generally time to move on to another piece of coding. There's plenty more to do, but don't put any more effort into that one, because it will just result in a new small pile of data, which frankly will keep us all employed, but isn't the best for our organization. The alternative of creating data silos is very problematic. Let me quickly run through a set of kind of new concepts from a stewardship perspective. First of all, the thoughts I'm gonna give you are wrong. According to George Box, all models are wrong, but some are useful, and I fully subscribe to his model, so hopefully these will be useful. We developed a website called The Data Doctrine, and if it looks suspiciously like the Agile Manifesto that we were just talking about, it's because the words up there on the screen are exactly that. We are uncovering better ways of developing IT systems by doing it and helping others do it. Through this work, we've come to value data programs preceding software development, data structures that precede stable code, shared data preceding completed software, and data reuse preceding repeatable code.
And it's not that there's no value to the things on the right, but we value the things on the left more, which is exactly what the Agile Manifesto said, so hats off to them for doing a great job. I hope this is useful here, because what really is happening is that there's a mismatch. I alluded to it earlier by saying there was a cadence mismatch; impedance is another word that we use for this. IT is very good at building new stuff, but as we've already said, data evolves over time. It's a much different rhythm, and it needs to be separated from, it needs to be made external to, and it needs to precede systems development lifecycle activities. If you do not do that, the only result is more small piles of disintegrated data. Again, guaranteed employment for all of us, but we'd like to do better. So data management and systems development must be separated and sequenced. Otherwise it just does not work. The Data Doctrine is out there; if you're interested, we'd love to get into a dialogue with you on that. I'm gonna close this short section with an agile joke. "Wait, you're gonna perform surgery without putting me under?" says the patient, not realizing he had signed up for agile surgery. And the doctor says, "Yes, this is agile surgery. We need to ask you about your symptoms and complaints after we open you up. We'll also need to know what you want us to work on in the first round." As I mentioned before, most people have the strategy part of this incorrect. There's an organizational strategy, a data strategy, and an IT strategy, but they do not work in the way they are typically presented. This is simply wrong. The right way to do it is that these need to be co-developed, with coordination between them, and quite frankly, because data has these slower-moving aspects, data actually needs to govern some aspects of what IT does in terms of timing and sequences. Again, I've mentioned several times the process of going through this.
It is taken directly from Eliyahu Goldratt's book, The Goal, and adapted for us here. Many of you remember Alex Rogo and his adventures at UniCo. If you don't, it's a wonderful book, and if you don't feel like reading, it is a great book on tape. It views management systems as being limited from goal achievement by a constraint. And that's what you do. You define a business problem, draw a circle around it, and say we're gonna fix this. We're gonna practice our process of getting good at it as well as our capabilities. There's always some constraint, and the focus is simple: if there is a weak link in the chain, we need to find the weakest link and fix it, because this architecture is all additive in nature. The idea is that we, as stewards, are making a better data governance sandwich. What is a data governance sandwich? Well, it starts out with varying levels and capabilities of data literacy, and there are different levels of data supply, because there's no one central directory or data steward to help you go and find the stuff that you need. Instead, you're supposed to go and find Fred, or whoever it is. And the more streamlined and the more automated these things are, the better. Remember, these are occurring hundreds, thousands, millions of times, daily, hourly in some instances, so we have to engineer them and make them fine. Remember, we're trying to eliminate hidden data factories. This makes all of our operations smoother, and we need them all to work together, because if we can't get them all to work together, then we have no ability. And this working together cannot happen without a focus on engineering and architecture. It has to occur. I was on a tea farm in India a couple of years ago on a vacation, and I found this wonderful quote that said quality engineering and architecture products do not happen accidentally. Of course, we're talking data stewardship here, so we're talking data engineering and architecture.
And if these topics are not familiar to you, then we need to start getting smarter about the process and looking for opportunities to educate ourselves and find out which aspects matter. I'm not talking about mathematics, or saying that you're going to be computing anything, but about understanding weak-link-in-the-chain types of things, because overall in organizations there is a very definite lack of these things. There's a fundamental mismatch between a data program and an IT project. If you have trouble with this subject, find somebody in your organization who is PMP certified. It is an objective certification, and they will be able to point you chapter and verse to the kinds of things you need in order to understand these two terms. Data programs need to exist as long as your HR program exists. Projects have a beginning and an end, and data doesn't work that way. Fundamental mismatch. There are objective assessments that can be used to measure and advance progress, which is the repeatable process that I was describing. You won't know it at first, but after you've done it three or four times, you'll go, oh, I've got this. As scale increases, so does the dependency on these architecture and engineering concepts, and harmonizing organizational, IT, and data strategies is absolutely key to being successful. Sequencing some aspects of this, of course, can be very, very helpful to the organization. See, there's a big issue. Most people sort of look at the world of IT, business, and data with data being the bat sign in the middle of everything. It's not a good way to think about it. And they kind of go, we'd like to get good at this and do more, right? This is sort of a vision statement, but they just don't realize that the way the world is is much more like this. And so your roles, starting today, are needed in an incredible, incredible, incredible way. So this is the when component of all of this. Again, what we've covered is the why: why somebody needs to have data stewards.
There's a lack of good treatment of data, and there's a lack of expertise about this. Obviously, good places to come and learn about this expertise are the various DATAVERSITY offerings where we'll be appearing. You'll see the list of those coming up, but we'll all be gathering in Chicago next month for the Data Architecture Summit. Then the how: how we as stewards interact with governance. Really, the fire station model is a good one. And if you don't like it, say what you don't like about it, and that gives you something to describe what you do like. Oh, it's a fire station with cute puppy dogs, right? I don't know, I'm making it up. And finally, the last component of this was the when. There's a different cadence about this, which means there's a very strong need for a different structural approach. It can't go on as a project-by-project approach. And that is what IT is good at, which is why a data steward should not report into IT. I've said all the way through that you need simplicity in this, because simplicity allows more people to understand more quickly. If we put barriers up, they will not be able to understand and can't be helpful to us. These are foundational prerequisites to getting everything else up and running. So I'm gonna spend about 10 minutes here talking through some takeaways. These are quite in depth, which is why I spend a little bit of time on this. First of all, the need for data stewardship is increasing. However, we believe very strongly that this type of education can be provided by high schools, or perhaps even by two-year colleges, at the community college level. There's all kinds of support. Guess what: veterans make fantastic data stewards, because they understand the concept of stewardship inherently. And of course, the data volume is not growing any less. So what we don't have is, as everybody has said, a set of good practices around this that we can point to as a book.
Maybe David Plotkin will write another book, or one of the rest of you will. Gosh, his libel is next one. His next one is on governance. That's an update on that. Anyway, this is a new discipline. We haven't had it forever. So we are still learning, we do not know everything, and trying to specify exactly what's gonna happen 10 years from now is simply not worth the effort. Instead, take a proactive learning approach: start doing some stuff better, then measure it, and try to get better at getting better, because you're gonna have to conform to existing constraints and there is no one best way. It's got to be driven instead by a data strategy that complements the organizational strategy. People aren't gonna know that what you do is useful unless you can point to the data strategy and say, this caused us to do that, which caused the business to achieve something better. We've got to start practicing that. There are a lot of data strategy frameworks. We don't have time in this one to look at them, but if you just Google data strategy frameworks at images.google.com, you will see a bunch of them out there. There are also several presentations of mine, I think past DATAVERSITY webinars that we've done, that have these. Anyway, there's a bunch of them, and the idea is not for you to adopt any or all, but just to try them on and say, how would this work in our organization? IBM's data governance model may not give you as much as another organization's approach, and this gives you the opportunity to try it and see what works conceptually. Data stewards direct data management efforts. Some say, oh no, no, they just advise. No: who's gonna do it if you don't, especially if you don't have a program? They are the first and most important component of your data program. They're visible. In the Commonwealth of Virginia here, we literally called them Commonwealth Data Stewards and gave them badges. It was a fantastic move.
It helped us to make sure that everybody understood that the data steward language is metadata, that we need to speak in specific controlled vocabulary items or, more importantly, make sure that we start to turn them on and off, kind of like an on-the-record, off-the-record kind of thing. Yes, this is a controlled word, which means we all know what we're talking about in here. If that didn't make any sense at all, ask a question at the end, but that's an aspiration rather than a starting place for you. And finally, there are really good aspects of process improvement that can benefit data steward practices. People say, wait a minute, why should data stewards get involved in processes? So let's just take that as a quick side piece. If we simply impose new controls, this is going to make the water that is coming into the lake of better quality. How long does it take to wash out all the old stuff? Answer: a long time. So we've got to do some other things, and literally this is low-hanging fruit in most organizations, because most organizations spend 20 to 40% of their IT dollars specifically on IT waste. It's moving data, it's evolving data, it's improving data, and it just doesn't need to happen, but nobody even uses the techniques to understand this anymore. The technique is called a data flow diagram. It is not taught as part of the curriculum, and it has not been for many, many years. So people look at this tool and go, wow, why didn't you show this to me earlier? At least I'm giving away my age and grumpiness in terms of sighing like that, because I'm married to an accountant. And she looks at me and says, you mean you people don't have your act together? And I said, well, we're not 8,000 years old. She said, okay, it's all right if you're currently immature, but if you're not trying to improve the immaturity, then you're just whining. Don't whine. Good, no problem. But let's be real.
Your data is likely a mess, and it requires some professional administration to make up for its past neglect. It's not just that you haven't changed the oil in your car for the past 10 years; you're still driving on the same bald tires and the same worn bearings, and you've never tuned up the engine, so you're a mess to the environment. It's a hazard in many organizations. Don't be hysterical about it, because that doesn't help any, but realize that you can't just say everybody's got to do it better, because they don't know what "better" is. That's your job as a steward: to show them specific examples of doing things differently, just small, simple things. Let's not get the data from here, let's get it from here. Or more importantly, maybe we'll put you on a subscription service so that you'll have it delivered to you and we can verify receipt, right? These are all really interesting ideas, but your folks don't know them right now, which means you're the expert. If that scares you, look at it as an opportunity. The glass is half full rather than half empty. It is likely that you will need a new business data program, underlining the word program there. It's not a project; you're not given a 90-day initiative to go out and fix things. The cycles, the theory of constraints cycles that I alluded to and described in that one chapter with the picture of the book on it, those are projects, but they need to exist in the context of a program, because disjoint projects and data are not helpful. It requires a synergy, a critical mass. It is a step function. You do not dabble in it; you go straight forward. Because data strategy and information management are the major components of what goes on in data, and you are the people who are tasked with doing this. You need to do three things. In concert, collectively, you need to improve the organizational data by taking specific, concrete steps to move forward in measurable ways, writing it down as a success process for your own protection.
Because if you don't put it down, it doesn't work. Now I'm gonna do a quick side note here and bring my mother into the picture. Why the heck not? I've already called out my wife, so let's bring mom into the picture. Mom was a purchasing manager at Planning Research Corporation, which was a D.C. Beltway bandit for many, many years. And as the manager of purchasing, she always kept metadata about how well she was doing. So people would come down and ask her, and she'd say, look, here are the industry averages, and here's how we do against the industry averages. And they'd go, I guess you know what you're doing. And she said they never bothered her, which is almost amazing in a corporate environment, in an area that is subject to re-engineering time and time again. Anyway, back to this. Improve the organizational data. Improve the way people use the data. They haven't been taught how to do it. Their PhD may be in statistics, but that doesn't mean they understand the application of data. They understand algorithms, not the practical sense of it. You'll hear it said every day that all of our really good data people spend 80% of their time munging. That's M-U-N-G-I-N-G, munging the data. That's not a good use of their time. There are separate skills that can be developed to do that and to show people better ways to support this. Because only when we've got better data and better ways of people using the data, better capabilities of the organization with better data and better people, can we use data to support strategy. Otherwise, it becomes really hard. And unfortunately, our statistics around data usage show that it has continued on the same pathway: most data initiatives have been about as successful as most IT initiatives, which is not very successful.
And in that context, it's really, really hard to show tangible value, because you haven't got enough runway length to get the program off the ground. So to conclude all of this, the only way you can accomplish this is by going through an iterative approach, focusing on one aspect at a time and applying formal transformation methods. Crawl, walk, and then run your way to the top. Now, one last slide on this, just some kind of funny chapters. Oops, sorry, I don't want to hear the music there. By the way, you can Google this. It's called Data Governance Council, Council Data Government. Go Google it, you'll get a copy of this. It's playing right there, so California in the background. Anyway, real quickly, because we've just got one minute: 10 top data storage practices. You get buy-in to this, but not commitment. And again, there are issues with business versus IT. These are highly problematic issues. You've got to make sure you've got executive support, but everybody needs that. The way you get executive support that works is to make the executive appreciate what goes on, so that he or she says: Fire the data stewards? Mine save me money every year. I would never do such a thing. Two: ready, fire, aim. People buy technology and over-rely on it. They don't do enough thinking. The Einstein quote on this is fabulous. Einstein says, if I had a minute to live, I'd spend 59 seconds focusing on the problem and one second implementing the solution. Don't know whether he would have made it, but it's a great example. Trying to solve all the problems at once: not gonna work. Now, maybe if I do too much or too little and just split the difference in the middle, that'll work? Not in this case. Again, step functions are problematic in this. If people are sitting in committees wondering what's going on, you're already at risk and you need to readdress it.
If you don't implement, if you just plan, plan, plan but do not show tangible results, it is a problem. The rest: not dealing with change management, which is critical on this; again, assuming technology is the answer; failing to build an ongoing, sustainable process; and ignoring the data shadow systems, which is what actually is occurring in there. So as Shannon mentioned before, we've got a bunch of upcoming events. Data Architecture Summit in Chicago, October next month, and data governance vision in DC in the ninth or two really good events. We're gonna do three-hour activities on those ideas: data architecture boot camp and rekindling data governance in China. I will shut up because I went 28 seconds over. Back to you, madam.

28 seconds. And you know, I was ironic there as well, because what I was going to say was, oh my God, I've been talking to nobody the whole time. She dropped off the line and I think was completely disconnected the whole time.

No, we'd have complaints. Just to answer the most commonly asked questions before we dive in here, a reminder to everybody: I will send a follow-up email for this webinar by end of day Thursday with links to the slides and links to the recording, along with anything else requested. So to kick us off here, Peter: in our research community's compliance data warehouse, we use a crowdsourcing model of data stewardship. This originated from a relatively small staffing ability at our headquarters-level data management division. We have found success. Any comments on that?

Fantastic. And most importantly, how can you share that story with the larger community? Because there are wonderful crowdsourcing opportunities. State government in particular is very, very good at this process. But there's no place for us to go to find out. So I can guarantee you 10 people on this webinar are going, oh, I want to find out more. Put your details in the chat and then move the discussion over to community.dataversity.net.
And that's what we want to do. So I hope it's favorable comments. Beyond that, it's an excellent way of doing it. It still relies on volunteerism, but gosh, there's a lot of enthusiasm in our community, which is wonderful. Absolutely. And a follow-up question to that: with this approach, is a formal data stewardship position something that we should consider for the future? Yes, so what we're trying to do with data stewardship as a practice is to have people who literally are experts in the data. They may not understand algorithms. These may not be data scientists, but what they do understand is the context: where the data comes from and where it is being used. And they understand how the organization does it and how important it is. So if somebody says, well, I'm gonna turn that off for 24 hours, they go, ah, that's $6 million in sales. So do you really want to do that? Again, I'm making up an example here. But the idea is, let's keep the focus on things that are tangible. Because if it's all academic, it looks like a science experiment to management. And we most definitely do not want that perception. Love it. So any suggestions on how to get data stewards that are assigned by functional areas to think outside of an application and only on a data level? I love lunch and learns. Do them all the time. We all tend to eat lunch. Sometimes it's fun to sit and listen and learn some stuff. So a lunch and learn that I encourage organizations to do is to have a steward community periodically get together, kind of like mini DAMA chapters or Dataversity communities, and talk about what's going on so that you learn something. I have literally had research scientists who worked together but never talked about their work, because that was work and they were friends. And it wasn't until one went to a conference and saw the other one get up and deliver a paper that he realized that he had invented chocolate and his friend had been working on peanut butter the entire time.
And they should have had this conversation 10 years ago, because we love the combination of chocolate and peanut butter. Must be hungry. It's lunchtime. What would be the first steps towards data stewardship in an organization with no skills and resources to do data stewardship activities? Data awareness is low in parts of the business, and in parts there is reporting work. It helps to have a short, wonderful message sent to everybody by the chief executive of the organization. Sometimes you're able to achieve that, but I have literally seen organizations pay hundreds and in one case thousands of dollars to craft a short one-minute animation that was cute and introduced the concept of stewardship to the organization in general and said, hey, it's not that we're gonna go from, you know, the non-data-stewardship organization to the best data stewardship organization overnight. We're gonna ease into this process and things are gonna be different. We're gonna try to make some changes gradually and we want you involved in the decision-making process. All of that can be crafted into that one-minute message, but it probably takes 59 times that one minute to come up with content that is useful and digestible by your organization. See, governance, and therefore stewardship, is personal to each organization. You cannot prescribe data governance for every organization across the board, because organizations and groups within the organization are at different levels of maturity. Very, very difficult to keep all of that together. So it's about reading the room and figuring out what's going to work in the room, and making it a repetitive process so that you can get better at the pieces. Let me just give a quick example in parallel here, and I apologize for this sounding like a commercial, but I wanna do it just to shout out to my guys.
So at Data Blueprint, we have a number of product and service offerings, and we know how long each takes us and how much effort it takes to do, and if we go and do X amount of work, it costs the customer Y, and it's a mutually beneficial transaction. We are getting better at that, because if we do the same thing and it takes us less, then we end up being able to save the customer more money in the long run. Well, this is the same process stewards are involved in. It's very entrepreneurial, particularly at first, because we don't have industry-wide numbers that say organizations with data stewards did 20% better in X than organizations without, because we haven't got even a baseline to measure what X is, much less what the actual improvement is over time. I went all professor on you there, Shannon, sorry, but that's a great question and I hope that answered it. No, it's great. I always love the longer answers and details. And do you have recommended or preferred metrics to measure the success of data stewardship within an organization? You just cued that one right up for me, didn't you? No, we have surveys that say how many organizations have stewards, but comparing them is apples and oranges. We just do not have a basis for comparison. Again, many of the leading components are registered, and you can search Dataversity for other topics. There are a number of my colleagues that do really excellent work in this area. I've mentioned Dave Plotkin already, and there are lots of others that are really good about this, but no, we have not come up with a definitive set of measures, and that's a problem. Accountants can look at how long it takes to close the books each month. Just imagine if they only closed the books every once in a while, infrequently, without warning. That would not work out really well, and that's probably closer to what our data world works like. Works like, there we go, that's a good thing. All right, better shut me up, Shannon, I'm babbling now.
Or can you do improved data quality or such? Gosh, you know, it's funny, great suggestions. So again, a stewardship outcome can start from: we have data that is of poor quality. Actually, I can tell you a story for the very first time here, because I just got permission to use it. I have a friend named Mel, and she's a data steward for an organization, and the rest of the group, to welcome her in, said, by the way, as the data steward of this organization, you now gain control of the data set that sucks. And Mel kind of went, boy, did I get lucky in this new job, thinking to herself, this is not a good move. And she said, well, why does it suck? And they all kind of looked at each other. These are her coworkers in the organization. And they said, well, we don't actually know. You know, it was given to us and we were told it sucks, but we don't really know anything about the data set. It's simply yours now. So I guess if you wanna find out why it sucks, you'll have to figure it out yourself. Well, if you know Mel, she is a very determined individual. She dug in with both teeth, made a meal out of the process, came back and said to the same group, assembled two weeks later, hey guys, you know that data set that sucked? Guess what, it's 98% accurate. And the room was dead quiet. You could have heard a pin drop. And from that point on, literally, this was a game changer in her organization. Mel is the lady with the 98% accurate data set, and 98% is far higher than they ever thought they would achieve in that data set. So it is possible to develop metrics, but we do not have them across industry yet. Our organizations collectively, these communities that we've spoken about on the call here, are the ones we want to use to develop these. If you're interested, hit me up, I'm happy to speak with you. If not, just jump into DAMA International or Dataversity, excuse me, community.dataversity.net. Love it.
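Editor's sketch: the story above turns on measuring how accurate a data set actually is rather than relying on its reputation. The transcript does not say how the 98% figure was computed, so the following is only a minimal illustration of one common approach: count the share of records that pass a set of validation rules. The field names, the rules, and the sample records are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]


def accuracy_rate(records: List[Record], rules: List[Rule]) -> float:
    """Return the fraction of records that pass every rule (0.0 if empty)."""
    if not records:
        return 0.0
    passing = sum(1 for record in records if all(rule(record) for rule in rules))
    return passing / len(records)


# Hypothetical rules; a real steward would derive these from business definitions.
def has_customer_id(record: Record) -> bool:
    return bool(record.get("customer_id"))


def amount_is_non_negative(record: Record) -> bool:
    amount = record.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0


rules: List[Rule] = [has_customer_id, amount_is_non_negative]

# Hypothetical sample: three valid records and one with a negative amount.
sample: List[Record] = [
    {"customer_id": "C001", "amount": 120.0},
    {"customer_id": "C002", "amount": 0},
    {"customer_id": "C003", "amount": 35.5},
    {"customer_id": "C004", "amount": -10.0},
]

rate = accuracy_rate(sample, rules)
print(f"{rate:.0%} of records pass all rules")
```

Tracking a number like this over time, for one narrowly scoped data set, is one way to give a steward community the baseline the answer above says is missing.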
Peter, do you believe that training of data stewards is best done in-house or by external providers? External providers can give you the frameworks that we've talked about in the set here. Working with providers on an actual customization of a framework into something that works for your organization is generally very useful the first couple of times, but the goal is to sort of quickly work yourself out of a job in that sense. So I think there's value in it, but I would also be extremely cautious, because organizations tend to like to develop repetitive products. I just told you how our organization likes to develop repetitive products. There's a known quantity that you can put in place, with less risk in doing this. It's very important to try and get that as quickly as we can in stewardship by having repetitive activities. I'm not talking about mind-numbing factory activities, but I'm talking about the process of responding to a fire as a proactive activity of a data steward community, so that the organization just gets better at the process. If the expertise is diffused throughout the organization it will never become useful, but if you concentrate that expertise in a center of excellence in stewardship or something along those lines, you have the ability to then take that organizational capability and get better. And if you look around at well-performing companies, you will almost always see somebody from that company presenting at Dataversity events, and you can see how those things work out. Another great way of sharing information among us in the community. Gosh, do I have a theme going today or what, Shannon? Yes. Do you assign data stewards or hire new ones? Digital natives are generally quicker to usefulness in organizations, but people that understand business practices and architecture and can think in systems terms are also useful. So generally a combination of the two.
You may hire those digital millennials or digital natives from outside of the organization, or you may ask your existing folks to bring in their kids, because you're just as likely to get good ones out of the process. We do it all the time. It's a great trick to work with people. Again, hiring new ones can make sense, but if somebody says they were a data steward at organization X for three years, the next question I would immediately ask is, tell me what you did in terms of moving the needle here. What were the tangible accomplishments that prevented you from becoming redundant, as the British say? Listing those out will be a very, very strong sign that the individual is focused in the direction that is likely to produce good results for your organization. Absent that, you may have somebody who likes to sit around and have conference calls and do things, but really our goal as stewards is to move the needle. It is not to sit around and pontificate. Ah, now I've cast aspersions on professors, right? It's a great word, pontificate. What is the functional difference, if any, between data stewards and data coordinators? Okay, stewards can perform a coordinator role, but that's a limited subset of stewardship. Coordinator implies that one is actively gaining access, and this may be an instance where your organization has customized that more so than the general population, but coordination implies, you know, somebody needs something, you help get it done. Stewardship would also bring into the picture whether it should get done. Ethics are a very strong component of stewardship. I should add that to this deck, because that's bad on me to leave it out. Stewards have a fiduciary responsibility, and that fiduciary responsibility is key to driving the stewards in the right direction. Now, it's not that people are always going to do the right thing, but at least stewards will make sure the question is incorporated into the process, and that is critical.
Great question. Thanks for prompting my memory on that. So, Peter, any thoughts around which stewardship models work best: central, federated, et cetera? Yeah, we had a paper that was very famous in the late 1980s about whether centralized data processing was better or worse than decentralized, and of course the answer is a Tversky answer, which is that the question is which is best under which circumstances for which organizations. I think the answer to these three models that you've described is not going to be A, B, or C, but a federated model is more complex to administer, and maybe is appealing to organizations as a second step, after they've gotten their feet wet with a limited pilot scope and learned some lessons in private instead of in front of everybody else. Again, the best way to say it is that each of these has strengths and weaknesses, and your job as experts in your organization and in your organization's data is to try to figure out, how can each of these models improve the way I apply my data to the organization's strategy and save lives, save money, whatever it is, improving effectiveness and efficiencies, a limited number of things, oh, faster. Faster, better, and cheaper. That's the way we want to do it. How do you determine when is the right time to transition from X number of part-time data stewards to Y number of full-time data stewards? This is where business cases are extremely helpful. The process of saying, I've got, again, I'll use an extreme example. This is probably not correct for your organization, but many organizations are challenged with a variant of this. Stewardship is everybody's responsibility, which really means nobody's responsibility, but everybody in our group, 10 people, all devote 10% of their time to data stewardship activities, and we've got it covered. All right, well, it is possible to measure outcomes.
It may not be possible to measure the entire outcome, but if you measure part of the outcome and part of the outcome is still a large number, who cares whether you've got all of it or not? I have an example where stewards saved an organization billions, with a B, of dollars. They are heroic within that organization. Great story, have to save it for some other time, but the idea here again is, what can we do by building a case? By focusing in on something specific that allows us to produce value for the organization. Because otherwise, again, just an industry statistic: half of the chief data officers that exist today currently have no staff or no budget. It's a problem either way, and we are still not where we need to be in terms of being able to effectively do our job, so it's trying to do more with less. But if you measure the output from the stewards on the part-time basis, and then request permission, or don't request permission, lie and say, I've got somebody working full-time for a year, and compare the numbers, I think you'll find that the increase in productivity, the increase in organizational capabilities, is noticeable from the full-time concentrated effort. First of all, it gives you as a steward somebody to work with, and generally, if you're both motivated in the same direction, you will be more productive as a team than you will as an individual, and that other individual can now learn the process as well. Anyway, long-winded answer, but hopefully that gives an approach. Most definitely. And if you have questions, we still have 15 minutes. Feel free to submit them in the Q&A section in the bottom right-hand corner of your screen. So Peter, in an organization where data stewardship does not exist, how do you identify and assign data stewards? That is a great question.
So you can try, of course, the overall process of saying, well, we need to develop a data stewards organization, and it's a big, big issue, and we've got to really think about it, and let's all get together and figure out what we're going to do, and all that sort of thing. It's hard. I've worked with a lot of organizations going from zero to one, right? And actually, just for full clarification, it's not zero to one, it's one to two, with one being we have nothing and two meaning we're at the lowest level of the framework, the maturity curve, on all the measures in there. The idea is getting that started. People say, it's working now, why should it be a problem? And what I tell people is that what we're selling in the data steward community, and really in the data community as a whole, is shoes. But we're selling shoes to a population that has never understood the concept of putting something around your foot. So we get to places. I go outside and I step on acorns and stones and gravel, and my feet are tender, and over time they get stronger and I can go further. I occasionally cut myself. But we get things done. And I'm over here in the corner, the shoe salesman, Al Bundy, anybody get that joke? Anyway, the shoe salesman who's saying, look, if you put these on your feet, things will not be easy, but you will go further. It will take you less time. It will cost less and it will deliver better results on the other end of it. And everybody says, ah, it's okay, we're doing fine. We've got this feet thing down pretty well. We don't need those stinking shoes. And it seems obvious to those of us that are in the community, but it is not obvious to everybody else. And the way we make it obvious to them is by finding something that is meaningful to them and the organization and showing how better treatment of data will help to resolve that. I have gone through a hundred IT failures in depth as part of a research program.
And literally all of them, of course, involve a data component. And even if it's just a subcomponent, it's still a lot of data problems that could be solved by a more unified approach. That is what we were lacking, but we're moving in the right direction, trying to coalesce the community around these ideas. Peter, we plan to hire analysts for full-time data stewardship and use the data office as an incubator for their data training. They will participate in an accredited program and work with agile teams. Do you know any examples of success and any lessons learned? I have about 10 organizations right now that are attempting the experiments that I've described to you about working within the agile context. But the first piece of this has to go to literally the definition of the word program and the definition of the word project. An agile sprint cannot result in changing data requirements, or in data requirements that are less precise at the end of the sprint than they were at the beginning of the sprint. And if you use that as a guideline, I think you'll find that agile works very, very well at doing what it does best, which is creating better, high-quality software faster. And you need a data program that is synonymous with your HR program. After all, nobody in your organization says, we've done a whole bunch of data around here. I'm sorry, we've done a whole bunch of hiring around here. I don't think we're going to hire anybody else, right? We're done with it. HR people, you guys can go find other work within the organization, but we don't need an HR function anymore. Of course, that's ludicrous. And yet we have exactly the same role to play organizationally, and it is not recognized. And until we as a community pull together stories around this, it will stay that way. This is something else that will help with your motivations as well: does your organization have a burning bridge issue?
Many organizations have an oopsie, you know, Equifax for example, a little oopsie there, or Marriott hotels, right? Lost my passport number. These are interesting things. By the way, the Marriott passport numbers, interestingly enough, never showed up on the dark web, which means they're probably in the bad guys' hands rather than in the hacker community's hands. That's a story that might get management to pay attention to this. The Target board of directors was attacked by shareholders for mismanagement of the data assets during the Target break-in. These are things that can get management's attention, but hopefully not in a threatening way. Hopefully you guys can make a proactive case on that. Of course, if you need some help, give us a shout. Happy to help. Perfect. And are data stewards typically senior managers or specialized data analysts or SMEs? It's of course a typical answer of it depends, but let's go back and look specifically. Let me pull the chart up, and if you'll reread the question to me again so I can fix the, oh, there we go. So you guys will recall I was just babbling, and, you know, the way we define this, IT is the pillar and we've got some attributes of each of these, but they fall in different areas within here, and here's our four pieces up here. So again, leadership, stewards, participants, experts, and other sources and uses. Shannon, the question once more? Yes, so are data stewards typically senior managers or specialized data analysts or SMEs? And again, that's going to vary from organization to organization. There is not a standard practice. What you will find in organizations is that they are people who are passionate about this. Larger organizations decide that stewards must reside at a job classification level, so then they say, well, only if you manage 10 people can you be a steward, and, you know, other arbitrary rules around this as well that aren't really supported by any good scientific basis.
Your data stewards should be the people that you trust to manage data as an organizational asset in support of your strategy, and they should make sure that actually gets done. They are not necessarily the doers, but they are the people that are making sure it gets done. If we put a RACI matrix up you'd see how that works, but I don't have that slide prepared. So they could be senior managers, but I've seen organizations where the stewards actually reside out in the field and are fairly low. The key is, who are the people that need to be acting as trustees? That trustee role implies a fiduciary relationship, and that fiduciary relationship puts them in the class of accountants, of lawyers, of other professionals who have to protect information. And Peter, can you recommend a certification in data stewardship? I know some companies have certainly created their own. I'm aware of a specialized government stewardship certification for the government that is run by a private company in conjunction with the ICCP. DAMA also has some certifications that are available. Again, what we're lacking overall is the official decision as to what is a standard around these areas. That's a community discussion that simply hasn't occurred yet. What I would look for, if you're seeking private assistance in that area, is that they draw from what are clearly the a priori standards in the field, which are the Data Management Maturity model from CMMI and the DMBOK from DAMA. If they're not using those, I would really question what the basis for the certification is. For example, just a couple of interesting ones: there's a certification for chief data officers out of George Mason University. George Mason is my alma mater. I'm thoroughly pleased with that. I think it's out of the public policy school as opposed to, you know, perhaps a data engineering group of expertise in there.
And these are going to influence these choices, which is what's going to make all of these things work within the organization, to the degree that it actually works with what you're trying to do. Anyway, hopefully that made sense. You'd tell me if it didn't make sense, right, Shannon? If I was just sort of word-salading over here. Yeah, and I know that the community here will let you know if it doesn't. Very helpful guys, you know, typos on slides, even though they're so useful, sure. Our community is not shy. So, there is one person in my org that is doing data governance, working with a group of service line leaders to build a dictionary. How can we show value? I think this should be multi-dimensional to show value. The value in a dictionary is not showing through. Great question, and a very practical application, because the guidance around this is that it is very difficult to actually do good things with your data if you can't at least name it. That's what a data dictionary is all about, whether you start with glossaries or dictionaries or metadata repositories, however you're deciding to approach the process of writing things down and formalizing what actually works. So the context here is that as you're looking at each of these specific instances of defining things, don't start at A and say, we'll work our way through until we get to Z. Instead, focus in on something that directly affects the problem, and then take lessons from that process to reapply so that you can get better at what it is that you do. Shannon, I think I wandered off there, would you reread that question real quick? Sure, there is one person in my organization that is doing data governance, working with a group of service line leaders to build a dictionary. So how can we show the value of a data dictionary? Because it's not current or valid. Great question, yes, as I said before and sort of lost the current thought. Value of the data dictionary: I wouldn't try to quantify it.
I did spend the summer of 1993 in the basement of the Pentagon literally shoveling metadata into one of the DOD data repositories, because somebody had promised that there would be 17,000 terms in that dictionary by August 17th. Why I remember that, I'm not sure, probably the alliteration. But anyway, just having numbers of data items in the data dictionary is good, and that's comprehensiveness. But having a couple of things defined well, and showing that defining those things well improved the business practices and support for the mission in these tangible areas, is better. And so, rather than trying to be comprehensive about the process, focus in on a narrow area and show how bringing clarity and definition and, more importantly, accurate information about the data to the process has helped the organization improve its productivity. Again, let's just take an example here completely out of the blue. If law enforcement understood that facial recognition software had a 10% misinterpretation rate for many types of individuals, they would probably be more careful about its application. And yet data in policing is an important concept that tends to be handled in another part of the organization, whereas what we really need to understand is how the folks on the job use this to carry out the protection mission that they have. Does it make sense? With 10% errors here, I would be a lot more careful about what's actually happening. Now, interestingly enough, you can Google this and find out that they did it for, I believe it was the San Francisco Board of Supervisors, and came up with those statistics. And they did it, I think, also for the U.S. Congress and got similarly disappointing statistics. It's not that facial recognition doesn't work, but it has its strengths and its limitations and needs to be used and understood in this context, which, as I said before, also gets us to some ethical considerations.
But gosh, Shannon, we are almost at the half hour, so I'd better shut up or you will get frustrated with me and just pull the plug on us, right? All of a sudden we'll just go quiet, beep. Indeed. Well, that does bring us almost to the half hour. There are a couple of other questions here, but I don't know that we can get to them in less than a minute. So I will pass those on to you, Peter. Maybe we can get a couple of written answers for the follow-up. And just a reminder, I will send a follow-up email by end of day Thursday for this webinar to all registrants with links to the slides and links to the recording of this presentation, along with Peter's information as well. And if you'd like to continue the conversation and networking, as Peter has mentioned, you can go to community.dataversity.net. And thanks to all of our attendees for being so engaged in everything we do. Just love it. I love all the networking back and forth, and the communication you guys have going on is just the best ever. And we have a full fall ahead of us. Oh, I love it. It's awesome. And we've put together the 2020 series already. So I'm excited for that. All right. Well, thank you guys for participating in this. Thank you, Shannon, for hosting us as always. Thanks, Peter. Have a great day. Thanks, Bob. Sure.