Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager for DataVersity. We'd like to thank you for joining today's DataVersity webinar, Getting Restarted with Data Stewardship. It is the latest installment in a monthly series called DataEd Online with Dr. Peter Aiken. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DataEd. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom right for that feature. And to answer the most commonly asked questions, as always, we will send a follow-up email to all registrants within two business days containing links to the slides. And yes, we are recording and will likewise send a link to the recording of this session as well as any additional information requested throughout the webinar. Now, let me introduce to you our speaker for today, Dr. Peter Aiken. Peter is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide. He has more than 30 years of experience and has received many awards for his outstanding contributions to the profession. He has written dozens of articles and 11 books. The most recent is Your Data Strategy. Peter has experience with more than 500 data management practices in 20 countries and is consistently named as a top data management expert. Some of the most important and largest organizations in the world have sought out his expertise. And Peter has spent multi-year immersions with groups as diverse as the U.S. Department of Defense, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia, and Walmart.
And with that, let me turn everything over to Peter to get today's webinar started.
Hello and welcome. And welcome to you, Shannon. We were just chatting before we got started about how we've been doing this for so long. We're starting to feel, oh, do we actually dare to say it, old? But anyway, real pleasure to be with everybody today. And the topic here, I hope, appeals to you if you haven't started a data stewardship program. But if you have, maybe there are some things that we can do in here that will help point you to where things probably will be a little bit more productive. Because, of course, as Lewis Carroll famously had the Cheshire Cat say to Alice in Alice in Wonderland, if you don't know where you're going, any road will get you there. So, started or restarted, these are some things that we'll talk about today, specifically definitions: what it means to be a steward and what it means to steward data. This has to occur in an architectural context, and we'll talk specifically about some of the unfortunate confusion that you will encounter as you're attempting to do this. We'll talk about challenges around the educational area, and we'll finish up that first section with the role of strategy, which essentially talks to the why of data stewardship. When we go to the next one, however, the how, what we're really looking for at this point is to specifically start to look at relationships within governance, and we'll get those out. And I'd like to talk in particular about a fire station model. So we'll see how that fits into the whole thing, because part of what you're doing is going to be reactive and part of it should be proactive. And that's what we're really trying to do here, is to get those two parts in. Then we'll talk about how your stewards have to work, and that is within something called the systems development lifecycle concept.
And we'll talk about different cadences that have to occur within there, as well as a real challenge in terms of the structural approach. We'll talk about some foundational prerequisites and, most importantly, and the thing I'd like you to take away from this, is to not over-engineer your first version of stewardship, or your second. Maybe by the time you get to your third one, we'll do that. But the real need for all of this is a drive for simplicity, because while we may understand the nuances that we're talking about, our external audience is not nearly as familiar with these concepts as we'd like them to be. So if we were in a class situation, I'd actually ask you guys to raise your hands, you know, show of hands, how many of you are starting this for the very first time and how many of you are restarting this. And it turns out usually it's about 50/50 each time as we do this. The next piece around this is that we look constantly at people who are doing steward-type activities. And while these are wonderful, they again have a real good sense for what they're trying to do, but unfortunately it's local. So when you see something like this, which was a wonderful little meme on the web a little while ago: the one person at work you can't possibly live without. And that's certainly a true statement in there, but it's not the best way to introduce these topics to people. Here's another, probably more realistic perspective of somebody trying to do this in a healthcare situation. And they're looking at some lists and seeing, oh, on the left-hand side here, I've got laboratory codes, L-O-I-N-C, right? And the steward for that, they may be able to look up and see, is a lab director or something like that. But this is how most people outside of us on this particular event tend to understand how data stewardship works. So let's drop back a little bit further on this and specifically look at definitions, which is always a good place to start out.
A steward is a person who looks after passengers on ships, trains, automobiles, you know, stewards, right? Not exactly where we want to be in terms of what we're trying to do. Another one: an official appointed to supervise arrangements or keep order at a large public event. Since we're not having many large public events, that's not a good one. How about a person employed to manage another's property? Yes, there we go. Custodian, caretaker, steward of the estate. And what does a steward do? Well, they keep, manage, look after, those types of things. And therefore a data steward in particular is going to be looking at managing the data assets on behalf of others and in the best interest of the organization. Quoting Danette on that particular one, it's a great definition. And they represent the interests of the stakeholders and take really an enterprise perspective as we're looking at this. They've got to have dedicated time to be accountable and responsible. And there is a component of trust in here. If you're in a steward relationship, somebody is trusting you. So we need to throw that concept in. One last concept as well: fiduciary. So if you gave critical medical information to your physician and your physician breached your fiduciary trust by telling somebody else something about your medical condition that they shouldn't have, we all know that's a HIPAA violation. And that's where things go, because we've been working with HIPAA now for about 10 years. And quite frankly, nobody really understands it. That's a different story entirely. So when I said start off simply, I do want you to do this. There's a great book out there on data stewardship written by my friend and colleague Dave Plotkin. And he puts in there a whole bunch of different steward types. There's a business data steward, a technical, a project. I won't read them. You guys can read them. And my point is this is not the right place to start.
Because when you start off with that many, the next thing somebody will figure out is, oh my goodness, you've got that many different types. We need an auditor to be able to go back and do this. And we need a manager so that the manager will take over it. So there's David's book in the upper right-hand corner. It's a very good book, but not the way I would start. Because if you're trying to figure out how your stewardship plan is going to look in five years, it's probably not the greatest exercise for you to go through. So let's simplify even this and say, just as a steward, let's take a look. "One who actively directs": great definition from Webster's. And if we add the data component, we can say one who actively directs the use of organizational data assets in support of specific mission objectives. That's exactly what we're looking to do. And there are now days of the international stewards where we're actually making some very good progress around all of these. So the question becomes, what do data stewards do in our organization? And they should be looking at improving the organization's data asset value, advocating or evangelizing increased scope and rigor around data-centered practices, and ensuring efficient and effective data management practices. Now all this starts out in our DMBOK wheel, and we're looking really in here at the center of the wheel, the data governance portion. If this is the first time you're seeing this, you need to attend some of these others so we can tell you a little bit more about the DMBOK wheel on this. Let's talk now about governance and architecture. Corporate governance is something that has become more and more popular in the papers today. And I've got a couple of definitions here for you, but there's a more interesting definition of all of this. And that is that, interestingly, a bunch of CEOs got together in August of last year and said that our primary objective is no longer to make money.
We also have to incorporate some social good into the work that we do in this. So governance from a corporate perspective is continuing to evolve. And that's actually a great reason why you guys should not let anybody try to pin you down with what governance is going to look like in five years, because we can't even tell you what corporate governance is going to look like in five years. Well, of course, if we've got corporate governance and, you know, at the bottom we want data governance, let's talk about IT governance specifically. And the focus here is on how organizations align IT strategy with business strategy. After all, if our strategy is to go mobile and we don't have a good technical infrastructure for mobility, that's going to be a problem. We're also looking specifically at some measurable results. And the measurable results should really focus around a few key questions, not a lot, a few. I can remember joking with a colleague recently where we worked with an organization that had 1,400 key variables they wanted to measure, and it's just too many. So IT governance focuses on strategic alignment, value delivery, resource management, risk management, performance measures. All of these things are going to be important as you start to look at governance. So why do we need to do data? Well, let's just talk, for starters, about the volume. The volume is enormous in many organizations. If you're doing something a billion times a day, if you can make it a little bit better, times a billion, it's going to add up here. So lots of statistics. Your organization probably has some very similar statistics, and they're really good reasons for making sure that you do in fact have people charged with reviewing all this. Now, I mentioned we're going to throw architecture in here, just a very brief architecture lesson. Architecture is about things and the function of those things, what they do, and how those things interact.
And again, we have lots and lots of different ways of describing this. The really interesting part: the call I was on just before this one was fascinating, because my colleague Chris Bradley tells the story that he likes to tell, which is that he had a CEO come up to him at one point and say, we don't really need any data architecture. As long as I just glue SAP together, we will have an architecture. And by golly, he's absolutely correct. Your organization has an architecture, whether you like it or not. The only question is, is your architecture able to be understood? It cannot be understood if it's not documented. And of course, if it's not understood and documented, it cannot be useful. Now, of course, we want to talk about these data contexts as well, to make sure that everybody's talking about them. We do have some important functions though. I'm going to show you an old joke, but it nevertheless tells you about architecture. Amazon coming up with a traditional architecture. Google, team of three. Again, you can tell these are old, right? Facebook, do we really have a structure? Microsoft, we're trying to eliminate our own products back and forth. Apple revolved around one individual on this, and Oracle, which buys one company after another and really has a bigger legal team than an engineering team. Again, according to the joke. Now, each of these organizations is going to have different architectures, different components, different ways in which they go about the entire process of doing their value add. Organizations should in general manage architectures along these lines: process, systems, business, security, technical, data of course, and information at the bottom of it. If your organization does this well, first of all, understand that you are already in the top 10% of organizations that are performing. Most organizations do not even attempt to do this.
So the fact that your organization is attempting to run one or more of these as a formally managed process is fantastic. And yet, if management understands that you have people sitting around in meetings and they're not sure what value the people in those meetings are producing, it is a big challenge. So we want to make sure that people don't get the perception that your stewards are sitting around in meetings all the time being bored. Another really important aspect of this is the concept of understanding, and I'm showing the Tower of Babel here. The question, of course, is understanding this: we need to make sure that it's understood not just between us and the computers, but also among the computer components as well. And again, the stewards are going to be the ones who are going to provide those very, very useful services in order to get us there. Now, I mentioned earlier that we are at an early stage of this kind of process here. We've not been doing data governance for nearly as long as our colleagues in the accounting profession, for example, who've been literally doing this for thousands of years all the way around. So consequently, we're not even sure what the proper definition of data governance is. These are seven popular definitions put up by people that I respect and who have tried to do a good job. But I find that they don't make for good elevator pitches. Again, you can imagine stepping on at the 20th floor with the boss and saying to him, I do data governance and I'm a steward and I help to exercise the system of decision rights and accountabilities for information-related processes, blah, blah, blah, blah, right. And the boss is just going to look at you and say, I'm not sure what you're talking about. So I like a simplified definition of data governance, and my simplified definition is pretty straightforward: managing data with guidance. Now, there's two things that help us out here.
First of all, if you give this definition, immediately somebody asks the question, goodness gracious, I wonder why we would want our sole non-depletable, non-degrading, durable, strategic asset managed without guidance? Doesn't seem like the best idea. I'm going to add one more piece in this: when I'm talking to managers, I talk about managing data decisions with guidance, and that's important as well, because we all need to be speaking a common vocabulary around this. And unfortunately, we have some cleanup to do in order to get this done correctly. The cleanup involves a lot of confusion that we've had in this industry. IT has thought for years and years that data is a business problem, and their attitude is summed up by this very pithy and unfortunate statement: "If they can connect to the server, my job is done." The business, on the other hand, looks around and sees somebody with a title, chief information officer, and says, who else would be taking care of the data? And of course, you can see that data falls in this enormous gap that we have between the business and IT, and that's something that we need to work to close. So part of the stewards' work is to come and fight against this perception that data has either been taken care of and we don't need to do anything with it, or data is a real mess. And the answer of course is somewhere in between. I don't usually show Dilberts on these, but I'm going to do one here because it just so perfectly captures some of the things that we want to avoid as data stewards, or try and simplify from our existing data steward program to make this new version of the program come out much better. So just to read it here for you, Ted says the committee decided that the file naming convention will start with the date in the order of month, the year and day. Of course, he's wrong already because he's got it backwards. Then a space and the temperature at the airport and the hat size of the nearest squirrel. To be perfectly honest.
It was a long meeting and we probably didn't do our best work towards the end. Again, same thing here. We don't want your stewards to be stuck just doing mindless, meaningless stuff. We want them to be focused on specific things, and more importantly, I'm not just going to say we want them to, but we need them to do this. And the reason is quite simply that we in the academic community, and I'm a professor at a university here, so I get to throw stones inside the glass house, we don't tend to teach knowledge workers anything about data. And of course, how many of them work with it? Well, the answer of course is all of them work with it daily. It's even worse when we look at what we've taught students in IT about data. We give them one course: how to build a new database. If there is a skill on planet Earth that we do not need any more of, it is how to build new databases. And this unfortunately leads to an impression, because our IT and our business leadership have been through these courses. And they've looked at what we've taught people and they've sat in those courses as well. And they say, I'm melding two SAP systems together. I do not need to have a data person here because it doesn't involve building a new database. They get the idea that data is a technical skill that is only needed when developing new databases. Let's take a look at what that results in. And the answer is quite simple: a bad spiral. In fact, it's a bad data decisions spiral, as I call it. If the business decision makers and the technical decision makers are not data knowledgeable, then we will have bad decisions resulting from the lack of knowledge. Those bad decisions will result in poor treatment of organizational data assets and poor quality data. That will lead to poor organizational outcomes. And if you're not careful, you will get stuck in a thing very much like what's on the back of the shampoo bottle: lather, rinse, and repeat. Of course, we don't want to do that.
And to break out of the cycle, we need to look at strategy. But before I do, I'm going to just add a, in this case, Morgan Freeman comment on here, which is a wonderful one that just says this is bad. And when you think about it, this is wrong. It doesn't work and it hasn't worked and it's not going to work. Let's look at the role of strategy in here. I mentioned strategy is how we avoid the meetings becoming boring and useless, at least for too long. So first of all, our use of the word strategy, you can see here by the little Google graphic, didn't start until the 1950s. Before that, it was exclusively derived from the military. But my favorite definition for the word strategy is a pattern in a stream of decisions. Let me give you two very simple examples and then a complex example of strategies at the organizational level. First one is Walmart's former business strategy: every day low price. You've all heard this. Not only have you heard this, but every associate that works for Walmart has heard this over and over and over again. So if you get on the Walmart Express to the Bentonville Airport, the little regional jets that bring you into where Walmart headquarters is located, everybody on the plane will be talking about this every day low price. And the reason we all know it and understand it is because Walmart has been absolutely on message on this topic for many, many years. So, a pattern in a stream of decisions. Similarly, while you may not be hockey players or hockey fans on this, Wayne Gretzky's definition of strategy is also a good one. He skates to where he thinks the puck will be. Imagine you're on ice; this little plastic thing in the middle of the screen there is running around faster than you. If you're chasing the puck, you're never going to score. But if you go to where you think the puck will be next, you have the possibility of scoring. That's the definition of strategy. Two easy ones. Let's do one difficult one.
This is an example, in this case, looking specifically at what the ideas are when somebody is looking at beating up two people who are bigger than you. Now in this case, the two people are two armies. The red army is the British. The black army there is the Prussians, and the French are at the bottom. This is of course Napoleon at Waterloo. And the answer to the question of how one ends up besting a superior force is divide and conquer. So let's take a look at how that would actually work out in a strategy situation. Napoleon in this case, and by the way, this example is still taught in at least American strategy doctrine in the army at this point as a good example of thinking about strategy. Napoleon had noticed that the Ostend port supplied the red troops and that the Liege depot supplied the black troops. So his goal with divide and conquer was to hit them really hard in the middle, just like I'm showing you there on the screen. And if I hit them in exactly the right place, was his thinking, they will move back and apart. That's important. Back and apart. All right, so we've divided. Now we have to conquer. The conquer part is everybody turn to the right and we'll beat up the Prussians first, and then everybody turn to the left and we will beat up the British. And this is a complex strategy. First of all, imagine hitting armies both very hard at the same spot, and then turning right and defeating the Prussians, and then turning left and defeating the British. And, by the way, someone is shooting live ammunition at you the entire time. You see what I mean when I say strategy can be hard. You've got soldiers under fire. They've got to remember three things: hit them hard in exactly the right place, everybody turn to the right, and everybody turn to the left. And if we don't all do that, we're going to die. We won't make it quite that dire, but it is a problem. And let's talk specifically about how malleable strategy should be.
If I've got good guys on the left and bad guys on the right, I'm going to use a different strategy to defeat the bad guys than if instead I am here as the good guys and the bad guys are down here, or, a contrary example, if the bad guys are up there and the good guys are down there. So all of our planning around that strategy was not really a very significant amount of work that we were putting in place. Again, remember it is a pattern in a stream of decisions. And if you have a strategy that's on the shelf, it's not going to be useful. Our stewards are the people who are following these data strategies. And we have to make sure that they understand: every day low price in the case of Walmart; in the case of Gretzky, skate to where the puck is going to be; in the case of army strategy, right, a pattern in a stream of decisions. So as we look at all of these things together, we see that data strategy provides a focus for these stewardship efforts. And our stewards are people who are attempting to leverage the data, and they're going to leverage the data with a series of technologies. In this case, I'm showing you the technology as a lever and a fulcrum. And the process is that the stewards are going to be the people who help to do this. And if we increase the data ROT, sorry, decrease the data ROT, the leverage will be increased around that entire process. So that's the overall piece that we're attempting to get to when we're looking specifically at these. So stewardship terminology is not widely understood. We do not yet have agreed-upon definitions. It will be up to you, however, to get the definitions within your organization so that you can create a de facto standard and have the stewards work effectively with the architectural components, focusing their stewardship activities around those specific pieces.
Hopefully that gives you a little bit more about the why. Let's talk now about how. In many organizations, I've had them draw these charts in one form or another on the board, which is the idea that we've got some data and we've got some knowledge workers, which are great. And they're trying their best to convert the data into information, to add value to it. But this is overly dependent on human beings, on what we call wetware, which is the stuff between our ears. As knowledge workers, they rely on informal communications. And often this is described as the weakest link in that process. So when we look at that, what we really discover is that organizations have very little idea what data they have. They don't know where it is and they don't know what their knowledge workers are doing with it. And let's talk specifically about work groups, because work groups are how work gets done. Data stewardship tends to happen pretty well at the work group level. In fact, you could say it's a defining characteristic of a work group: if they cannot interchange data easily and effectively, they do not really have much of a work group. And of course, without overall guidance in your organization, how likely is it that all work groups happen to be pulling in the same direction? The answer is it does not happen by chance. It only happens through conscious effort. Consider the amount of time that your organization has allowed these work groups to spend learning about informal practices, which may or may not be the correct thing to do. The real value, of course, comes from cross-work-group connections, making everything much more smooth. And this data chaff, the problems with the data, become a thousand, a million, a billion little hidden data factories, to quote Tom Redman on this, that prevent smooth interoperation and exchanges. We call it death by a thousand cuts. The problem is your organizations aren't really dying. They're just not as productive as they could be.
And a thousand cuts doesn't even begin to come close. Remember, I was showing you numbers a little while ago, that we're in the billions. And the problem is that organizations lack the knowledge and the skills around how to do this, and specifically around how to be data stewards. Now, again, the chaff I mentioned before, separating the wheat from the chaff: I hope you agree that better organized data is increased in value. Because if you don't, take a book that somebody wants you to get for them and cut the spine off of it and cut all the page numbers off of it and mix the pages up and hand it to them and see how happy they are. Of course, they won't be happy, because poor data management practices cost organizations a lot of time, money, and effort in order to do this. And the reason I'm saying this is because minimally 80% of the data in your organization is ROT. That's a three-letter acronym that stands for data that is redundant, data that is obsolete, or data that is trivial. My wife corrects me on this and says you should really include incomplete in there. And I said, bad enough I'd be calling your data RIOT. I'm not going to do that. I'm going to call it ROT in order to do this. And of course, the question is, which data do we need to eliminate? It's not easy. A lot of recycling around this, and this is really where the stewardship efforts come in. We want to decrease the amount of ROT. We want to reuse the remainder. We want to standardize the vocabulary. We want to do integration in a way that would make sense. But the data sets are not useful if we don't know what their characteristics are. So when we look at this, data is the most powerful, underutilized, and poorly managed organizational asset. It's the only asset that we have that is non-depletable. It can't be used up. It is non-degrading. It does not degrade over time. If you know that 42 is the answer to life, the universe, and everything, then that answer will not change over time.
As long as Adams doesn't write another book, which of course he can't do because he's dead. I don't know what you're talking about. All right, let's get back on track here. Data assets, when you compare them to other types of organizational assets, are actually quite superior in many measures. On the other hand, many people say, well, great, data is the new oil. But think about it for a minute. Nobody thinks about what they're going to do with the oil after they're done with it, if they put it in their lawn mower or their car. It is a production function and it is not designed to be reused. For that reason, if somebody does say to you, data is the new oil, say, your thinking is 10 years old. It was a good idea 10 years ago, but now we need to change our thinking slightly. The concept that a steward should be thinking of is specifically that data is the new soil. There's two aspects of soil that are critically important for this distinction. The first one is you don't just randomly run around the yard and spread seeds everywhere and hope good things happen. The second one is that you never plant things on Monday and expect to eat them on Friday. So there's time and preparation components in there that don't exist in these others. If you need to sell it as the new bacon, that's okay, depending on whatever it is you're trying to do. But as such, data needs to have a focus. If we don't focus the data, then the data is managed without guidance, so we don't want that to be the case. We want attention on par with similar organizational assets in order to maintain this. And finally, we do need to get something fixed to come back up to zero so that we can then start to build on top of it. Many organizations are not able to distinguish between these two types of activities. Let me show you how it works in practice here. When we look at organizational strategy, the data strategy is subordinate to the organizational strategy.
The only reason the data strategy exists is to help the organization do better with the strategy. And yet, when we look at this, data strategy influences what data assets are needed in order to support the strategy. So if we're measuring certain strategic things, these will be the data part of whatever occurs in there. And of course, if we go back the other direction on that, we see how well the data is supporting that particular strategy. Now in Peter's world, data projects are superior to IT projects. That's another whole topic that we'd need to talk about. So if you don't like that one, we can talk about it during the Q&A part in there. And the question becomes, how well does IT support that strategy? Let's put a couple of feedback loops in here and show you a picture that I would not recommend showing to your colleagues here, although it does show the overall operation. Let me make it a little bit simpler here and add one other component to this as well, obviously the stewards. But in addition to that, the data strategy should be specified in specific business goals. And the language that data governance should spend its time talking, the language that they should use, is in fact metadata. Because the metadata is how the stewards are going to go implement what is required by data governance. Let's look at that implementation in particular. When most organizations start this process, they've got a fairly simple feedback loop, and the idea that they need to have some data leadership is an important one that they arrive at. So they start out and read some things and find out that data governance is a thing they should do and that will improve data over time, no problem. That's terrific, although for some organizations that does represent a slow path. Again, imagine if you will, you're at the bottom of Niagara Falls and somebody said, it's okay, you can drink the water in a little bit because I cleaned up all the pollution upstream.
Well, your question is how long will it take, and the answer is some amount of time. So most organizations do not find it sufficient to just start off with improving data over time, but they want to do some things that specifically improve data as a result of the focus. We call these data improvement projects if you want to get real fancy about them, but there's nothing fancy about them. It's trying to get your data better. So when you have these structures in place, you now have a better set of constructs that you can use to build here. And we'll put our stewards, our community participants, our users up in here as well. And you'll notice the thing with the snails at the top and the bottom of it, I've labeled data things happen. We've been really good in the data community about describing the way the data things happen. That's wonderful, but you'll notice that the link between that and organizational things happening on the right is perhaps more tenuous. I've made the approximate signs in there in order to do this. We'd like, when data things happen, to have some sort of happiness occurring in the organization because organizational things have happened as well. And your data stewards are the best people to put on those types of projects because they should be able to speak both of those languages and say that when a data thing occurs, it has a good value for the organization. Now, the call I referred to a couple of times a few minutes ago was with a nonprofit organization. Okay, so it's not coming out in dollar signs. It's success along the mission. 
But as you start to do this and start to understand that making data things happen and tying those things to organizational things that happen, you now have the opportunity to put two X's together to come up with a bigger dollar sign or perhaps multiple types of combinations that we can use so that the organization itself will end up seeing the significant things happen because right now without those purple lines on it, most of what we talk about are not terribly top of mind for most of the people who are affected by that. A couple little words on a framework here. If you're not familiar with the concept of a framework, it's just this system of ideas for guiding analysis. We talk about organizing data around projects and making priorities and things, but it's really very practical. If I have a framework that says don't put the walls up until the foundation inspection has been passed, that's really good advice because if we need to tear the foundation apart and rebuild it, the walls will have been put up for nothing. That's a perfect example of a hidden data factory that I talked about as well. Another piece that the framework can help to tell us is you need to put the roof on as quickly as possible because once you cover the interior, we can now start doing electrical things and all sorts of other things. Making all this funding dependent on continued funding is absolutely critical. Let's look at a framework for stewardship that I found out there on the network. I'm giving you the link down there in the bottom right-hand corner. Stewardship, personal mastery, a personal vision and mentoring, valuing diversity, shared vision, risk-taking, experimental, yes, these things are all part of stewardship. We need to have these in a context for a data stewardship. 
I like to look specifically at organizational data challenges leading to some sort of strategic consideration because one of the things we all have to contend with is that there's not enough time and money to do everything that we'd like to do. Some of these will be put into a bucket where we say we're going to address them later and some of them will say we need to put them and get our stewards working on it right now. Those will largely start out for many organizations as regulation and policy type things that lead to specific stewardship activities that include both a reactive and a proactive component in there. And as I mentioned before, some of this will be monetarily and some of it will be non-monetarily but you can still derive value from it. And if you understand how that's working, you'll understand how the leverage can help you derive greater and greater value from all of that. Let's look at the specific components, the roles that the stewards play in there. Again, I always start off with IT systems being foundational just as I was talking about in the framework there a little bit ago. But let's take it a little bit further and do the standard sort of four-quadrant thing. Now, first of all, the line that you're seeing that is vertical in there, the domain expertise is less to the left of that line and greater to the right of that sign in there. So we'll look at that. It's another component that goes into that as well on that same dimension though. The roles on the left-hand side are more formally defined. The roles on the right-hand side are less formally defined. They've already figured out it's a four-quadrant diagram so we need to add the other quadrants down here. On the top of the horizontal line, we will now encounter governed data assets less directly and on the bottom half of that quadrant, you will have people that encounter data assets that are governed more directly. And similarly, the people at the top will have less time dedicated to these. 
The people at the bottom will have more time dedicated to these. So that's the framework for this. And we're going to have some components here, which is a leadership component. There will be some data decision makers. But our real focus here is stewards, the data trustees in this particular process. And they will also be working with participants, experts, SMEs. We call them a lot, subject matter experts. And then of course there is everybody else. Now, many organizations will then draw a line around the left-hand side here and say this is our data governance council. Others will do it different ways. Again, we don't have a one correct way to do this because we've only been doing it for 20 years. But given if that's what we're going to set up for this, then the leadership is responsible for getting some resources in here. They will also of course hear about data feedback as they go through. And they will be making some decisions. And those decisions are things that the stewards then have to make. Uh-oh, we need to change something that's never a whole lot of fun, which means there's some action. And we're now going to start to do some things as stewards that impact the participants and everybody else who's involved in this. There may be some changes that need to occur in that, and that can be done as well. They're going to get some feedback, and the feedback will go back to the stewards here as well, giving them ideas, guidance, et cetera, et cetera around all of this. Again, I would not share this chart to your constituencies. I would make it simpler and just show it to them like this. Again, there's a couple of questions that are popping up about how people are going to be able to get the slides. Of course you are, that's the whole purpose of this. And probably some of you will improve on what we do, which is how these things have gotten to be as good as they have. I mentioned also the firehouse metaphor. In this context, the data stewards are the firemen. 
And the firemen, yes, they sit around at the house and wait for things to happen, but no they don't. In fact, what they really do at the fire station all the time is that they're doing education around fire prevention. They're making sure batteries are in things that are going out and talking about this to schools and things like that. And they're coming up with some real good data duct tape to things. So we've got some great MacGyver stories that we are able to tell on all of these things. But it is absolutely critical to make sure that people understand that they do this part-time. They fix things part-time, but they also do a lot of other work in that process. Again, we're trying to formalize the tribal knowledge that has been based around hand-known processes that are just not widely known. Give it to Peter, right? If Peter's the person there that knows how to do that. We also need to understand how the stewards transform that governance into action. So a really good question to ask your group of stewards is to say, hey, if we made this change, how would you guys go about the process of understanding that? To apply this framework to your tasks and to get good at both the reactive and proactive activities. And to make sure that we incorporate leadership outside of the traditional channels, because we do have a good common education mission here. The smarter our teams are, the more literacy that is exhibited by our organization, the better off we'll be. Because you cannot accomplish everything. It's just impossible. So the first question is, should we manage this data? No. The right question is, should we include this data item within the scope of our current management practices? And the reason that's so critical is because you will have. Gosh, I see. I've got mang instead of manage there off the change set before I send it out to you guys. On that, there we go. Bad there. All right. Anyway, let's, sorry, I'm going to go back on that to do that very well. 
Get down to the SDLC. What is it that we're looking for all of these stewards to be helping out on? And there's a couple of different things. Governance has to provide enough cover for these individuals. But at the same time, the individuals have to be clever and creative themselves. So I like to talk about making a better data governance sandwich. Again, a couple of you have mentioned literacy already in the chat thing, which is great. Yes. We have some very definite problems and challenges around data literacy because we haven't been teaching it to everybody. And I've got some very good data from something called the data literacy project that shows that only about one fifth of business people feel like they've got the ability to well manage data. In addition, we also have to talk about the data supply because there is a lot in the data supply and we need to standardize the data. So these are the three things, if you will. We're going to try to put together to make a sandwich. And as you can see, as your stewards are able to go through and do work, they should be able to transform this into an engineering type of problem because only with engineering can we actually put it together into the proper data sandwich that we'd like to have. And this cannot happen without engineering and architecture components on this. In fact, I went all the way to India on a trip a couple of years ago and just a picture, of course, that I took. And I found this wonderful little saying by Deming on the cash register at this T-form in India. Quality engineering and architecture work products do not happen accidentally. Very few things are truer than that, but that is absolutely true. And of course, we're talking specifically about data thingies as we go through all the rest of this. Now, understanding that we've talked about different types of queries that can work a billion times or even a million times in your organization. 
If you can find ways of having the stewards go in and start to work on these activities, then they will be able to make a better-working set of engineering and architecture constructs for you to work within. It's a very, very critical piece. And again, we find out that most people just don't really get it. It turns out that all of your organizations are all about data. There's a bunch of data that's there. Somebody at the top makes a plan. The plan gets transmitted to middle management and then down to the worker level. And for some of your organizations, you actually make things, although now the vast majority of organizations do not make real things, but they make data products. That's the information. The data factories are problematic as well. And it's important to take a step back and to say, what is the business that we are in? Because if we don't have a real good idea of how that works, it's going to be problematic in terms of what we're asking our stewards to do. See, data is not a project. It cannot work as a project. In fact, it works as well as it has now as a project. And this is one of the more important breakthroughs we've made here as well in data professions to do this. So your durable assets are assets that have a useful life of more than one year. See, there's been a couple of comments out there in the chat function about Agile here. So let's talk a little bit about Agile. Agile is the best way of developing higher quality software faster that we have been able to come up with. Remember, we've only been doing coding for 100 years in here. So we can't in any way claim that we're at the best, but we can claim that Agile is the best way we have come up with to create higher quality software faster. And it's reasonable to look at deliverables in that timeframe of two-week Agile sprints, perhaps, or something like that. But data evolution is measured in years. 
And the reason that's so important for us is because I have gone to these organizations. There's a certain company in the Midwest where I have flown in there three different times in my career and done the exact same project for them three different times. I guess they like what I do. I keep telling them they should only have to have done that project once, but for them, that's the way they work. Excuse me, that they have the same data that they've been dealing with for the entire 30-year period. Data evolves. It's rarely created. It does show up. We can acquire new sources of it, but as a component from an architectural perspective, it is significantly more stable. And ready-made architectural components are a prerequisite to Agile development, which is a great role for a steward. So if a steward is looking at a development project, and that's not what most stewards do, but I contend that they should be, the stewards may be able to come into that process and say, hey, you're working at Agile. That's great. We understand how all that works. And here's some things we can do. Show me what is the status of your specific data requirements? Are they well-formed? Are they stable? Have we reached a point where we are certain that they are done? Because if we don't have the data components well laid out, well architected, then there's no point in spending any money on Agile. It's just pouring money down the drain. The only alternative when you're in the middle of an Agile sprint, if you don't stop working on that Agile sprint and go off to another one, and by the way, there are plenty of other ones. So this is not anything that will hurt progress in Agile. But if you have data requirements that are unmet in an Agile sprint, the only alternative is to create additional data silos. Now the good news is everybody on this call we're giving you guaranteed employment for life. But that's probably not exactly what you're looking for. 
And again, this is cleaning up messes rather than actually using data in a very proactive sense. So let me tell you a little bit about how the strategy component of this works. I like to look at George Box's famous saying, all models are wrong. George Box is a famous statistician. All models are wrong. Doesn't sound like something a statistician would say unless you read the rest of his statement here, which is that all models are wrong, but some are useful. And by the way, we're finding this out with the COVID models. We've got a section of the book where we talk specifically about the COVID models that are problematic around this. So given that, these constructs that I'm going to give you here are probably not correct, but hopefully they are enough to get a dialogue started on this. Many of you may remember reading The Goal at some point in your careers. It's an interesting book. I was actually told to read it by my spouse, who said, I won't talk to you about anything in business because we don't have the same vocabulary. And until we can come up with that vocabulary, we're not going to have any business conversations. You can imagine the awkward silences around the dinner table on this. And I, of course, correctly grabbed the book and read it. Alex Rogo works for UniCo. He's trying as hard as he can to do things, but he can't make them work because everybody's managing to the wrong objectives. And the theory of constraints says there's always one constraint that is forcing the rest of the organization to slow down. I'm showing that in the bottom left-hand corner there. The individual who's marked as number 10 there is clearly blocking work that needs to get to number 14 and number 18, but they don't know what's wrong and they don't have the ability to help out number 10 up there as well. So the theory of constraints says find that one thing that is blocking you, the least strong link in the chain, the weak link in the chain, and fix it. 
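For readers following along in text, the find-the-weak-link idea just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the step names borrow the numbers 10, 14 and 18 from the slide being described, while the capacities, the `Step` class, and the `find_constraint`, `throughput` and `improve` helpers are all hypothetical and not taken from the talk or from The Goal.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    capacity: float  # hypothetical: units of work the step can finish per day


def find_constraint(steps):
    """Return the step with the lowest capacity, i.e. the weak link."""
    return min(steps, key=lambda s: s.capacity)


def throughput(steps):
    """A serial chain can only move as fast as its slowest step."""
    return find_constraint(steps).capacity


def improve(steps, rounds, boost):
    """Find the current constraint, raise its capacity by `boost`, and repeat.

    Returns a list of (constraint name, throughput before the fix) pairs.
    """
    history = []
    for _ in range(rounds):
        weak = find_constraint(steps)
        history.append((weak.name, throughput(steps)))
        weak.capacity += boost  # "fix it", then look for the next weak link
    return history


if __name__ == "__main__":
    # Made-up capacities; only the step numbers come from the slide.
    chain = [Step("number 10", 4.0), Step("number 14", 9.0), Step("number 18", 7.0)]
    for name, rate in improve(chain, rounds=3, boost=4.0):
        print(f"constraint: {name}, throughput before fix: {rate}")
```

With these made-up numbers the constraint moves from number 10 to number 18 and back to number 10, which is the point of the approach: once one weak link is fixed, a different step becomes the constraint.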
And then go back and do it again and again and again. And it actually works quite well. Let me take you to another concept around all of this. When we see organizations doing analytics, right, they've got some data stuff that they do. So on the left-hand side, we'll call it a black box at this point. But the analytics practices then allow us to create some marts and warehouse data and visualizations and screens and dashboards and all sorts of other things that come up on this, and this is great. This is exactly what we want people to do. This is how we want stewards to help use this in an organizational context. The problem is all of the data on the right-hand side of this diagram is duplicated. Now we've got more data than we need to have. And more importantly from an organizational perspective, what I see happen is that whenever there's any feedback, they want to fix it in the data warehousing environment, when it's likely that those fixes could be done over on the left-hand side of your screen in the data management black box and prevent significant amounts of rework and other things that occur. So while you're building all of this out in here, it's very, very useful to understand that most people don't have any idea what's happening on the left-hand side of this diagram. And that's really the area that we should be putting our efforts into. Let me give it to you from another perspective as well. Organizations, again, oftentimes start out without a formal steward component in there, but then they take their first stewards and they say, hey, I've been listening. I understand a little bit about strategy and in strategy there are only two plays. One, improve what you have. Two, make something new. That's pretty straightforward, but for some reason it took a long time for us to figure that out. So we don't want to not have stewards. We want stewards. Stewards are important to do this. 
If we look at this vector number two that I'm showing you on the screen here, and let's just say that Walmart is a company that is known for its effectiveness and its efficiencies. That's a terrific niche for them to be in. And if they're going to use their data stewards to help, and by the way, you should reward your stewards for helping to improve the efficiency and effectiveness components in there. That's a very good play. If we go to the other quadrant here, number three, vector number three, and pretend that Apple is a company that actually invents lots of new stuff and creates these new opportunities here, I just want you to see the fallacy of trying to do both at once. Because we've asked our stewards to do this in many organizations for a number of years, and it's just, again, wrong. It's too much to have people doing this, especially when only one in 10 organizations even has a strategy to come off and make this happen. So I want you to imagine here, if you will, Jony Ive, who is the erudite British fellow that used to do the introductions to all the new iPhone or iMac or whatever it is that they're producing, and he talks lovingly about it, and it's got this beautiful piece of machinery floating around on the videos that you see on the Apple intros that are out there. And I want you to imagine Tim Cook telling Jony Ive, you need to be cheap. We're not going to do a V3 anymore. We're going to go down to V2. They're not going to have the DNA to do that. It's not that kind of a company. It's against their culture. And I want you to equally imagine you're one of the Walmart experts who are really good at squeezing the last penny out of everything that goes on inside of Walmart, and telling them to be innovative. That's not what they're about. 
So even this, the data stewardship activity should be sequenced, and what we'd really like to see is something with efficiency and effectiveness, because you can use that money that you save, very significant, tangible dollars that you will be able to attribute to your data stewardship program, and that can be used to put in towards the innovations. So again, a different use of the stewards here. Let's talk one more little time about Agile here just to make sure we drive the point home on all of this because most people say, well, Agile will allow me to do both of those things at once. It doesn't work that way, and that's not the way it's intended to work. So the joke here is this is Agile surgery because the patient says, wait, you're going to perform surgery without putting me under? Yes, this is Agile surgery. We need to ask you about your symptoms and complaints after we open you up. Ooh, doesn't sound like much fun. And we'll also need to know what you want us to work on in this first go round. Let's take another component for this. Again, many organizations feel that a data strategy should sort of fit into this hierarchy here of an organizational strategy and IT strategy and a data strategy. Again, Morgan Freeman, thank you very much. This is wrong is what he says, right? He does that so much better than I do. So if that's not right, how should we do it? Well, we should really look specifically around this that says that data strategy is equal to and in some ways superior to the IT strategy, but it's definitely not subordinate to it because if we tell what our data strategy is going to be coming up, that will influence the types of software and information technology that we pull into play. And yes, of course, there's a backlink there too. We don't want to miss any part to that that would be super in order to do it. We've been working on this for a while and in the data world, I actually ripped off the Agile manifesto. 
It says we're uncovering better ways of developing IT systems by doing it and by helping others do it. Through this work we have come to value, and then I change it to my stuff: data programs preceding software development, and stable data structures, shared data structures and data reuse preceding all of those codes. And it's not to say that we don't think that the things on the right are valuable, but the things on the left we value more in order to do this. It's very, very critical to make sure that we can have it that way. Unfortunately, what happens with most organizations is that they perceive data as being this little tiny thing here in between IT and business. I call that the bat sign, if you will. And they figure if they could just get a handle on that stuff in the middle, and of course, when you start messing with data, that's when you discover it's really not as small and confined as you'd like. In fact, really, if we're going to put it in proper perspective, that's the way of the world with data. Take a look at it. There's a whole lot of wrap-up things here at the top, and I can see there's a lot of good chatter in the chat. Let's get to those questions and answers here. So again, from a stewardship perspective, why? We need to have an understanding that stewardship is absolutely critical because there's a lot of confusion around it and there has not been a good educational focus. The lack of strategy is one of the things that will cripple your stewardship programs faster than anything else. On the how part of it, it's understanding that there's a part of normal governance that should occur with everybody and that the fire station model allows us to be partly reactive and partly proactive as we're moving forward. And then we have to understand that if we are building new things or fixing things that already exist, those are parts of the SDLC and this is important for everybody to understand on all this. 
So the need for data stewardship is increasing because of the increased data volume and the lack of practice improvement around that. It's a very new discipline so we're going to conform to constraints. There is no one best way and this is why you all are the experts at what's going to work in your organization. Data strategy, excuse me, data stewardship, must be driven by a strategy that complements the organizational strategy, and comparing data frameworks can be useful in this. They direct the specific expertise that we have and that we need to have in there. The language of data stewardship is metadata, and process improvement can help to improve these data strategy practices. Now, I finish this out with just a little bit. This is a fun thing. I'm going to turn the volume up here. I don't even remember who did this, but this is set to Hotel California, so you can hear Hotel California in the background. If you want to get a copy of this, there's a Google, I think, governance data, and I'll figure out what it is. Anyway, there are a bunch of things here that we do want you to think about. Not getting commitment from the business and IT that this is a program. Starting off too fast and trying to solve all your data problems by Friday; it's not going to happen. Deciding that not too much or not too little is not right, so therefore the answer must be in the middle. No. Committee overload, failure to implement, not dealing with change management, thinking that anything alone will answer the question, not building sustainable organizational practices, and ignoring the IT systems that are strategy systems out there are big problematic pieces. 
Now, I don't normally recommend books here, but I do think that one good one to take a look at is this from John Ladley, who's done a great job of talking about data governance programs in general. And I'm going to finish off here with something, even though I'm going to put the upcoming events here, to design some sort of controversy. I do not believe that there should be any data owners in the organization except for the organization itself, and so the stewards are primarily responsible for ensuring that the data is preserved so that it can help the organization in its best fashion, because if we have data owners, they tend to fight, and that doesn't work out so well. We can do a lot more of that as we get into the Q&A session. There is a special sale going on right now at the bookstore out there at plus anythingawesome.com. You get 20% off some of these books that are kind of relevant on there. But now we're approaching the top of the hour and our favorite part, where Shannon and I get to listen to your questions and you guys tell us what we really should be doing in there. So Shannon, we'll turn it back over to you. 
Peter, thank you so much for this great presentation, as always. If you have questions for Peter, feel free to submit them in the bottom right-hand corner in the Q&A portion of your screen. And just to answer the most commonly asked questions, just a reminder, I will be sending a follow-up email to all registrants by end of day Thursday for this webinar with links to the slides and links to the recording, as well as anything else requested throughout. So Peter, how imperative is it to pin performance metrics to the stewardship roles? Is this task 25% of their job? They are probably doing something else as well. 
Yeah, so let's talk about the something else first of all, and I'll come back to the question. I'm offered a lot, when I consult with organizations, where they'll say you can have 10% of 10 people, and I so much prefer to have 100% of one person, because as you can see, this discipline is all over the place, and asking people to do things at 10% time involves huge, huge switching costs. So we don't want to get into the fact that people have to, you know, pull their heads up out of whatever they're doing and do something stewardship-wise and go back to it. I will always take full time over part time in almost every case. The specific question was on giving them specific performance measures. I'm looking for my quadrant diagram here so I can pull it back up. Hang on, I'll just do it. You guys can see me do this, right? Click there, click there, out of that, there we go. Pull these things up and find that slide that I'm looking for. Let's talk about why those people should be accountable, because that's, I think, really what the questioner was getting at. So hang on, here are my players. There we go. Simple diagram, everybody can see it. Who in this chart, if not the stewards, are going to be accountable? Are we going to make the subject matter experts accountable? I don't think so. They really are known as subject matter experts, and we want them absolutely to continue to do the things that they're supposed to do. 
Same thing for the other data makers and consumers. They're not going to get the benefit of stewardship training, so it's going to be a challenge around all of those. Are leaders going to be accountable? Absolutely they should, but you know what, if you give it to the stewards, they'll do it. Now let me add one other piece to this too. We tend to teach the process of testing software incorrectly, and stewards are kind of like that kind of a role. If you could forget to test software, somebody says, hey, play with that software for a while and see if it works out. Works fine, right? Well, there's no information exchanged in that exchange. Fine, fine. You know, it's again like when people ask, how was your weekend? Just fine. If you sit down and give them a detailed dissertation of how you spent Labor Day weekend, they probably don't want to hear that. So these stewards have to be the people who are responsible for it, and if we give them the responsibility and make them accountable for it, then they will help us in the process of putting this map together in a way that works well for everybody. Because as I said several times here, we're making this stuff up, and that process of figuring it out is why we gather on these webinars and why we go to places like DataVersity, so that we understand the entire process and can learn from what everybody else is working on in this. So again, a great question. I hope I answered that for you. If not the stewards, who is going to be responsible? And so that's what I like to do, is to make the stewards responsible for it. Great question, thank you for that. 
So Peter, I agree that ownership is a problem, but how do you propose effectively governing without owners? 
Ah, well, the role of a steward is in fact to protect something on behalf of the owner. Let's go back to our definition of stewards up here at the top. Just because a steward doesn't own something, or a steward doesn't work for a data owner, doesn't mean 
they can't keep the best in mind in terms of what everybody's trying to do. So again, here's our definition of stewards: one who actively directs, in this case, the use of organizational data assets in support of specific mission objectives. If we have the stewards who are doing this, and again, let me tell just a brief story here to illustrate this point. It's not part of the presentation, but I've told it enough times, hopefully it won't put anybody else to sleep. One of the projects that I worked on for the military was something called the DOD Suicide Mitigation Project, and that project involved specifically trying to make sure we figured out what was going on with our service members, who were dying more from their own hands than the hands of bad guys. That's a noble cause. It's something that we should do. It was done back during the Obama administration, and I had a challenge when I was working on this, because I had a lot of data stewards who would come in to me and they would give me a dissertation that would say, you can use my data under this particular set of circumstances for these things. And somebody else would come in and say the same. It's coming from recruiting, it's coming from their operational pieces, it's coming from their exit interviews. There's all sorts of different types of data, and each of them has their own rules, and each steward was correctly working within that particular process. I had a favor that I could ask the Secretary of the Army for, and the Secretary of the Army came into one of the meetings and said, ladies and gentlemen, I have an announcement to make, and put his fist down pretty hard on the table. We are going to call this my data from this point onwards, and anybody that wants to tell me why they can't use my data to save my soldiers' lives, my office door is open. This was the key to getting productivity. I have been in organizations where I have had somebody who owns some data, we'll call it the finance data, and people looked at the finance data and said it's not 
optimized for the way I like to have it, and the person who was the, quote, owner of that data said, "If you don't like my data, then you will not have access to it." And we literally had to wait three years for that individual to retire. Data ownership is the most harmful concept within data stewardship. I'm very, very strong about this: nobody in the organization should be allowed to own any data. They should, however, be charged with stewarding it and making sure that it is stewarded properly. I may get some pushback on that, but that is absolutely my position, and I can tell you I've run into so many problems on the other side of that that are just eliminated entirely. Oh, I'm sorry, I almost forgot to tell you the punch line of the story with the suicide project. The Secretary of the Army told me I could write that up, so the middle book there, the monetizing book, has that story in it. But most importantly, having that story in there should, in his mind, be the inspiration for other corporate leaders. And yet I have told that story to well over a hundred corporate leaders, and not a single one of them will take the courageous step that the Secretary of the Army took. Maybe some of you have, and if so, these are things we'd like to hear about. But we still see way, way too much activity around deciding ownership of data, which is really a shared resource. I'm just thinking about this from an accounting perspective: does accounting own any data? No, it is all sent in to them from somewhere else. Data ownership is the most harmful concept around this that I have had to deal with. I see some other people are chiming in on the call as well. Thanks for that. Indeed. So Peter, you mentioned that a company should not have data owners. If you don't, would it cause a lack of accountability?
So the data belongs to the organization. If you're working for a non-profit organization, the data still belongs to the organization. If you're working for the Army, the data belongs to the Army. In fact, I know one of the data stewards for the Army in the past couple of years eliminated the data sharing agreements, saying that there was no room for them under statute or policy, and so we're all Army, we all ought to be able to interchange data together. We're working with a number of states where we're trying to eliminate the same kinds of things. The Commonwealth of Virginia has put in place a construct called the data trust, so once you donate data to the data trust, it's done with a certain set of rules and guidelines. Absolutely, somebody who owns something then gets to think they make the decisions, and we don't want somebody who is downstream making decisions about data that is upstream that would be harmful to the organization. Similarly, we don't want somebody upstream making bad decisions about data downstream. I'll tell one more quick story in there before we jump back into your questions: the hospital in the Midwest that was going to go off and do a bunch of knee surgery. The reason they were going to do knee surgery was that the director of the hospital had looked at the admissions data and seen that knee surgery was the highest category for which people were admitted at the hospital. Unfortunately, the hospital director did not know that the customer service representatives who were checking people in had been told to optimize for speed rather than quality. So it turned out that knee surgery was the default hospital admission code. Here's a really good example of a person who's managing well with data, but the data is garbage, and so consequently we get the garbage in, garbage out phenomenon that we talked about earlier. And again, great question, but let's definitely try to keep it that the ownership is with the
organization, not with any individual in the organization, because individuals come and go. And Peter, is there a data stewardship model that you think works best? When somebody says, "Is there a model that works best?" I do come back to my charts on this and say that I don't know that anybody can specify, without knowing your organization, what is best for your organization, and I wouldn't want to do that. You are the best person to describe that. But I can tell you that it will be a mixture of some things that go into it. I'm going to go ahead and build this chart up here; it just takes a minute. The question comes up as people are trying to do this: how much time do we want to spend letting things get better because we've now put policies and procedures in place? For some organizations that may be perfectly good; next year is good enough. But some part of your organization is going to be impatient and is going to want things that move faster. So you get this type of a construct here: there's going to be some division between policy-based improvements that you're going to make to the data and active things that you do in order to have that. And as I said before, the more you can make people understand that, when good things happen in data, they now know what the impact is in the organization, that's where you really have made a breakthrough. So these are the players, but absolutely you're the one best qualified to decide for your organization. And by the way, it should change: you have a mess that you need to clean up, so you need to do things with a very great focus for a little while, and then you can revert back to the snail's pace on the other side. And again, that snail is the wrong way to think of it; I'm just using this for extremes here. Hopefully that helps. Good question. Thanks. Do you define different levels of stewardship, for example executive data stewards, data domain stewards? And to be perfectly correct on that, I didn't define them, but my colleague
Dave Plotkin did, with very good definitions. Is there a question there, Shannon? That was the question: so do you define different levels of data stewards? Good question. All right, so again, I absolutely recommend David's book, and it's wonderful, but if you're just getting started in this business, or restarting, this is not the place to start from. This is too much detail; it's too much to think about for a subject that we don't know much about. So rather than starting with this and trying to specify the difference between a project data steward and a domain data steward up front, a priori, let's instead just put some people in who have stewardship roles, and as those roles mature, perhaps two or three years down the line, then we can come back and look at differentiating between, in this case, one type of data steward and another type of data steward. But trying to start out that way in advance, just in my experience, leads to a bunch of arguments and is not helpful. So, a dirty little secret, guys, but you'll get this sooner or later because it's going to come out in the next book or two: I don't even really want executive management to hear about anything other than a data program. I mean, imagine you go up to the boss and you're asking for some resources to do some of the things that we're talking about here, and the boss says, "Well, I gave you money for data governance. Doesn't that cover your data stewardship piece?" And you know, the answer may be yes, may be no, depending on how much you've been able to acquire. It is absolutely critical that we just get them to understand: data program, period. If they can get that, after three years you can start to talk about, "Okay, well, there are some components to this program," and then break this down into more detail, et cetera, et cetera. But at first, let's just keep it simple and say, let's just get some people whose job it is to manage the data with guidance. If we can get them to
manage the data with guidance for a couple of years, then we can get fancier and say, okay, now do we need a metadata steward, or do we need a legacy data steward? By the way, I don't even like the word legacy in there, because I define anything that's in production as legacy, and so it makes it a little bit difficult to look in there and say, oh, what language was it written in, what year was it done, and is the person who wrote it still alive, which, you know, we're encountering more and more these days. Anyway, great question. I hope that answers it. Peter, this is more of a statement than a question, but it says: you're definitely right about stewards rather than owners, but you have not emphasized strongly enough that the business needs to own clear, explicit requirements for data. Thank you for suggesting that, whoever did. It's a great point. Yes, it is not our job as stewards to tell the business what they are supposed to do. It is our job to help them do it better. And so our job is to understand what their needs are. What can a stewardship operation offer for these organizations? The answer should be: they have a data problem of some form, and we need to correct that data problem in a way that sticks, and the stewards are likely to be part of the solutions team that is being put together to address that. So thank you for the re-emphasis. And what separates a data steward from a business analyst? That's a really good question. Probably HR, which is a very smart-alecky answer, and I apologize for it. One of the things we do, as people who've seen this a lot in the space, when we go into an organization: we'll actually go visit HR and say, what terms do you have? What do you use to define your data jobs? And if they have just one, we understand that they're relatively at the beginning of their organizational data journey. So if we look then at what the organization is doing from a jobs perspective, they're not going to add more job categories until they've figured out that
these things are useful and that they can use them in their organization. So that's really the key there, to try to figure that out. If they don't have these data positions, then I'll say, well, just give me a couple of business analysts, because I can train the business analysts in the data very, very well. So the two sets of skills overlap and are very complementary. I would say that a data steward should have data analyst capabilities but also should have data knowledge that is not likely to be part of the data analyst just right out of college or university, but an experienced one would certainly have picked up an awful lot of this terminology. Again, great question. Thank you for that. I'm searching through, looking for some more questions here. If you have questions, again, feel free to submit them in the Q&A portion of your screen. Or even a good data joke, we'll take that. Right, I would love a good data joke. I'm always up for a joke, a good laugh. So, just trying to navigate through the chat here, I'm looking for a couple of things. Oh, so: data steward versus analyst, which we talked about already, versus SME, subject matter expert, versus application dev versus product owner versus product manager. It's a lot of versuses. How do you distinguish, and are they one and the same sometimes? Different organizations are going to be made up of different groups and will have their own individual histories that they pull together to do this. So the real key is to get these concepts. While I'm not at all rigid about the terminology on these, there are clearly going to be people in a decision-making capacity, the upper left-hand corner. There are clearly going to be other people who are not really involved in this other than that it impacts them upstream or downstream; that's the upper right-hand quadrant. You're going to have some subject matter experts who are going to be more expert in things but not necessarily in understanding the role
of stewards there, and then of course we have the bottom left-hand corner, the stewards. So yes, absolutely, a data analyst, that would be great. If you've got too many cooks in there spoiling the broth, that can be a real problem, and very often stewards will act in other capacities, including a leadership capacity. I've got a couple of organizations that have decided that they don't really want stewards, but that they will have people put on the steward hat to make certain decisions. They're just not a very big organization; they looked at this quadrant and said, "Now that's half our IT staff right there. We can't even break it up that small." Again, it's going to be different for every organization, but the key is: what are you attempting to accomplish? What a steward does is help implement the decisions that the data leaders make about what we should be doing in our data program, so that we can make up for the past neglect and then put this stuff productively to work. Because at least for the moment, data lake projects have an 85% failure rate, I think was the Gartner quote that I was given the other day, and other types of big data projects are achieving the same success as most IT projects: about 30% of them succeed on time, with the full functionality that was originally specified, for the price that was originally agreed on. It's a very, very challenging environment to work in out there. I just lost my question. So what's the role of the user regarding data literacy and enabling the data-driven mindset? Fantastic question, and the subject of my next book. I'm calling the next book Danger, Do Not Feed the Data Matrix, because if you are a passive participant in that process, you will be feeding the data matrix, and if you are at least somewhat aware, you will try to avoid doing that. Given everything that's going on in the world, we've got an awful lot of things that need to happen, and yes, we need to do more
with our data, but we've got to start somewhere, and starting is to say trustees, data trustees, and these data trustees are going to be able to help us as an organization focus in on the data aspects of the decision-making processes that we have to have. Remember, my definition of data governance is managing data decisions with guidance, and those decisions have to be made by somebody who's knowledgeable. So we look at what really is going on in the stewardship role. The role of stewards specifically, let me get this slide up here. There we go: a person who looks after arrangements, blah, blah, blah, and stewarding is an officially recognized process of managing or looking over. So a data steward is going to have a very specific role there, but the role of trust and the concept of fiduciary responsibility are going to come into this as well, which wouldn't necessarily be present in the standard training for a business systems analyst or something. But the stewards are entrusted with the data in a fiduciary relationship. If my lawyer tells somebody something that they weren't supposed to tell them, I have an action against my lawyer. That would be something that was a civil wrong, if you will, something that individual could be penalized for and could potentially lose their ability to practice law over. We're not that far along in the data stewardship area, but we do have this concept of trust and fiduciary responsibility: somebody is being trusted to make decisions on behalf of someone else, in this case in the relationship between the trustee and the beneficiary. Of course, the beneficiary is the person whose data we're storing. So it's a relatively new type of profession when you compare it to something like accounting, which has been around for literally thousands and thousands of years. And is there a connection between data stewards and data makers? Data stewards and data makers. Well, there are lots of people who contribute data, including, you know, teenagers who are just
running around. Again, the way I would do that is go back to this other diagram we've gone to a couple of different times here so far. When you say data makers, I would look at other people who use data as being the upper right-hand quadrant of this: data makers and data consumers. They're definitely not stewards, because they haven't been told that they have been entrusted with this fiduciary relationship. That's not to say that they couldn't be, and there's nothing wrong with having lots of stewards, but you don't want them tripping over each other, and you certainly don't want them trying to figure out what each one does. So keep it simple at first: a data program, and a data program will have some people in it whose job it is to manage parts of the data from an organization-first perspective, as opposed to a department-first or an application-first perspective. And there's a question that came in here in the chat, Peter, but I know we get this a lot, so I want to make sure to address it. So without owners, how does budget get assigned to support stewardship? Do organizations have management dollars at a very high level in the org? A very perceptive question, right? If nobody is the owner, then who owns the budget for it? And this is really the key. Data, of course, is a shared resource all the way around, so this lack of budget is what forces the execs, the data decision makers, to get together and say, "Look, I know we could improve this data in this way, but if I do that, I'll be taking money from the training budget and our salespeople won't be as well trained," if that's the either/or decision that they're having to make. So with no data owner, they're going to have to come to agreement, and therefore the conversation has to occur, and that's what you want to have happen: these types of conversations occurring on all of these activities. It's a tricky thing, but think about it for the moment: has it worked well having data owners so far? And the answer is
generally no. If you want to hear more of those horror stories, catch me offline and I'll be glad to tell you some of them, because there have been some stories. There have been some stories, no doubt. Peter, thank you so much for this great presentation. That is all the questions that we have, and we are close to the end of the half hour here. Just a reminder, as always, I will send the follow-up email to all registrants by end of day Thursday with links to the slides and links to the recording of this presentation. Thanks to everybody for being so engaged in everything we do. Just love that. Some great questions came in today, and I hope everyone has a great day, and stay safe out there. Thanks, Peter. Thanks, all. Absolutely. Thank you, Shannon. Thank you, everybody, for listening, and we'll talk to you next month when we talk about metadata. Cheers.