Hello and welcome. My name is Shannon Kemp and I'm the executive editor of DATAVERSITY. We'd like to thank you for joining today's DATAVERSITY webinar, Best Practices in Data Stewardship. It is the latest installment in a monthly series called Data-Ed Online with Dr. Peter Aiken, brought to you in partnership with Data Blueprint. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the upper right-hand corner for that feature. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DataEd. To answer the most commonly asked questions, as always, we will send a follow-up email to all registrants within two business days containing links to the slides. And yes, we are recording and will likewise send a link to the recording of this session as well as any additional information requested throughout the webinar. Now let me introduce to you our speakers for today. Peter Aiken is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide, especially DATAVERSITY conferences. He has more than 30 years of experience and has received many awards for his outstanding contributions to the profession. Peter is also the founding director of Data Blueprint. He has written dozens of articles and books. The most recent is Monetizing Data Management, although that's about to change. Peter has experience with more than 500 data management practices in 20 countries and is consistently named as a top data management expert. Some of the most important and largest organizations in the world have sought out his and Data Blueprint's expertise.
Peter has spent multi-year immersions with groups as diverse as the U.S. Department of Defense, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia, and Walmart. He often appears at conferences and is constantly traveling. Peter, where are you today? We're actually at Data Blueprint World Headquarters in Richmond, Virginia. A beautiful day here on the East Coast. Hopefully the voting will be seamless and effortless for everybody. And hopefully you're not listening to this at the expense of voting for something important like the President of the United States. But aside from that, yes, we're here. And I'm here with my colleague Mike Ogilvy. Mike has got 15 years in the business here. Very deep architectural experience in warehousing and integration and data quality. And we were just talking before we got started about how much of that actually has to be education of the customer base. And that's one of the things we all find out in this area: if you can explain a problem, you're probably well on the way to solving the problem. So Mike, great to have you here. Mike has a BS in physics and a focus here specifically on data governance, stewardship, quality, and requirements consulting. And we're going to talk for the next hour or so about best practices in data stewardship. And the real key to this is that... Did you come up with that line? With great data comes great responsibility. Maybe that was Dylan that came up with it. That wasn't me. No, that was someone else. We know it's a group effort, so we never find out where these things come from. But we like it because it is a set of responsibilities here that we're going to look at. So we're going to look specifically at, as we always do, a data management overview. We'll talk about the business needs for stewardship. We'll talk about some principles that are involved in it.
And then we'll look at how to pull those together to give everybody that's participating some opportunities to foster a data-driven culture. And Shannon was hinting at the data strategy book that's partly coming out of this, and we're partly pulling some of the next seminar out of this one, too. Just to give you a preview, the next one is called Exorcising the Seven Deadly Data Sins, and we probably should have done that around Halloween, but hey, what the heck? Exactly. Writing schedules never match up with everything the way you'd like. So I'll take the first chunk of this and just give a quick run-through and background for everybody on what data management is. Most people define data management as what happens between the two important points in a piece of data's life, that is, when it is first acquired and when it is used. We come up with some definitions like understanding the current and future data needs, blah, blah, blah. What it does come down to, though, is that in order to have data that is sourced and used, we have to do a couple of things. We have to engineer the data. We have to store the data. We have to deliver the data. And what that means is we need a process that we call governance, and we'll get to a little bit of rationale on why governance is important in just a bit. But this also means that you need to have specialized team skills in effect, because if people don't know what they're doing, you don't tend to get very good results on all of this. There's a problem, however. We actually put this out there as a challenge to the community each and every time we do this, because this doesn't represent well the real value of data: that it needs not just to be used but to be reused. And so we've been moving towards more of a definition that looks a little bit like this, which is to say that the reuse part is the part that we're going to focus on.
The idea is that data is not a production function, and we're going to write an article in the near future called Data Is Not the New Oil on this. And to say the specialized team skills should really be focusing on delivering our sole non-depletable, non-degradable, durable strategic asset to a variety of business needs and then examining how those business needs are met by the data so that we can improve the overall business practice. It is definitely not strictly a storage skill. The real challenge, though, and again, Andrea and I were just talking about this right before, is that most people don't really get how this works. And I put that blame directly on the college and university system, which tends to focus in on the fun stuff. Now, the fun stuff we'll define as roughly the stuff that's in the golden triangle here. And of course, you've all heard these: master data management, data mining, big data analytics. These are all great buzzwords that turn out to be largely technology-focused. And I say that because the top of this pyramid here is very much like the tip of the iceberg or the top of the Maslow hierarchy of needs that you can see in the upper right-hand corner there. If you remember Maslow, Maslow says that if your food, clothing, and shelter needs are unmet, you cannot move to the next level. Again, you can see it's safety and then belonging, esteem, and then self-actualization across the five levels of that. And if your food, clothing, and shelter needs are unmet, you are unlikely to ever achieve self-esteem. In fact, according to Maslow, you simply cannot do it. And we believe the same thing here. If you want to try these things in the golden triangle, you'll do a much better job if you first learn the basics.
And again, that means you're going to have to understand governance, quality, data management strategy, data operations, and platform and architecture at a good level before you can try to do the things that are really fun. Notice again that the top is a focus on technology and the bottom half is really focused on capabilities. One other thing that's important for this diagram is that any foundation is only as strong as its weakest link. So if I had made a foundation out of marshmallow, I probably wouldn't want to make it 10 stories high. It's just not going to work. And of course, I'm using an extreme example there just to drive the point home. But knowing that the foundation is only as strong as the weakest link is actually kind of key, because in the instance that I'm showing here, the weak link is clearly between data management strategy and data platform and architecture. And data platform and architecture needs to be made a stronger part of that foundation, because you could put more effort into governance, quality, strategy and operations and it won't help the overall strength of that foundation, because the weak link is over on the data platform and architecture. At Data Blueprint we get a lot of questions, and people say, yeah, I hear you say that, Peter, but I still need it done by Friday. Can you have it by Friday? And the answer is yes, absolutely. But if we do it, it will take longer, it will cost more, it will deliver less, and it will present greater risk to the organization than if instead you learn to crawl, walk and run. Watch this next transition, everybody. See, all those pieces actually go up into this thing now, which is something that my colleague Melanie Mecca and I have been talking about for a couple of years. It's a relatively new development. It's a CMM for data management. If that doesn't mean anything to you, it's a capability maturity model.
But the nice thing is that your boss knows CMM. We've done a great job and Carnegie Mellon has done a great job of propagating their process improvement process. And this process improvement process is now applicable to the data world as well. So those five pieces that I showed you in the previous slide that were foundational now mean that we should manage the data coherently. Most organizations' data is managed very well at the work group level, but not so well when you try to get all the work groups to move in the same direction at the same time. That's the definition of strategy. On governance, there is a professional class of individuals, as you'll see from the presentation today, that we can now call upon to do good professional-quality work, so that we can determine whether data is in fact fit for use, and whether we are approaching data with the right technology and the right set of processes around it. These make big differences and we can now start to work within that context and see it. So what you see here between the iceberg slide and this one is a way of improving the foundational practices around it. And one last piece of background here is the Data Management Body of Knowledge. Somebody with a marketing background, please help us. We keep naming things badly. We named this after the PMBOK, the Project Management Institute's Project Management Body of Knowledge. They did a great job on it. And 20 years ago, somebody may or may not have known what a project manager is, but everybody, thanks to their efforts, now knows that if you are a project manager, you probably have to have the certification in place. And they call theirs the PMBOK, so we called ours the DMBOK. Well, okay. Anyway, the point here is that for the first time, and I say the first time because all these things have come about since the year 2009, so it's relatively new, we can now look and see what it means to do data management.
There are 10 pieces of the pie here, and the one circled at the center is data governance, which we're going to focus in on today. And each of these needs to be done in a way that lets us improve the process. Before we leave the DMBOK here, we're going to look at just the overview input, process, output diagram, the IPO diagram, for the DMBOK. This is from page one of it. But what you see here is that these are the inputs that go into governance. Of course, we're going to dive into a subsection of this today. Let's go ahead and do it. Business needs for data stewards. And again, Mike, I think this is where you jump right in. Oh, that sounds good. So the question becomes, what happens with the data and why is it a challenge? Why does data get into an unmanaged state? I think the unmanageability is not just something that happens. Sometimes I think a lot of people, particularly some clients we've dealt with, feel like it's an oddity that they've gotten into this state. And it's really what I consider to be tantamount to a fundamental law of nature. That's just the way it happens. As an organization grows, the data becomes unmanaged and difficult because attention is being paid elsewhere. There are probably a few rare cases where people have adopted data management strategies early in the lifecycle. But 99% of the time, the need and foresight for a unified organizational outlook about data is only dealt with once pain points have come in. So data naturally goes into an unmanaged state. So in the current state, you have little ability to communicate about the data. There's unknowable documentation and there are data inconsistencies. Where you're trying to get to is a managed state of data. In an ideal state, the goal you're going for is a unified organization of data. Data is an asset with known ownership. The names and language of the data are consistent so everyone is on the same page.
And the data and metadata are meaningful across the organization. Good uses of the data cannot be achieved if the data doesn't mean anything to anyone. Data has to work at the most granular level in order to be correct, which is the most detailed. And that turns out to be the hardest. And a lot of people don't understand that that is the hardest part of the problem. Right. Data is a lot. The reality is that there's a lot of it and it's not easy to manage, but it is an asset. And I believe Peter always says it best. I can't always replicate it as well as he can, but it's the organization's most durable asset. Durable asset. And it has to be maintained. So you need data governance around it. So you look at governance, and we have a lot of definitions of governance that have been evolving over the past couple of decades, really. We used to call this data administration, if you go back far enough; it seemed like a natural thing. We started calling it data governance more recently. And there are a couple of definitions up here from DAMA and a couple of colleagues. The first one is the exercise of authority, control, and shared decision-making (planning, monitoring, and enforcement) over the management of data assets. Okay. That's a reasonable one. I have to say it's a good one because it's DAMA and I was president of DAMA for a couple of years. So yeah. Our good friend Rob Seiner, from whom we grabbed a couple of key definitions in this presentation, and Rob is the editor of TDAN, which is also out on the DATAVERSITY website there, calls it the exercise and enforcement of decision-making authority over the management of data assets and the performance of data functions. Then my colleague Steve Adler, and we were both at a White House function just a couple of weeks back, where we still joke about a debate we had a long time ago about whether data is an asset or not. But he says it's coordinating communication to achieve common goals through collaboration.
Mike, was this your definition here? That was not mine either, but I like that one. We don't really battle over data instead of kids. All right, we're kidding on that one. But we don't want people to actually think that it's an esoteric thing. And really, the definition is going to be personal to your organization. The simplest definition, the one that we like the best, is that we talk about data governance as being managing data with guidance. And when you say that, the real aha moment comes from management, who say, oh, so if we're not doing data governance, we are managing data without guidance. And that's exactly where you started the story, isn't it, Mike? Right, that's usually where you've already been. That's why you've gotten to the pain point where you're starting to consider governance. Ideally, you will think about it ahead of time, but that's not the majority of the cases. So in practical terms, you threw me on that one. I thought I'd got you off the script there. Right, you got me in a little different order than I was expecting. Data governance is defining how data should be managed within an organization. And this includes the designated people who are responsible for the stewardship of the data, and the governance rules that have been defined, which include a lot of things like data capture, data quality, policies and procedures, and management of the process. Sorry about mixing those slides up in your area. The key for that is to understand that if we're doing it without guidance, things happen the way individuals think is best. But unfortunately, you can't win a war by every soldier going off and doing what they think is the best thing. That's because of what we call a cascading effect. And we're going to throw this into the context of quality, although it works in almost any aspect of systems development. We are people and we make errors.
And when we make errors, if we don't catch those errors, those errors propagate through the rest of the system. So I'm dividing this one up here into requirements, and there'll be some errors. I'm not suggesting that it's 50%, but I'm also guaranteeing you that whatever we do is not 100% perfect. Then I move to the next stage. Now remember, requirements is about specifying what and design is about specifying how. So now I've done some what stuff and I'm going to say how to do it. The how, then, right, turns out to also have some errors in it. But if I don't catch the errors in the what in the first place, I'm going to do the how correctly for the wrong what. And I hope that makes sense to everybody. If it doesn't, ask us questions when we get to the break, because I'm going to build the rest of the slide here for you and say that much of what IT systems development is about is trying to minimize the amount of red on these diagrams. And each one of these blocks doesn't represent a percentage, but it represents a category of errors that are occurring in here. And the really interesting part about this is that once we get this piece going, we then go and study and say, how much does it cost? Well, it turns out, Mike, if you had given me a set of specifications early on and it cost us a penny to fix those errors during the requirements stage, if I didn't catch them until design, it would cost 50 cents. If I caught them at the coding stage, it would cost me a dollar, a hundred times more at the implementation stage than it would at the requirements stage. If I didn't get them until the testing stage, it would be 200 times. If I got them at acceptance, the does-this-fit stage where we're handing it to the customer, it's 500 times, and in the maintenance phase, it's $20 to fix that one penny error because of those cascading effects, all of the complexity, et cetera, et cetera. It's just like compound interest, except in the wrong direction.
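As a rough illustration of the escalation just described, here is a minimal Python sketch that turns the quoted figures into a lookup table. The multipliers are the speaker's illustrative numbers, not measured data; reading the testing figure as 200 times the requirements cost is an assumption, and the names `COST_MULTIPLIER` and `fix_cost_cents` are made up for this example.

```python
# Relative cost of fixing one error, by the stage where it is caught.
# Figures are the illustrative ones quoted in the talk (requirements = 1x);
# the testing value of 200x is an assumed reading of the spoken figure.
COST_MULTIPLIER = {
    "requirements": 1,
    "design": 50,
    "coding": 100,
    "testing": 200,
    "acceptance": 500,
    "maintenance": 2000,
}


def fix_cost_cents(stage, base_cents=1):
    """Cost in cents to fix an error that would have cost base_cents at requirements."""
    return base_cents * COST_MULTIPLIER[stage]


for stage in COST_MULTIPLIER:
    cents = fix_cost_cents(stage)
    print(f"{stage:<12} ${cents / 100:.2f}")
```

Working in whole cents keeps the arithmetic exact; the one-penny error comes out at $20.00 by the maintenance stage, which matches the figure in the talk.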
It's always, I mean, it's so intuitive to think about it that way, but when you really see the comparisons right next to each other, it really hits home visually. And our measurements show that we only detect 50 percent of the problems during the maintenance phase, which is why maintenance is 80 percent of the expenditure that we do on IT and all of that. So let's look now at some principles relative to that. This is kind of a complicated diagram. We don't want to blow you guys out of the water, but this is the kind of explanation that people often try to impart. Now, the goal of the slide here is really to talk about different approaches to how the governance is done. So the first model that we put up here, and again, very, very tiny type. Don't try and read this. That's not the issue. We're giving you a reference piece and you can go back and look at it later on. But the first is a totally decentralized model. Again, for the things that Mike was describing, the proper term is entropy. Things happen because it seems natural. People are trying to do the best things and they have a good intuition as to the way it should be done. Different backgrounds come along. Mike, again, your background as a physicist is going to give you a very different perspective than somebody else who's coming at this from an organic chemistry perspective, even though you're both coming from the science community. By the way, in case you guys can't tell, I like people with lots of backgrounds. It's generally a good thing in data to bring them in that way. So we get to the next model then, and this model is a federated model. The idea here is that we're going to have some cooperation that occurs between different parts of the organization. There's sort of a little bit of top-down, but you can see there is also some side-to-side communication in this instance here as well.
Then there's a centralized hybrid model, which says, okay, for these things, we're going to keep these things centralized. Maybe, for example, an easy way to conceptualize this is that our definition of customer will be used throughout the entire organization. But your definition of vendor might apply differently because different countries have different ways they implement vendor. Of course, the extreme example of this is totally centralized. Again, highly dictatorial. Some companies are very good at implementing this. Some companies are not. Again, we show you this to show you the range of the way things can occur. Your organization might hypothetically be somewhere between decentralized and federated, and your road map says what we'd really like to get to is over here. The point here is to understand that none of this occurs quickly. It can take a long time to do this, and the people who are going to be doing this kind of work are the people for whom you need to make sure that stewardship is an important part of what they're doing. When we look at this in practical terms, everybody shares the data that goes around back and forth, and we're trying to identify the specific data elements and stewards that are there. Mike, you and I run into this all the time when somebody says, well, I've got SharePoint, and we both kind of go, oh, sorry about that. By the way, SharePoint is fine software. We're not making fun of that, but SharePoint, without the governance structure around it, becomes a very unwieldy project and basically a place for everybody's own slides to go. Right, yeah. You have that issue, and then you have the federation of dozens and dozens of spreadsheets spread throughout the company, and that makes this kind of problem with data governance much more complicated. Which really says that if you have SharePoint, you need to have a steward for SharePoint in order to do this. Again, SharePoint is one of the many things that you can use to do this.
Collibra is another terrific set of technologies to look at from a data governance perspective. Then we have a subset of everybody that becomes the data stewardship committee. What they're doing is trying to get ownership of a component of the business. Again, it might be manufacturing, it might be production, it might be sales, different ways of dividing up the business, and these business data stewards represent a function in there that works as best they can to keep things in play and keep things running smoother. Now, again, remember back to where Mike started this off. You thought you had a great plan and everything was wonderful, and then you were successful. All of a sudden you had a lot more than you thought. You didn't have time to keep everything as neat as you did. Entropy sets in and we have sort of a mess on this. Well, can we make things better? That's really what it's about, trying to make things better in that type of context. The Stewardship Committee is going to have a relationship with a group of people that are the data stewards. It's a little bit of a misnomer to say they define data elements, because what we're really trying to do is to make a formal definition, but we're not doing this in an abstract sense. We're saying we are using this field for the following purposes. One of my favorite examples of this was a health system we worked with recently that had a field that was very clearly labeled admit date. It's pretty easy. And as the dictionary said, this is the date somebody was admitted to the health system. Well, the stewards went around and did an investigation, and in practice found 12 different uses of this. Now, that was, in and of itself, interesting. But the cost of that alternate series of uses was millions of dollars a year for the health system. So it wasn't just that we're not using it the same way. It's that not using it the same way costs us money. Right.
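To make the admit date story concrete, here is a small Python sketch of a data dictionary entry that records the formal definition, the responsible steward, and the uses actually observed in practice. Everything in it is hypothetical: the `DataElement` class, the steward label, and the two divergent uses are invented for illustration, since the talk does not say what the 12 real uses were.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One entry in a (hypothetical) steward-maintained data dictionary."""
    name: str
    definition: str                 # the formal, agreed definition
    steward: str                    # who is accountable for the element
    observed_uses: list = field(default_factory=list)

    def divergent_uses(self):
        """Uses found in practice that do not match the formal definition."""
        return [use for use in self.observed_uses if use != self.definition]


admit_date = DataElement(
    name="admit_date",
    definition="date the patient was admitted to the health system",
    steward="business data steward (hypothetical)",
    observed_uses=[
        "date the patient was admitted to the health system",
        "date the record was keyed in",        # invented for illustration
        "date of first outpatient visit",      # invented for illustration
    ],
)

for use in admit_date.divergent_uses():
    print(f"{admit_date.name}: used as '{use}'")
```

The point of a structure like this is only that the definition and the real usage sit side by side, so a steward can see where they have drifted apart.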
And that becomes an important trend. I think probably everyone on the call has had the experience of having a data field that is being used in a way that it wasn't originally intended for, and quite possibly being used in a way 180 degrees different from what it was originally intended for. That is just very common. That's the benefit of us, of course, being consultants: we see a lot of different things. And this is where you learn this. Whereas if you've only worked in one area, you tend not to have that bigger perspective on all of it. Right. And I think one important thing about define data elements, which you were getting to, is that it's not about the data not having existed before. And it's not about the data steward being in a role of saying, well, we've never recorded admit date before. We need to start recording that. That's maybe tangentially related to it, but that's not really the main goal. The main goal is that the data is already there and the data stewards are there, along with the data governance system, to make sure that it's being maintained and controlled in a way that the business can use. Because if we don't, it costs us millions. Right. And that's the part that we're all practicing to get better at. It really is the definition of the team. The governance team pulls together all of this activity in a way that helps the organization better manage its own non-depletable, non-degrading, durable, strategic asset. I've only been saying that for five years. That's why I'm saying that. Right. It rolls off your tongue a lot easier than that. Exactly. A lot of practice on that. This is where it gets into the real heart of our presentation here, where we're talking about the data steward. So, going back to the original definition, you know, when we defined data governance, the data stewardship is really operational. I could say it's where the rubber meets the road.
It's facilitating the everyday operation of the rules and structures defined by the data governance program. In addition to the pure management of those governance guidelines, stewards commonly end up having the most knowledge about the data and how to extract knowledge from it. A steward gives formalized structure to the tasks that had previously been unformalized. Going to that. Well, key there is that, as you said, it formalizes an activity that was unstructured before. Right. Before, it was a little more loosely defined. And that's what we're trying to get away from. Yes. The other part of it is, of course, we need to have some training around it. We can't just grab anybody off the street and say, hey, you look like a data steward. Get over there. That's a great picture of a data steward there. I love that. I looked for a long time for that one. That's a super one, right? Right. The one I was talking about before, which I kind of like, it's a little off the beaten path, but I couldn't find a good image of it. What I always think of, in my mind, when I'm thinking about stewardship, is the handlers. You go to a theme park, and you see the dress-up characters in costumes. There's always a handler along with them. In some respects, their job is to protect the characters from the three-year-olds kicking them in the shins and from the teenagers making fun of them. In some respects, it's to make sure that they are heading in the right direction and being presented to the public, because those characters are a valuable asset, and they're taking care of them. That's the off-the-beaten-path image that I like to think of when I think of a steward. If anybody has any images of that, we did search the internet for those images. We couldn't find them. Give us a couple. It was surprisingly hard to find some good images on that one. Absolutely. A couple of slides here on a stewardship position that we pulled out of an e-mail just from two days ago.
This is a company that wants to have somebody mid-career, with two years of experience as a steward. I'm not going to read this to you at all. You have them so that you can take a look at them. Notice a master's degree or an MBA is a plus. They're trying to find somebody who's got eight to ten years of experience querying data, identifying anomalies, gaps, et cetera, et cetera. Again, a lot of essential functions in here. If you look at the one I've highlighted in blue, performs data analysis for various enterprise-wide quality initiatives, that's a full-time job right there. I don't know where they're going to find time to do any of the rest of this stuff. Look at the background in here. Does it show up in here? Not sure. It needs to be in there. Actually, I really believe that very strongly because I think if you grow up strictly in an IT environment, you tend not to think of the rest of the world as much. I agree with you 100%. Finally, I like this last piece. It's also from the same job description. It may be required to sit and review information on a computer screen for long periods of time. Oh, my goodness. What's it like to be a steward? What is it that you have to have? Becoming a steward, let me see. Right. A specific background in data is not necessarily required. A steward is a role that's related to the organizational data. It's not necessarily a position unto itself. It's a role-based thing. In some cases, it can be a specific position where that is the entirety of that position. But not always. There are multiple types of stewards depending on the specific responsibilities. Ideally, you do have someone dedicated to that role. The title is not important. As we all know, different companies have completely different title structures. I think in that job posting we just had up, I don't think we saw the word steward in there anywhere. It did talk about data governance. It's not always as common a term nowadays.
More importantly, within a company, steward can mean different things. I think, Peter, you were saying that in Europe, steward means something very different than it does here in America. It's like you serve wine and cheese. Right. It's important not to get hung up on the word and the title of what that might be. I don't know as much about the state of data stewardship certification. We're making progress. There are different types of data stewards, which we're about to get into, that take care of different responsibilities within the organization. As you said before also, it is about trying to get some training in place that works on formalizing that specific accountability. Right. It is something that's not necessarily part of the mainstream. So some training around what that really means is key. So there are a couple of different basic types of what we call data stewards. The business data steward manages the business elements of the data. Data definitions, data quality programs, that sort of thing. The technical data steward is focused on the technical, of course. Systems and models, code, data architecture. Are you telling me that the technology side doesn't use the same vocabulary, that there's a difference there? Not anywhere out of this. We're laughing. Sorry. Good luck with that one. The column is X124075 and you're going, that's a customer. Right. And you're going toward an ideal state where the stewardship and the governance give you that common language around it. But most of the time you're not starting there, and even when you do end up there, operationally the individual parts of the business, the technical side, and individual projects will still have, you know, a local lexicon that will define that in a different way. And so it's important to have those roles split out. And then lastly for the basics is more of a project data steward. The project data steward works with the business and technical data stewards.
They're implementing that steward role within a particular project or perhaps within a series of projects. They're communicating back with the business and technical data stewards. In a sense, the project data steward is acting in place of both of them. So they're kind of acting in either one of those roles, but as the person representing them within that project. And project implies there's some sort of change, some modernization that's going on, some evolution. M&A may have put these things together, or you may have changed systems from PeopleSoft to SAP. And this is where you're getting focused in. Right. Or not even necessarily change, but new projects being added, new systems being put into place. And that steward's role is making sure that within that new or changed system, or in relation to that changed system, the governance policies are being followed. I want to note that we're pulling a couple of these definitions from our friend David Plotkin's book, which we'll give a little plug for. And I'm pretty sure Shannon has it out on the Data Diversity bookstore: Data Stewardship: An Actionable Guide to Effective Data Management and Data Governance. But we're not done with the definitions, are we? No, we still have ancillary. Kind of side definitions. Side types of stewards. Domain steward, what would that be more about? I'm sorry, you're pointing at it, right. Domain steward, again, across different business areas, looking particularly at integration across multiple pieces. Right, that are going to come into play there as well. And then an operational data steward is really focused on performance in many ways and cases. Now, one of the things with all of this, though, is it's not necessary for you guys that are getting started in this to have all of this stuff down up front. We wanted to show you a range of these things, but the general responsibilities do come into play. Right.
You certainly shouldn't get to the point where you're saying, oh, well, we don't have people to represent all five or all three of our data steward positions, so we can't implement this process. That is not where you want to be. It is important to try to implement those roles and get to the ideal state eventually, but you need to start somewhere. Data steward responsibilities. On this one, so for all data stewards, they're responsible for quality control and coordination with the other stewards, making sure you're working within a team environment. So if you are in an environment where there are multiple stewards, and you've gotten to that point where either you're a large enough organization or you're in a data governance system that's managed well enough that you have multiple stewards, you're communicating with each other and working as a team across the board. And just to note, in the Commonwealth of Virginia here, we have a group called the Commonwealth Data Stewards. The state government has in fact established this role and adopted the responsibilities for all stewards at all levels. But just like you showed in the other pieces, they were divided up into a number of different functions. Right, right. The next group would be for data entry. When data is coming into the organization or into the system, when metadata is being defined and captured, the stewards have responsibility for making sure that that data is adhering to the governance policies. That's sort of a vacuum cleaner function. Then once we've got it in there, there's a transformation function. Right, and that's just the other area of responsibility. It's essentially the same agenda for the steward, where they're making sure that the policies and the quality rules are being followed. A quick little story on this one. I worked for a company at one point where they had a steward that actually put the wrong rule in place and caused the company to understate their earnings for several years. That's an oopsy, right?
I've been on a few projects where data integration is where this really lives. You're doing transformations, you're doing ETL, and I've been on several projects where the data comes in, and either the initial data wasn't up to par on the governance rules, or it's been transformed in the wrong way, or it's missing something, and then six months down the road they find the error because it wasn't caught during transformation. Not only do the entire data warehouse and data marts have to be backed out for six months and reloaded, every single piece of analytics and reporting coming out of that for six months is suspect at that point. There's an incredible amount of cost to back that out. The really interesting thing that you noted there, Mike, is that IT doesn't pay those costs. The business pays the cost. That's one of the things that we're trying to get people to be a little bit more aware of, which is to say the average data warehouse should not require seven bills and yet it still does even today with that kind of thing. Big price loops precisely because of some of the things that you said. Right. And then finally, of course, on data consumption. At the end point, or not necessarily at the end point but along the way where data is actually being used, data should be used correctly from whatever the repository happens to be. There's a lot of guidance that occurs in all of these different areas. And really what we're talking about goes back to managing data with guidance. The stewards are the ones that develop the guidance and make sure it is implemented. Right. It can be as simple as the data steward having knowledge of what that data element is based on the governance policy: here, we've defined this access point as this particular hospitalization date. And what does that really mean? As everyone knows, you can look at a single data field.
And particularly when you start in with the unmanaged state, that could mean five different things to six different organizations. But at the point of usage, the data steward's responsibility is to make sure they're using that data properly and using correct definitions, so they're not using it thinking it's A when it's actually B. So you've got some characteristics of stewards now. Accountable. The data steward really needs to be focused on fulfilling the duties of maintaining meaning and quality for that data. Person responsible. Identifying that there is that official process. And it is acting as a single point of contact, so people know where to go for a business function. We have a saying in IT: which throat do I throttle? Right. We don't want to make this sound like it's all bad for everybody, but it is very helpful to have this person that's in charge of all customer data for the organization. By the way, we'll give you some guidance here. Never ever let anybody define something at as high a level of abstraction as customer. It's too abstract to be meaningful. You can talk about current customers and VIP customers and past customers and all sorts of other things, qualified customers. But if you just talk about customer, that's where you're absolutely going to get shot. What's our next one? Authoritative. Not only knowledgeable about the data to answer questions, but having the authority to assign and oversee work and, of course, to be able to enforce those decisions. In other words, they have to be empowered with the ability to actually perform their job. So how much executive support do I need to have somebody doing this? That sounds like a trick question. The more the better. Right. As much as possible. If the boss comes back and overrules you, right, then you're in really bad shape. So it's a good idea to tell this and make sure everybody understands all the way up and down the chain that the stewards are your frontline in terms of doing this.
Right. Yeah, you don't want to be in the position of putting a steward really anyway in your organization to have that accountability without the authority, because you're just setting up for it. They're in a losing situation and you won't be able to... Nobody likes it. Absolutely. And finally get some personal qualities, too. Yeah, that organization is, of course, really important. There are a lot of moving parts around being a good data steward, because, you know, the data has a lot of moving parts of them. They are a longstanding asset. Proper organizing minimizes the negative effects without it. There really can't be... There's a lack of trust in the data and in any results from that data. Of course, none of our business has ever changed, right? No. But that's where... Probably got you to the point where you're having these pain points is because, as we talked about in the beginning, you've grown to the point where the data is past your capacity. So that organization is key. I worked for one organization where I had a little interesting story trying to tell them the value of stewardship. And one of the things I said was that you have a lot of employees, several thousand employees that this organization had. And those employees were governed by a series of HR structures because they were relatively speaking important, but customer-facing. There's a lot of tribal knowledge tied up with them. And I said, so it looks to me like you have about one HR manager for about every 100 of your employees that you have in there. And that seems to work well. Your employees you're pretty satisfied with and it's working pretty well. I said, you've got some really big piles of data and you've only got three people watching the data. It doesn't seem like the right number. And that analogy worked well for them. They kind of went, yeah, okay, we do have different people and they have different needs. 
So we ought to have some stewards that would come into play and take a look at all of this. Right. So again, authoritative, accountable, organized, good practices to have on all of them. Let's talk about how we take stewards and try to help get them towards what we talk about as a data-driven culture. And we've actually got a little bit of coming out on this later this month where we're going to talk specifically about what data-driven means. But in this instance here, there's a couple of options. Yes. And a minimally intrusive option, it's really about getting people into those roles naturally rather than forcing them in. We call this sort of a bottom-up kind of a process where people kind of say, if the data was better managed, we'd actually be in a little bit better role. Of course, the opposite of that is top-down. Somebody says, oh, we're going to do data governance, we need data stewards, so we're going to find some people and you look like a data steward and you look like a data steward, therefore we're going to go. Right. It's an authoritative process. And the first one, the bottom-up, the bottom are really the people who are probably going to be feeling the pain points first anyway. And so often it will be them that are driving this. Many organizations however, find themselves in what we call the two-by-four category. And that's kind of an interesting one because that means somebody's going to beat you about the head and shoulders with a very large object and say, you know, the IRS is mad at us, the government, the regulators, something along those lines. This is not optional. You guys will be doing this. That's actually kind of a happy state to be in because it makes your and my job a lot easier. We can look at it and say, well, the government says you have to do this, therefore. But very few times do we actually get to have that kind of a luxury. Right. That's not typical. 
It's much more about trying to cajole them into this, that it's a good idea. And again, you go back and say to them over and over again: data is an asset. You have other assets that are managed. Somebody is managing your data asset. Right. Yeah, my experience has always been that you definitely want to be in the position of helping someone perform a change and institute a policy when they have been convinced that they need to do it. It may have been handed to them from on high. It may have been something they were convinced of internally. You don't want to be in the position of coming in there and trying to convince them this is what they should do, because that's a much harder process and probably means hitting people with two-by-fours a lot. Right. I would imagine. So tell me about this picture because I think you pulled this picture out, didn't you? I don't remember. That might have been doing as well. It wasn't yours. I wonder where that one came from. All right, so what is a steward doing? They're trying to take the policy that is difficult for people to understand, because people don't really get data as an asset and so they don't realize that managing it well helps you improve the quality of that asset, and consequently the stewards are going to take the operationalizing role of it. So let's go back to our initial piece. Requirements are the what; design is the how. Stewards are designers of ways of making data better utilized within the organization. That's the nature of that operationalization role in there. You talked specifically about proactive versus reactive roles, right? Right. So proactive, you really want to build that into the system so that you don't get to the point where the data becomes too unmanaged and you're reacting. So the matrices that we talk about here are going back to that formalization that we talked about.
Again, right now I know for sure that Jan handles this particular type of data, and that works really well until we realize that Jan is actually near retirement age and Jan has 40 years of experience, and it's really going to be useful for us to sit down and formalize some of Jan's knowledge into a series of matrices, RACI charts, what her perspective is on the software development, how she works in project planning, in order to come up with all of these various ideas. Because if we don't formalize it, when Jan retires, she may or may not decide to talk to us after she's retired. Right. That's the nice thing about retirement. You get to make some choices. And it's not just retirement. You can get hired away by another company or something along those lines. The point is people are a weak link in your chain, and so the more you can do to put in place things that allow the organization to formally understand this, the better. Some of you may not be familiar with some of these tools in here. We've got a much longer version of this where we can go into this in more detail. But we talk about a governance activity matrix in particular. We're talking about who does what under what circumstances. A RACI chart covers responsibility, accountability, these types of concepts. In software development, it's that you can't actually use somebody else's data unless you know where it came from and what it was designed to be used for. And again, project planning and all of the rest of it. And the point here that you made, which was that we're not necessarily having to talk to them about governance processes, is really just to say these are good ways of doing business. But we're going to talk about them as governance because it falls under that rubric right now. But a lot of people look at this and say, this is really just common sense, isn't it? Right. I mean, it's really just basic processes and organization to make sure that data is what you think it is.
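To make the "who does what under what circumstances" idea concrete, here is a minimal sketch of a governance activity matrix expressed as a RACI chart. The activities and the role assignments are hypothetical illustrations, not taken from the webinar slides; only the steward types (business, technical, project) and the expansion of RACI (Responsible, Accountable, Consulted, Informed) are standard.

```python
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"   # does the work
    ACCOUNTABLE = "A"   # single point of sign-off
    CONSULTED = "C"     # asked for input before the work
    INFORMED = "I"      # told about the outcome


# Hypothetical governance activity matrix: activity -> role -> RACI code.
MATRIX = {
    "define data element": {
        "business data steward": Raci.ACCOUNTABLE,
        "project data steward": Raci.RESPONSIBLE,
        "technical data steward": Raci.CONSULTED,
    },
    "implement quality rule": {
        "business data steward": Raci.CONSULTED,
        "project data steward": Raci.INFORMED,
        "technical data steward": Raci.ACCOUNTABLE,
    },
}


def problems(matrix):
    """Return activities that break the usual RACI convention:
    exactly one Accountable and at least one Responsible per activity."""
    issues = []
    for activity, roles in matrix.items():
        codes = list(roles.values())
        if codes.count(Raci.ACCOUNTABLE) != 1:
            issues.append(f"{activity}: needs exactly one Accountable")
        if Raci.RESPONSIBLE not in codes:
            issues.append(f"{activity}: nobody is Responsible")
    return issues


def accountable_for(matrix, activity):
    """Answer the 'single point of contact' question for one activity."""
    for role, code in matrix[activity].items():
        if code is Raci.ACCOUNTABLE:
            return role
    return None


if __name__ == "__main__":
    print(accountable_for(MATRIX, "define data element"))
    for issue in problems(MATRIX):
        print(issue)
```

Writing the matrix down this way is one form of the formalization discussed above: the second activity deliberately has no Responsible role, and the check flags it instead of relying on someone like Jan to remember.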
Once you get to terms like data governance processes, to everyone else in the business, that becomes just almost government-like gobbledygook. So it's good to keep away from that whenever you can. But your point here is also that many people, when they find out that they are going to do data governance, think that's something done to them. Right. And they become fearful, because change makes people fearful in many cases. And for the most part, it's things they're already doing. It's more of a coordinated version of what people are already doing. People make comments about it: oh, I already was a data steward. That's really cool, right? Then they ask you for a raise, which is cool, right? You want to pay for that particular good set of services. So to contrast with the proactive, of course there's a reactive approach here, which may be that you're just coming about this by saying, you know, we're spending a lot of money in our organization on what Tom Redman calls hidden data consumers. And I apologize, Tom, it's not exactly the term that you're using in your new book, but it's that every time we get the data from the factory, or wherever, we have to go back and improve the quality around it. And because we have to do that, it takes us an extra two or three hours before we can do something else by moving the data on to somewhere else. These are hidden consumers of resources that are in there. And the reactive part is, yeah, we're just going to keep doing it. The proactive part is, you know, instead of them continuing to give us polluted data, we ought to go back and find a way to make sure that the data they give us is of much better quality. The key is, of course, that this affects the domain stewards in particular, who are working within an area, and maybe the cross-functional nature of them will allow that problem to be translated a little bit more easily.
In a reactive approach, you may conduct a root cause analysis and try to find out what is the precise problem that's happening in there, and then recommend solutions so that you can go back and say, you know, we've got five people that spend two hours a week doing this, and they get paid $100 an hour, the poll is voted, okay? They're paid $100 an hour, that'd be really nice. And fixing that would save the organization X number of dollars over time, so that it does make sense to go in and resolve the issue. So that's sort of a bottom-up, bubbling-up type of an approach rather than a top-down type of an approach on this. But again, either way that it comes out, generally the result is good, and we're just showing these as examples of where you might be in your organization on this. Now, let's talk specifically about how this starts to drive a data-driven culture. I'm going to give a shout-out to Gwen Thomas, again, another wonderful friend. She probably more than anybody else gets credit for popularizing some of the ideas in this area. She likes to talk about Big G, which is the idea of high-level governance, and then we also have little g, which is the bottom-up. So again, top-down versus bottom-up. Top-down, I'm going to oversimplify, but an executive on an airplane reading the airline magazine that's in the seat-back pocket in front of them reads, data governance will help your organization become more profitable, and the executive says, I'd like to be more profitable, so I should do some data governance. They go around and say, you look like a data governance person, Mike, so you're now in charge of doing this, and you start off with the policy and things in place, and put things out there for everybody to try and work from.
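The root cause arithmetic mentioned earlier (five people, two hours a week, $100 an hour) can be worked through in a few lines. This is only a sketch: the 52-week year and the one-time fix cost are assumptions added for illustration, and the function names are hypothetical.

```python
def weekly_rework_cost(people, hours_per_week, hourly_rate):
    """Cost per week of people manually cleaning up incoming data."""
    return people * hours_per_week * hourly_rate


def annual_rework_cost(people, hours_per_week, hourly_rate, weeks_per_year=52):
    """Same cost over a year; weeks_per_year=52 is an assumption."""
    return weekly_rework_cost(people, hours_per_week, hourly_rate) * weeks_per_year


def payback_weeks(fix_cost, people, hours_per_week, hourly_rate):
    """Weeks until a one-time fix at the source pays for itself."""
    return fix_cost / weekly_rework_cost(people, hours_per_week, hourly_rate)


if __name__ == "__main__":
    # Figures from the talk: 5 people, 2 hours a week, $100 an hour.
    print(weekly_rework_cost(5, 2, 100))    # 1000 dollars per week
    print(annual_rework_cost(5, 2, 100))    # 52000 dollars per year
    # Hypothetical: a $20,000 fix at the data source.
    print(payback_weeks(20_000, 5, 2, 100)) # 20.0 weeks
```

Putting a dollar figure on the hidden rework is what lets the bottom-up case be made to a manager in terms they already use for other assets.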
Or the bottom-up approach, which is the one we just described, the reactive approach, would be somebody coming to the manager and saying, you know, I fixed three data quality problems, and each one of those data quality problems has made us incrementally more profitable, but if we could attack a dozen of them at a time with a team, we would become more profitable faster in that process. The key, of course, to all of this is what you were talking about before, with that translation between the business and the technical side. Again, it's a control layer. They may be dealing with fields called Fenergal and GERC, right, which nobody's going to know the meaning of until we translate them up there and say, oh, well, that's a customer profitability index. We're going to pay a lot of attention to that aspect of it in order to do this. Go ahead, Mike. And I think it's important to note that, like almost everything else in life, it doesn't have to be this or that. It doesn't have to be that it's always coming from the top down, and you plan for that and drive it that way 100%, or from the bottom up. Often it will be a mix of both. You're somewhere where they're trying to define policies from the top down, and while that's happening, you have the occurrences that are happening from the bottom up and you're trying to solve them at the same time, and they kind of meet in the middle somewhere. And it's not one time. It cycles through. Right. What you're really talking about is institutionalizing the practice in the stewards so that they start to get better at what they do. And if they get better at what they do, their job becomes easier and their results become stronger.
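The translation layer between cryptic technical names and business meaning can be sketched as a small glossary lookup. The column name X124075 and the term "customer profitability index" come from the discussion; the column name CPI_07, the definitions, and the steward assignments are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    business_term: str
    definition: str
    steward: str  # who to ask about this element


# Hypothetical glossary: technical column name -> agreed business meaning.
GLOSSARY = {
    "X124075": GlossaryEntry(
        business_term="Customer",
        definition="Placeholder definition agreed by the stewards.",
        steward="business data steward",
    ),
    "CPI_07": GlossaryEntry(
        business_term="Customer profitability index",
        definition="Placeholder definition agreed by the stewards.",
        steward="domain data steward",
    ),
}


def translate(column):
    """Return the business term for a column, or flag it as undefined."""
    entry = GLOSSARY.get(column)
    if entry is None:
        return f"{column}: undefined, route to a data steward"
    return f"{column}: {entry.business_term} (ask the {entry.steward})"


if __name__ == "__main__":
    for name in ("X124075", "CPI_07", "ZZ_UNKNOWN"):
        print(translate(name))
```

The useful part is the failure path: a column with no entry is surfaced as a gap for a steward to resolve rather than silently interpreted by whoever happens to be using it.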
So another picture here that's a little bit problematic, and I don't want to take anything away from the picture, but this is one of those things: can you imagine it's three o'clock in the afternoon and you're waiting for people to get something done and they go, okay, now I'm going to walk you through this entire diagram here, right? What really matters on this is to understand that a policy-level action is translated into dimensions by the stewards and that the stewards drive the particular cultural data. Data governance is specific and personal to each organization that takes it on. Again, Walmart is not going to be the same as Target, it's not going to be the same as Kmart, it's not going to be the same as Sears, right? They are very different organizations and each one of them is going to do this in a very different fashion. But the stewards have the knowledge, skills, and abilities to take the cultural aspects of the business and figure out what's going to work faster in order to do this. They become the basis for turning the data into the data culture of the organization. A couple more slides on this just to give you an idea of how to improve this. I showed this one before, which is these are the things that we're going to try to measure, and I'm just going to give you the measurement framework here for the CMM, which we're going to apply specifically to governance. In governance, you get one point if you have a pulse. That's a pretty low standard. However, if what you do in governance is managed, if it is a repeatable type of function, then you get two points. If you write down what it is that you're going to do and provide guidance that you can hand to people, for example, that aren't local to where you are, you now get three points. Those managed activities, we can now measure. And if we measure them and say, how long is it taking? I'm not talking about standing over somebody with a stopwatch and saying, Mike, have you defined that data element yet?
But it's looking back historically and seeing what happens on a periodic basis. Now you can say, wow, we took some measures and we're getting better or worse at what we do. This framework is not new. This is the basis for TQM, ISO 9000, et cetera, et cetera. And when you take these two sets of components, the data governance area, which has some stewardship-specific dimensions, and we rate them, now we have a framework that we can put in place for organizations to measure themselves. So here's a group of insurance companies that we did that don't happen to have such good measurements. They're just really not helpful in there in order to do that. But the point is there is a way of deciding a path forward to help improve what it is that you're doing in this area. Now the stewardship maturity model really talks specifically about defining, looking at things like business glossary and roles and things like that, and then looking at authorities and control changes. In some ways it's a little bit like an IT auditor doing this type of work, although don't use those words, because those words scare people more than data governance does. We've both seen that one. And then you get into the measurements portion of it. What are we doing to measure the quality of the data? And showing that you realize you're making important decisions based on poor quality data, that you're repeating the data management processes, putting them in place and saying, let's get better at what it is that we do, and finally enhancing the overall performance of both the data and the stewards that are managing the data in all of that process. So again, just a brief view there into what's going into the maturity model. There is a nice formal model that you can use, and we're really thankful to the CMMI for putting that out there for us so that we can look at this.
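The point-scoring scheme described above (one point for having a pulse, two for repeatable, three for written down, then measured, then improving) can be sketched as a small function. The first three levels follow the talk; treating "measured" as four points and "getting better" as five is an assumption based on the usual five-level CMM ladder, and the dimension names in the example are hypothetical.

```python
def maturity_level(repeatable, documented, measured, improving):
    """Score one governance dimension on a 1-5 CMM-style ladder.

    Each level builds on the one below it, so scoring stops at the
    first criterion that is not met. Level 1 requires nothing
    ("you have a pulse").
    """
    level = 1
    for achieved in (repeatable, documented, measured, improving):
        if not achieved:
            break
        level += 1
    return level


def weakest_dimension(scores):
    """Return the (dimension, level) pair with the lowest score."""
    return min(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical self-assessment across stewardship dimensions.
    scores = {
        "business glossary": maturity_level(True, True, False, False),
        "roles and authorities": maturity_level(True, False, False, False),
        "data quality measurement": maturity_level(False, True, True, True),
    }
    print(scores)
    print(weakest_dimension(scores))
```

Note that the third dimension scores only one point even though it claims documentation and measurement: without a repeatable practice underneath, the higher levels do not count, which is the cumulative nature of the ladder.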
One last piece on this. We see this chart or a variation of it all over the place, where people will come in and say, here's my plan for doing stewardship. You say, oh, wow, that's really nice. But you know what? You don't actually know a whole lot about what you're doing. First of all, decide whether you are going top down or bottom up. And whichever it is, don't try and plan all this stuff. This is new for your organization. It's a new set of skills. Again, I contrast it with accounting, which has actually been around for over 8,000 years. The accountants are good at what they do, and they know that what they do works. We're still learning this stuff. So I don't mean to say that any answer is correct. We know from our experience that not all answers are going to be correct, but that it is much less important to have a rigid plan than it is to have a flexible plan that allows you to move forward incrementally and figure out what those steps are piece by piece as you learn how to use your stewards. One final piece on this that we'll dive into is something we call the data doctrine. This is sort of a philosophical approach to the subject. You may notice some similarities here to the Agile doctrine. They start out with exactly the same component. We're uncovering better ways of developing systems by doing it and helping others to do it. What we value more is that data programmes should precede software development. Stable data structures are more important to develop than stable code. Shared data is more important than completed software, and reusable data should precede reusable code. It's not that the things on the right are bad, but that we value the things on the left a little bit more. Now, this is philosophy, and one of the things that you can do as sort of a test if you're looking at stewards is to look at this and see if they understand it.
The first thing they'll notice is that I spelled "programmes" incorrectly on there. I did that deliberately because I want to distinguish that from software programs specifically, to make sure people understand that. But if people look at this chart and say, I don't understand it, they're going to need a little bit more training. But if they look at this and go, oh, my goodness, yes, that says things to me that I've been trying to articulate for a while, now you've probably got somebody who's really interested in the process. One thing I think of when I see this, another way to think about it, is that the stronger the items on the left are, the stronger they will make the items on the right. So the things on the right depend on the things on the left? Certainly agree. So we've covered a bit of territory here. Again, a quick overview of data management to show you that governance is a part of data management and stewardship is a huge part of governance. We've talked specifically about the business needs for stewardship, giving you some principles and also opportunities for how the stewards really do set the cultural definition of what's going on in your organization. We'll close here at the top of the hour with just a series of benefits of having a data stewardship program, because somebody is going to ask you sooner or later, what do you get from it, right? Well, data stewardship is a program. It's something that is going to continue over time, whereas a project is going to have a very definite beginning and an end. Programs are tied specifically to a financial calendar, so it means it needs to be funded in order to do that. And that funding will provide a better resource for the organization. Again, most organizations do not choose to have a human resource project, right? They choose to have a human resource program in order to do this.
And this program management is by definition governance intensive, and that consequently is going to give people the idea that, from a governance perspective, it's something that wasn't governed before. One of the fun projects we did early on was that we did governance for the Army. And in the Army, everything's governed. And they went, oh my gosh, something's not governed? We have to fix that right away. Easiest sale we've ever done in that type of context. But the programs are going to have a greater scope for financial management around all of this, and you've got to have a change management program in this, because if executives tell you to do it, they'll also tell you to have some change management. It's hard to get people to change the way they do things. So we are right at the top of the hour, Shannon. And let's see if we've got some questions for everybody that wants to talk to us about stewardship. My guy's calling me. The slide's out of order. You're fine. Good deal. Sorry, Kim. No worries. Indeed, we do have questions coming in. And to answer some of the most commonly asked questions, we will be sending a follow-up email by end of day Thursday with links to the slides and links to the recording of the session. So first question coming in here. Who is a data owner and how does she fit in with stewards? What would be the critical roles and responsibilities? So I'll jump in and say that one of the things we try to discourage people from using is the term ownership of data. For example, if you're in the accounting group you don't own any of the data that comes to you. It all comes to you from other parts of the organization that are saying, here are the sales, here are the expenses, figure out whether we're making money or not, and can you make payroll while you're at it? Non-trivial stuff. And I say that a little in jest, but a little not. We like to have process owners. Process ownership is a key, crucial concept.
But a data owner is actually something that becomes problematic for organizations, because as soon as it's your data, it's like we're going to arm wrestle over this next little piece here and see who actually ends up with ownership of that piece. Data belongs to the organization. So that is precisely the reason we use the term steward instead of owner, because a steward is somebody who takes care of it. Again, go back to Mike's example of the Disney character wandering around the Disney park. If you haven't had the pleasure of seeing these characters, they are wonderful. It's a big furry Mickey Mouse coming down at them, making the kids squeal in delight, but they can't operate by themselves. They have a handler there that is just behind them to make sure they don't trip over something or that somebody doesn't want to do that. And also they are looking for opportunities. If they see a kid that's unhappy... Right, because the character can't see that. The character is a little more one-dimensional. And so that person, that steward, is looking to make sure that it's used properly. And one way you might be able to get away from the term ownership might be data originator or data origination point or data user. It's a fantastic point because that does give them something that they can attach to without necessarily... Because after all, once you use it and it goes to another department, are you still the owner? No, but you are always the originator. Right. Gosh, I hate to say this, but you'll always be my first. Right. And what is really the point of ownership beyond stewardship anyway? I don't know. What would you do with that data that you're the owner of? There also can be subject matter experts, a person who understands the data well. You start getting into the data stewardship role a little bit there. Hopefully that answers your question. If it doesn't, please do push back and we'll give you some more. Certainly.
And from the same questioner: different lines of business, for example marketing and sales, might look at the same data, for example the customer, in a different manner. In that scenario, would it be wise to form a council of data stewards? I'm sorry, the question was, under what circumstances should you form a council? Yeah. So if you've got different lines of business looking at the same data, so if sales and marketing are both looking at the same customer data, but they're looking at it from different angles and different perspectives and making different business decisions based on that data, would it be wise to form a council? Almost always. Again, that's precisely why it has to happen, because you need that coordination. It could be the parable of the blind people feeling the elephant, right? One person's got the tail, and the data looks to them like the tail, it's ropey and that sort of a thing, and somebody else is holding onto the leg and it feels like a tree. Somebody else is on the side of it and it feels like a wall, and then there's the ear that's sort of a wall but tent-like, and things like that. That's precisely the focus of those councils, to do exactly that coordination. To what degree is stewardship in a federated model viewed as a practical function versus a compliance function? Depends on the organization. So we're seeing an awful lot of organizations that are saying the fastest way to become compliant is through data governance, and they're using that as a driver, and it's also easier to get money for it, frankly, because the Board of Directors understands that failure to comply can have serious impacts on the organization. I'll give you the most extreme example there, the Target data breach. Everybody's familiar with Target; in fact, one out of three people in the United States had their charge card replaced because of it.
They went after the Board of Directors. The CIO got fired, the CEO got fired, and there was nobody left to go after, so they went after the Board of Directors. Well, the Board of Directors, you better believe they said, whoa, wait a minute, we're liable for this stuff? Absolutely. So that was one of the highest profile reactions that has occurred in there, and so that was really sort of the wake-up call, I think, for an awful lot of organizations to see that. Yeah, at minimum you will have processes and documentation around some of these data issues, and that becomes the heart of the issue. Compliance. There's a word in the compliance area called attestation. Somebody is swearing or attesting that the data is in fact correct, because after all, we already told the story about the 5% under-reporting of sales. That's a problem. By the way, under-reporting is probably better than over-reporting, but nobody wants to get it wrong. That's the whole point. And we get this next question quite a bit. How do you determine how many data stewards are needed? 12. I like 13. 42. Yeah, exactly. Thank you. So that's really where it does become personal to the organization. Key, though, is don't go in and ask for 42 right away. Start out small. Show that they do good work and say, if we had more, we could do better. I mean, I think that's where you get to the point where you suss out business requirements. You figure out what needs to happen where. It's usually better to err on the side of smaller, less. And then with a single data steward, it becomes obvious that there are multiple needs that aren't served very well together. You have them split up. It's just so dependent on the unique aspects of what's needed in the organization. And I would also not go in and say, we need 12 of these, right? Let's get there. 12 business stewards and 16 technical stewards and 14 project data stewards. You're assuming things about the organization that the organization doesn't even know itself.
Instead, start out with just stewards. Right. Get them trained in the idea. And then you say, you know, the stewards are kind of busy. Maybe we ought to subdivide them up for efficiency's sake and have some of them do some things and some of them do other things. Right. It's a bit of a punt. But, you know, you have to figure out the more exact requirements around it. It's like someone saying, you know, I need a data warehouse. How much is that going to cost? There's a lot of aspects around that. 42. Right. I mean, with zeros behind it. Thanks, Shannon. Sorry, I'm giggling about 42. If you have heard many of Peter's speeches, you'll know he's printing a book there. If no principles or provisions exist for data governance, is it appropriate to begin encouraging the formulation of these, or should this always come after the business case for data governance? So, do the stewards precede the data governance case? If no principles or provisions exist for data governance, is it appropriate to begin encouraging the formulation? It's a hard one. I like to go back to this: right now, your organization is managing its data very well at the workgroup level. But on an organizational level, it would benefit from managing the data with guidance. And the stewards are the people who implement that guidance. So I don't see how you can expect to have the whole non-depletable, non-degrading, durable strategic asset of the organization managed without guidance. But we clearly have a lot of customers that are trying. Right. They kind of go hand in hand. I mean, you could go ahead and say that there's a steward in a position who is starting to do their job, but their job is related to the principles and guidance of governance. And so maybe they're happening at the same time. Otherwise you're setting yourself up for a bit of a difficulty.
Maybe, in a microcosm, one small division of a company can't wait for the rest of the company to move forward on a more comprehensive governance program. And so they say, we're going to set up our own steward. And because it's a smaller role, with a smaller amount of data and a smaller number of people, they're kind of filling the role of steward and governance facilitator and creating the governance policies. Maybe in that kind of case it might make a lot of sense. But at the end of the day, I mean, the role of a steward is to perform the duties of the governance. And if there isn't any, there's nothing to do. So maybe they're growing at the same time. Kind of chicken and egg. Right. By the way, did you know they figured out that the egg came first? I didn't know that. You can go look it up in Wikipedia. Your point is a really good one. We have seen organizations where we'll come in and work with them. And actually the World Bank was one of our groups where we found this occur, where they didn't have governance and stewardship in many of the other parts, but they did in part of the business part. And so the happy news for them was, you don't have to hire expensive consultants to come in here and show you how to do this. Walk down the hall and ask your colleagues. Right. You've had internal expertise on this. And that was super. So one of the other things that we'd like to do, we talked about abstraction before, and I was fussing about the customer being too high a level of abstraction. All the stuff we've had around the chief data officer sort of implies that that's going to be at the top. And our preferred term is actually enterprise data executive, because that individual can exist at a division level, at a group level, and be the person. So if you've got a group that's doing really, really well, then put somebody in charge of that piece, and that's where the stewardship will occur.
By the way, that's an excellent opportunity in your organization for an A-B test. These guys are doing stewardship, and these guys are not. If there's no difference, it may be a little tough to show the value there. I'm betting, as you are too, that there's going to be a difference, and it'll be a noticeable difference in the right direction. And so what would you call a role or individual that can make all decisions about a particular data set? So when you say decisions about the data set, I know you're not interactive here, but let's just put them in place. Are we talking about deleting the data set? That may be something that's a compliance issue. Are we talking about making a copy of it and giving it to somebody else? That may be a governance issue, but it may not be a wise thing to do. I think they're trying to come up with a... Yeah, making decisions, I mean, it depends on the decision, I guess, is really the answer. If the decision is what's the most proper usage of this data, it's probably the data steward. If the decision is we need to figure out a different way to transform this data and it's a very complex situation, it might go back up to the data governance or the stewardship council. It could be that the steward is working with subject matter experts on that. It could need to go back to the origination point on which is the right way to use it. Certainly the steward would be the first person you go to, and the responsibility of the steward at that point would be to figure out how do we proceed? Is it something that I'm supposed to make a decision on, or is it something I need to go back to some other person on? What we'd rather see, though, is that rather than looking at this by data set, we treat the data as a general class of assets and say, well, let's not talk about that specific data set.
Let's talk about data in general, and you're going to actually have a much better discussion with whoever's asking the question. I mean, can I have that data asset that's on your laptop right there? Well, there may be some good or bad reasons why you should do that. If I'm going to take it, put it on my laptop, and somebody's going to steal my laptop because I leave it sitting on the subway, that's not a good thing. And if you know I'm prone to that sort of thing, you may insist that I encrypt my laptop at least before I set it out there, assuming that that's the best you can do in terms of prohibiting me from getting it. On the other hand, if you heard that I was looking for a new job recently and wanted to have that list of all of our customers, you might be entitled to give it to me, but it still might not be a good idea. Right. Sorry, Shannon, we can't give you a good one on that one. No, I think that answered it well. And, you know, certainly the questioner should feel free to clarify if you want more specifics. Moving on to the next question: we are just getting started with a data governance program and are looking for ideas on how to engage data stewards from our different departments to introduce them to data governance. Do you have any suggestions on how to engage data stewards? So they're out there without work to do? You guys must have your data in beautiful shape. Well, like I said, they're just getting started with data governance. So with their data governance program, they're looking at how to... Yeah. We're being a little glib with that. Surely your organization has some challenges that are surrounding either the data or the business practices. I haven't said this yet on this particular piece, and I know it drives Shannon crazy, because I say it almost all the time, but when I look at IT projects, I find that data is at the root cause of 100% of them.
And so if you don't think it's a data problem, I can guarantee you it's got a data problem at the foundation of it. So find something that's not working well in your organization. And ask them to look at it just from the data perspective and see what they can do to help inform your understanding of the problem. And if they do, you'll be surprised at how quickly they get involved in business process re-engineering activities and other kinds of things that are really the more holistic stuff that we've talked about in several different ways around this particular presentation. The key to all of this is to understand that data stewards are, by nature, problem solvers. And the more they are engaged to solve problems, the more they will like to solve problems, because it's kind of a self-reinforcing cycle. If you solve a problem, people say thank you. It's always good to have somebody say thank you at your desk, right? I would encourage, and this may or may not be obvious, since it sounds like it's data stewards, plural, in that situation, making sure there's either something more formal, like a data stewardship council, or something more informal, where they have the ability to collaborate with each other and have, you know, instead of a code review, kind of a data stewardship review when they're going over problems. Not for them to check up on each other, but for them to have the experience and be able to collaborate on, well, you know, Susan in that department, the data steward, fixed this problem this way. That helps me know how to fix my problem my way. And that can be a great way for people to engage. I just cheated and changed the slide, so I said all ships need data stewards. There we go. Is there a general rule for selecting stewards from a specific organizational level, for example, specialists, middle management, et cetera?
So the job category that we gave you here was looking clearly for somebody with a lot of good experience. Do you always need to have someone with a master's degree and eight to 10 years of experience in data to get started on this? I think not. But what you're really looking for is an aptitude, if you will. If somebody is bored to tears by data and has no interest in it, that's probably not your best spot. But if you've got somebody, and we get them in college and university, where they're coming in and they go, I want to learn more about data, I think it's really cool stuff, that's what you're looking for, somebody that's got that kind of interest. They can learn most of the rest of this without having to put in the eight to 10 years of experience. And I assume you're talking about a steward who's filling a full-time role as a steward. Because as we talked about before, it's not always a full-time role. There are lots of times where it's a role, lowercase, where they are perhaps a project data steward. And obviously if they're a project data steward working within a system, you don't want to bring in someone completely outside that system, who has no information or knowledge of that system, to be a project data steward. That would be completely inappropriate. But I think in your example, you were talking about a full-time person. I believe so; certainly they can address that specifically if not, but I believe that is the impression here. Moving on, so here's an interesting question. We are building out, and finishing building out, this webinar series for next year. There's a question here. Will there be a presentation regarding IT data stewards, not business data stewards, in the future? What are the biggest differences you see there, Peter and Mike? Great question. We actually, as you said, Shannon, are just considering what we should be doing with next year's set of these things.
If there's interest in that, let Shannon know, because she's the one that becomes the gatekeeper in scheduling these things. The question was IT versus business data stewards. I'm going to put that slide back up there and we'll dive into a little bit more on that. As I said before, the real key is that we would like you to start out with just having stewards. Getting stewards in the first place is good. Yes. It doesn't have to be separated between the two or three or five, depending on how you're looking at it, different types of data stewards. But if you specifically think about a business data steward versus a technical data steward, there are some obvious differences there, where a business data steward is really focused more on, I would say, the requirements and business information and maybe the subject matter expertise around what the data really means and how the data is used by the business. What does the data quality really mean? If it goes over the year 2000 or under the year 2000, that means a lot more to a business steward than it does to a technical steward, who may not care whether it goes over or under the year 2000. On the other hand, the technical steward may be the person that understands how to use your Teradata system, just to pick one at random, and really understands the intricacies of how data is stored within Teradata. And so both are necessary at some level of understanding within the organization. Technical is going to be obviously more technical. They're going to be more Oracle, DB2, Teradata, SQL Server specific in there, and they're going to know SSRI, versus a business data steward who might go, SSRI? Is that like a company? Right. I would agree. I would say most of the time a technical data steward would be close to a DBA or a developer or an architect, but not every time.
Even if they're not, they don't have to be a coder or administrator or an architect of a database system, but they need to have knowledge and understanding of the technical aspects of that, and there are people who kind of bridge that gap, but they would be more technically based. Let's go back to the question that was asked earlier about how to find these individuals. So if you've got a group of DBAs, probably in there they're going to be very interested in the internals and how data is used, and if you've got a DBA that's a little bit more holistic in perspective and wants to get beyond the strictly technical roles, that's actually perfectly fine and appropriate. On the other hand, if you've got somebody in the business part of it, where they're really interested in how the business works and they're starting to have technical questions, both of those people are going to be good individuals to dive into this particular piece. Somebody else have a chance? All right. We've got a lot of great questions coming in here. We've got about nine minutes left, so definitely time for more questions if you've got them, and send them in. So how does data become a part of the SDLC cycle? You cannot successfully launch an application without data playing a key role in app consumption, manufacturing and distribution of data. Most projects focus on the build first, so how do we make data first? So that's a really great question, and I'm pulling up my data doctrine slide on that because it is kind of fundamental in nature. As I mentioned before, this is philosophical, but let me make a statement that says that the only way an IT project can work from a data perspective is if no data is shared outside of that project or they use 100% pre-packaged data products in order to come into it. So that said, there's clearly a mismatch in terms of the impedance that's in there, and that was really what Mike and I were just talking about just before everybody came online.
It is very difficult to do, and we're going to encourage people to start a dialogue on this. We think the website is going to be up in a couple of days on this, where it's going to be the datadoctorin.com if you're interested in carrying that discussion onwards. We're all looking for ways of moving this issue, which is a very fundamental issue, forward. This is some of the specialized knowledge and training that a steward would have to have, to say, if you're going to try to develop data within an IT project, don't plan on that data being used outside of that project unless you're willing to put something beyond the project into the project. It has to be a program, and that's why I spelled it funny on this particular slide. Great question. Sorry, we get real riled up on that one, but it's a good one. Right. Yeah. I mean, if the data is going to originate in the software being developed, then I think from the beginning, during requirements gathering, during requirements decision making, during design work, you're thinking about the data elements first, both the ones that are originated internally, even if they're primary data that the business is using, if it's metadata, if it's data being utilized by the system. In all those cases, you think about putting the data first in terms of the steward being involved and the governance policies being adhered to. You know, the way I'm saying it makes it sound like it's going to be this extra cumbersome process, but it's really not. It's just about, when you're doing those things, making sure you're checking off some of the things about how that data gets implemented and the policies it's being used by, and, you know, structuring that data first as a part of that design, and making sure of the requirements gathering around it. For instance, we want to, you know, have A plus B equals C.
Well, make sure we're adhering to what A and B and C are as part of the governance policy, is all we're saying. Because remember, if you don't get them right at requirements, those errors can propagate throughout the rest of your design, and if you don't catch them until it's too late, it's very expensive. Right. This is why you'll notice that the data models for most of the fundamental software packages out there have not changed since they were shipped originally, because it's just way too difficult to go back and make those changes. And that too-late point can be once the system is fully developed and it works fine in that system, and now that system, that new software piece that's been developed, wants to share that information with the rest of the business so that that asset can be utilized by the rest of the business. And that may be the point of breaking. It's like, oh, it doesn't work with everybody else. Because we didn't adhere to it. That's what Peter was referring to: well, yeah, if it's completely self-contained and that data never leaves that system, that may be one thing, but that doesn't really occur. And if it is occurring, the business is probably losing a huge asset in whatever that data is. Great question. Thanks. You guys are passionate about that, aren't you? We are. We are. So what about data that is not persisted in a physical database? What comments do you have on managing this metadata and stewardship? Oh, not in a physical database. I guess we're talking about things like e-mail stores, streaming, picking up Twitter feeds, Facebook posts. So really, governance is about managing data with guidance. If you're not trying to manage it, it's going to be a little bit different, in the sense that you're really just observing it and watching it pass by you. That said, though, most organizations, even when they say we're never going to store it, they store it anyway. Right.
So first of all, I would caution that most organizations can't resist the temptation because, after all, storage is free, right? Everybody has that. Right. I can tell you one of my favorite stories. I can count off the top of my head at least six times I've built a system where the requirements said we were going to have an archiving system. And this is how we're going to deal with archiving data that we no longer feel is needed or reasonable. And every single time, once it's all built and delivered, they said, yeah, we're not going to use that. We're just going to keep going. It's so common that everything gets stored. What I would say, though, Shannon, is that you can come back and put in guidance for use. So some of the work that we do is interesting work for different parts of organizations where they do, in fact, look at things that are going on in that area. And almost every part has specific rules and guidance to do it. So even if you're not planning to store it, there is still use. All right. So probably, really, if we're going to fine-tune this definition that I've got on the slide right now, it would be managing use of data with guidance. Or guiding use of data would be another way of thinking about it. And so from that perspective, you may have a steward who doesn't actually have control over the data, but they may actually control the way in which you use it. Again, I'll just say, let's pretend we've got somebody's email that we want to get access to, whether it's an employee or a suspected bad guy or something along those lines. There still are going to be some rules that are going to be in play there. And that's really where the steward should be the expert, should be the person who is most familiar with the legal and, let's go ahead and say, moral aspects of what's going on in there as well. Right. It doesn't have to be about, you know, this is how we're going to structure the data, and this steward's responsibility is to make sure it gets structured in exactly this way.
The governance rule could be that we're going to leave it as it is, structured in this way. But it is still valuable data, and the usage, as you're saying, is the important piece of that guidance at that point. Sure. And the questioner did clarify that ES and D social media would very much be included within that realm. I'll recommend a terrific book called Pulse, by a guy named Douglas Hubbard, that starts to address some of those issues. Doug is the fellow that wrote the book How to Measure Anything. And his Pulse book on big data is fascinating in that area. All right. We have just a couple of minutes left. There was a clarification that came in on a previous question about the role or individual that can make all decisions. The clarification is, you know, you don't like to call someone a data owner. What would you call a role that can make a final decision about anything related to a particular data set? So I would go for the enterprise data executive as being the final arbiter of any questionable things out there. Just as the chief of police is the final arbiter as to whether somebody's going to get charged with a crime in a town, and the chief medical officer is the person who's going to make the final medical decisions about things, the chief risk officer is going to determine the priority in which we're going to address risks in our organizations. If nobody is in charge, then the person making the decision gets that role, and you probably want that person to be qualified and, again, authorized and authoritative in that context. Right. It should be one person. I mean, when you say final arbiter, it really comes down to one person. And it would be, you know, the EDE role or, you know, a chief role. The buck stops here. Right. All right. Well, that brings us right to the half hour. Thank you both for such a great presentation and Q&A session.
And thanks to our attendees, as always, for being so engaged in everything we do and asking such great questions. Just a reminder, I will be sending out a follow-up email within two business days with links to the slides and links to the recording of this session. And I hope everyone has a great day. Peter and Mike, thanks again so much. Thank you as always. Thank you. It was a lot of fun. Cheers.