Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager for DataVersity. We'd like to thank you for joining today's DataVersity Webinar, Data Management versus Data Strategy. It is the latest installment in the monthly webinar series called DataEd Online with Dr. Peter Aiken, brought to you in partnership with Data Blueprint. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Please click the chat icon in the upper right-hand corner for that feature. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DataEd. To answer the most commonly asked questions, as always, we will send a follow-up email to all registrants within two business days, containing links to the slides. And yes, we are recording. We will likewise send a link to the recording of this session, as well as any additional information requested throughout the webinar. Now let me introduce to you our speaker for today, Dr. Peter Aiken. Peter is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide. He has more than 30 years of experience and has received many awards for his outstanding contributions to the profession. Peter is also the founding director of Data Blueprint. He has written dozens of articles and 11 books. The most recent is your data strategy. Peter has experience with more than 500 data management practices in 20 countries and is consistently named as a top data management expert. Some of the most important and largest organizations in the world have sought out his and Data Blueprint's expertise. Peter has spent multi-year immersions with groups such as the U.S.
Department of Defense, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia, and Walmart. And with that, let me turn over everything to Peter to get today's webinar started. Hello and welcome. Hi, Shannon. Thanks. It's great to be with you. It was great seeing you in San Diego last month. For those of you that weren't paying attention, it was one of the few times that Shannon and I were actually occupying the same room at the same time. So we're real happy to do this. And of course, we're real excited about the conference she's got coming up next month online, actually next week online, for the Database Online Conference. We should put a slide in there for that. Anyway, I have a lot of people that have been asking questions about what's the difference between data management and data strategy and what's the relationship between the two of them. So that's what we're going to talk about for the next hour. Then when we finish the presentation, as usual, we'll do the question and answer piece going onward and forward around this. So contextually, I'm going to start off by talking about what are some important properties that data has and sort of why we're in the mess we're in. And the answer has to fall squarely on the academic community. We are doing a very poor job in general of helping people understand the specific properties that data has that cause it to be used in the way it is used or not used in the way it is used in many organizations. There's a lot of confusion about where data belongs. Part of IT, part of the business, we'll talk a little bit about that. Then we'll get into data management. And very briefly, those of you that are familiar with the DMBOK, this will be a repeat for you, but we now have an objective definition of what data is. And we can talk about the parts of data management, various practice areas is what they're officially called.
And we'll talk about specifically why it's important, and then I will tell you guys about the state of the practice. That's usually one of the more interesting highlights on this, because when I talk to people at events like DGIQ and EDW, most people say, well, you know, I really don't want to tell you this, but we don't do this very well. And I say, that's okay, nobody does. And they go, oh. The point is, though, if nobody's doing it well, then for somebody to do this well is actually a really good opportunity for them to attain a sustained competitive advantage. Then we'll switch over to data strategy, because you need to know both. And again, we'll talk about a structural approach to data strategy, because there are some issues with properly implementing a data strategy. One thing I will stress without invoking the name Steve Jobs to, you know, for Lauren Lee, the need for simplicity around strategy is absolutely key. I can't tell you how many companies I've gone into, and they'll hand me a 100-page data strategy. It's like, no, I'm sorry, that just doesn't work. We'll talk about why that is, and we'll talk about some foundational prerequisites that we need to have in order to do this. And then I'll introduce something at the very end, which is what I base my data strategy on. It's a theory called the theory of constraints. And this is how you use the theory of constraints to improve your data in the organization. Finally, some takeaways as we approach the Q&A section on this. I started to write "in action," and then I crossed it out and said, no, really what it should be is "in concert." So there's a bit of a difference there, and I'll stress that, and we'll talk about coordination as an absolutely necessary prerequisite. So let's dive in and start talking about context on this. First of all, there's been a lot of confusion for a long time, and I say a lot and a long time. Let me just place that in context. We've been doing data as a formal discipline for about 150 years.
If you contrast that with the accounting profession, the accounting profession has been around as long as we've been selling beer to each other, for at least 8,000 years. So when you compare the two of those, it does make a difference. We don't have the seasoning, the experience, the ability to have developed what are now called generally accepted accounting practices. It's going to take us some time to get there, and we're working on it, but that's a different issue on that all the way around. The real bit of this as it comes to a head is that there's confusion. Most people think that data is a business problem if you're in IT. The thing that we hear a lot is, well, if they can connect to the server, my job is done. So the thinking in IT is, as long as I can connect you up with your source of data, the business people can take care of the data part really, really well. However, the business also thinks that IT is managing data. So you have this sort of everybody pointing at everybody else and saying, it's your job, no, it's your job. Now, from the business perspective, there is a CIO in most organizations, Chief Information Officer. What else would that person be doing? Well, if you want to get into a little detail, there's about 1,000 things that the CIO does that have nothing to do with data, and they have enormous jobs, enormous responsibility, but data is just a fraction of what they're doing, and that has led us to the situation that we're in, which is that data has fallen into this enormous chasm between IT and the business. And our collective goal here at Data Blueprint is to repair that chasm, to put people in a situation where they can reestablish the good partnerships that should exist between IT and the business and therefore start leveraging their data in a more productive fashion. Now, I'm going to move on here a little bit and talk about two things about data. These are points that most people are really not aware of.
The first one is that it's really important in the data world to separate the wheat from the chaff. And the reason for that is two-fold. First of all, everybody, I think, will agree that if your data is better organized, it is more valuable. If you have a knowledge worker that's looking for a piece of data and it takes them any time to find it beyond zero, then that is time that your organization is spending where it shouldn't really be spending that money. You want your knowledge workers to have everything they have at their fingertips so they can be as productive as possible for you. So the corollary to that is quite clear. Poor data management practices cost organizations a lot of time, a lot of money, and a lot of effort. So better organized data is just a generally good thing for organizations, but as I said, we don't have these generally accepted data management practices that we need to put in place the way the accounting profession has put them in place. But there's something worse about data as well, and that is that minimally in your organization, 80% of the data that's there is rot. Now, that usually gets me tossed out. People say, why would I invite somebody here if you're going to make insulting remarks about my data? Well, rot's actually an acronym that stands for data that is redundant, obsolete, or trivial. And actually, my wife corrected me on that. She said, no, it's actually riot. It's redundant, incomplete, obsolete, or trivial, but I've already put rot out there. And if I call your data a riot, I'm going to be in even more trouble than I am for just calling your data rot. The only argument that I get over this figure is that the number is much higher than 80%. So I've had some organizations telling me, in particular, that 95% to 99% of their data is rot. Redundant, obsolete, or trivial, you can see, even if you get the right piece of data, is it the right age? Did it come from the right system? 
Is it the ultimate source of truth, as we're looking to answer the business questions that we have? So the question that comes up is, which data in your organization do you need to eliminate? And the answer is, well, generally quite a lot of it, but don't go saying Peter said throw away all your data. That's not what I'm saying. So it sounds, on one hand, like data is not much fun, like we don't have a really good asset there. But when we look at it from another perspective, data is the most powerful, underutilized, and poorly managed asset that we have in our organizations. It's the only asset that we have that is non-depletable. You cannot use it up, unlike your fiscal assets, which can be used up quite easily. Unlike your people assets, who can be worked to death literally, the data assets are not degradable. They do not degrade over time. I mentioned to Shannon earlier, we've just moved Data Blueprint into its new world headquarters. We just changed offices. We like to call it a highfalutin sort of thing there. And this building will degrade over time. We know pretty much it's probably not going to do so in the time we're in here, but that's what happens to those types of assets. The third interesting characteristic about data is it is a durable asset from a fiscal perspective. Now when I say fiscal, I mean, people consider durable assets to be assets that you invest in so that you can make more money on them. Or if we're in a nonprofit world, or some of you guys are from state and federal government, we also have a mission that we talk about instead of actually just making money. So these three characteristics, non-depletable, non-degrading, and durable, make data a unique strategic asset. And when you compare them to other assets, data assets really come out quite big winners all the way around. Now I was recently over in the Middle East, and they were real excited about this. They think data is the new oil.
And even just as late as yesterday, I was scanning, I think it was The Economist, there was another article about data being the new oil. I cringe when I hear this, because if you think about data as oil, you think about it in the same way that we think about Ethernet service, or gasoline, or other commodity-based products. This is not something that you plug into the wall and get the right stuff from, because, as I told you, just on a random walk you've only got a 20% chance of getting the right data for your situation anyway. So thinking about data as the new oil, and if you Google this term, the last time I looked it had over 5 million hits, is wrong. Because we don't think about what happens to oil when we put it in our cars. We plug it in, we use it up, and that's it. We're done with it. Data's best use is not when it's used, but when it's reused. And if we don't design that reuse into the data, we have no ability to actually do something with it. If you just take a quick deviation here, this is the major problem that we have with data science today. While we've all been working very happily on trying to get data science to be a better career field, it's too vague at this point. And the way most data scientists work in organizations is they go, they grab a bunch of data, they sit down, they play with it for a while, they make it do what they want it to, and it comes out and they've solved exactly one problem for one application. But they can't take those lessons that they've learned about correcting the data, about how it's used, all the metadata they now understand, and put that back into production where it's going to be of much more value. So okay, Peter, you don't like data as the new oil. We got that. What do you want to call it? Well, I just changed one letter. Instead of calling it the new oil, let's call it the new soil. And the new soil is a better metaphor for it because we know two things about soil.
Excellent, back up. If you're a gardener, you know two things about soil. One, you don't just take your seeds and go randomly sprinkle them about the yard and hope that you'll get hand-over tomatoes growing. It doesn't work that way. You prepare the plot of ground. It has to have a good amount of cultivation. You've got to have the right soil, the right people. And again, just sprinkling seeds on the ground probably won't lead you to be able to, you know, get the zucchinis that you're looking for on that. The other part of this is you don't plant things on Monday and expect to eat them on Friday. It takes a season to grow them. And that season of maturing is a process that data also needs to go through. You can't just turn around and buy some technology, slap it on top of your data, and get it to go. So I had a very good quote from Micheline Casey in the data strategy book that Shannon referenced, where Micheline had gone in to talk to a CIO. And he said, well, Micheline, last year, you told me I needed to organize my data better. So I went and bought a data warehouse. It's no better. I've got now really bad data in the data warehouse. That's not good. Again, wrong preparation, wrong growing time. We need to do something better. On the other hand, if you want to call it bacon and get it to work that way, that's okay too. We're all fine with that. So with data, thinking a little bit differently, as our sole non-depletable, non-degrading, durable strategic asset, data deserves its own strategy. Having a strategy around data is critically important because if you don't, all of your knowledge workers will do what they think is best. There's nothing wrong with your knowledge workers doing what they do best, but have any of your knowledge workers had any explicit training and education around data? The answer I can tell you is probably not, with a couple of really good exceptions.
And so having a strategy for your data means all the knowledge workers can be working towards the same goal. Another piece that data deserves is it deserves attention on par with other organizational assets. We do a lot of work for the federal government and parts of the military. And I can tell you that the military knows where large things are. There are guns, there are tanks, there are ships, all sorts of things that they have, and they know where they are. We do a lot of work for trucking companies. And the trucking companies know where their trucks are. These are assets. They know where their drivers are. If you're in a medical profession, you know where your doctors are. They are a primary asset that you have in there. You have people that manage the doctors. There's somebody that keeps track of the trucks and tanks. Virtually nobody in these organizations is paying attention to where your data sources are. And that requires, again, somebody to be doing it. Of course, if it's everybody's job, it's nobody's job. And, of course, that's a problem as well. Final characteristic on this is that almost nobody has data in the shape they'd like to have it. And consequently, it's not in as good a shape. Again, I told you at DGIQ, where we just got back from last month, where people would come up to me in the hallways and go, well, you know, my data's not really very... but if you work at it, you can get it to be cleaner. And that's what we're trying to do. So I say professional administration is required to make up for past neglect. Now, let me just show you a little bit here. This is a piece from the Commonwealth of Virginia. It's a piece of legislation that was done by our previous governor, Terry McAuliffe. And what he did is he said, my goodness, the governor of the state of Virginia says it's important for Virginia as a state to increase the use of shared data and analytics among Virginia agencies. He's got a comprehensive plan doing all sorts of good things. 
He also says we need to make sure that we balance those needs with individuals' privacy interests. Yes, absolutely. We don't want the government doing anything with our data that they shouldn't be doing, but if it's not organized, it's really hard to tell what's being done. If you can't manage it, how are you going to secure it? And finally here, he says in his executive order, he's going to generate a common data sharing lexicon and terminology to eliminate the friction and confusion among the various state agencies. Well, it's fine for us to say Virginia doesn't have very good, by the way, you know, this one is unsigned. He actually did sign the one. I just don't have a copy of the signed one on this. But it's great to say, you know, Virginia doesn't have it. The state government, what do they know? Well, my measurements show that state governments and governments in general are actually slightly ahead of the private sector. So those of you that are in the private sector don't go feeling like you're in really bad shape. You're as good as the government. The government's a little bit better than you are in this case. And I love the way they characterize the last part of this. Bad data is friction in your business processes. When somebody is trying to do their job and the data doesn't respond the way they want it to, it's very, very difficult for them to come along and continue to do their process. It's an interrupt the same way as your stupid email comes up and goes, you've got mail when you're in the middle of big concentration pieces. This was important enough for the Commonwealth of Virginia to say, we need to do this executive order. You can see it was done in 2016. And I can tell you here two years later, the Commonwealth of Virginia is now recruiting for a chief data officer. By the way, that's a job offer. If anybody is interested, they need to get in touch with me right away. And I can put them in touch with the people who are interested in that. 
I have no idea what the process is, but I know that it's going to be an appointment. So, you know, there's probably a req out there somewhere for its own mess. So, what this means is that most organizations, whether they are private sector or public sector, have little idea what data they have as assets. They don't know where it is, and they don't know what their knowledge workers are doing with it. And that's a situation that puts all of our organizations at quite a bit of risk. Now, I was very fortunate to go to India last year, and this phrase at the top that's bouncing at you, quality engineering architecture work products do not happen accidentally, is from a sign that I saw over the cash register at a tea farm in India. So, even though I'm on vacation in a foreign country where they don't speak English, over the cash register they had a very, very poignant piece. You can't do the kind of work that you need to on an engineering scale by accident. It does not happen. Data management works pretty well at the work group level. In fact, it is a defining characteristic of a work group. A group of people who exchange information is one of the definitions of what a work group does. The problem is who's giving guidance to all the work groups, and the answer is for most organizations, they're making it up as they go along. They're well-intentioned. They have absolutely the organization's best interests in mind, but remember, we've never taught them anything about this. And if we've never taught them anything about this, how can we expect that they're going to be doing the right thing? The analogy that I like to use on this is let's pretend that you were doing your job every day, and somebody came along and said, hey, I've got something that will make your job a little bit easier. You say, oh, okay, what is that? They say, well, it's shoes. You say, shoes? What do they have to do with it? Well, you know, you're walking around now.
You've got some calluses on your feet, but if you use shoes, you can go a little further without your feet hurting. You probably can avoid, sorry, get into some maybe riskier areas. If there's a piece of glass, you would go around it if you're barefooted, but if you have shoes on, you might step on the glass and risk cutting your foot. This is an issue, and if we don't have that ability to talk about how data can help in a situation, then people don't tend to understand what it is they're supposed to do. Without guidance, what are the chances that all the work groups are pulling us towards the same objectives? The answer is there may be some luck involved in there, but it's generally not going to happen by accident. Consider the time spent when attempting to mediate informal practice. I've got a word missing on that slide, but they mediate informal practices on there. The data chaff becomes the sand, and it prevents smooth interoperation and exchanges between your organization and other organizations, and between parts of your organization. It's a really good thing. Data is sand. Okay, maybe. Well, I said soil, so I think soil is still a better type of thing. Another thing that we talk about, too, is this: let's just say that everybody in the organization spent five minutes a day looking for a piece of data that they should have had at their fingertips. If you add up all those five minutes a day, that's a lot, and it starts to add up very much. I did a justification for a metadata repository for one organization on the sole basis that we saved everybody in IT an hour a year. One hour per year. It's a pretty low threshold, and we actually did come up with numbers that were very good on that. So organizations and individuals lack the knowledge. They lack the skills. They lack the data management know-how on this. And data management is the how, and the strategy is the why. So that's what we're going to get into now.
Let's go a little bit further and take a look at what data management is. Now, as I mentioned before, data management was really not well understood until fairly recently. Many people thought it was a librarian function, and I will be the absolute first person to tell you that we have neglected our colleagues in the library science areas who know a lot about this and have in fact been studying it since the time of the Egyptians. That's not formal data management practices, but they've been still working on things. So we're not librarians over in the corner, but that's how some people think of us. And then, of course, there's the people who think, if you just model everything, it'll go bad. This is a Far Side cartoon, if you remember those, and obviously if you label the dog and the cat, they're still not going to be able to speak to you on that. But hey, let's just label everything, right? So what is the purpose of data management? Or this is an actual old Microsoft commercial. You can see it's 2005, and I'm not sure what Microsoft was trying to tell us here, but it doesn't really paint us in a good light when we're doing data modeling in this fashion. Although I will say I'm very proud of the fact that the new office has wonderful windows, and we now look like Chicago O'Hare Airport with the data models up there on the windows with a great little piece of advertising there. So what is data management? Let's take a look. This is the Wikipedia definition here. It's not a bad place to start. It gives an organizational definition from DAMA. It's the development and execution of architectures, policies, practices, and procedures that properly manage the full data lifecycle needs of the enterprise. Okay, that's nice. I'm not sure what that means is really usually the first response that we get. The broad definition is that it has no real technical content in this. We're talking about a business set of practices, and that's one of the first definitions that we want to talk about.
The reason this is really problematic, though, is because when we teach people in schools how to do this, they usually get one course on how to build a database. Now, if there's a skill we do not need any more of on planet Earth, it is how to build more databases. Of course, you've probably heard already that if the only tool you know how to use is a hammer, every problem tends to look like a nail. And so it's no wonder that when we send these kids out of school, they look around at the business problem, and instead of integrating two databases, they come back and say, oh, no, no, no, we'll just build you a brand new one. Now, if the decision makers don't know, and if the students don't know, how are we ever going to make progress here? Even worse though, this is an actual problem that we have students solve in class today. I don't mean we as Data Blueprint, but many colleges and universities, as I've traveled the United States and the rest of the world, still are using this textbook. So the example is calculate the access time for a disk with a sector size of this and an advertised seek time and the disk rotates here and you can do some math. You know what the problem with this is? Two-fold. One, the disks no longer rotate. When you have a faculty member that shows up and says, I want you to calculate a disk rotation time, and the students haven't actually done that and they don't know what a disk is, immediately the class turns off, because they say, oh my gosh, the professor here has no clue what they're talking about. This is not what we should be teaching people. We have figured out how to do this a lot better and it's not about calculating these technical things. That can be automated. Instead, data has a much broader focus. Data is broader than software architecture, which, as you can see here, focuses in on a program or a set of programs. We call it a family of programs sometimes. And even a database exercise is only on a set of them.
So you may have somebody doing the gray data and somebody else doing the green and somebody else doing the orange. Well, that's great. But who's going to tell you if the green database is using some of the data that's in the gray database? And if you don't have coordination at that level, nothing is going to happen. Some, many DBAs will actually attempt to do this from a bottom-up perspective. But by the time you squeeze this into a project, there's no time and money to do any project work. So we just live with the legacy and move onwards. Excuse me. The analysis scope then is very broad in these data areas. And it doesn't really focus on the problems that are caused by data exchange or interface. Again, it doesn't matter whether we are interfacing internally or whether we're cooperating in a much larger fashion around all this. And the architecture goals have to be more strategic than they are operational. So our data management focus here is on an organization-wide use of it. The requirement is to understand. And I put that in quotes because the understanding part really means both by machines as well as by humans. So we need to make sure that everybody understands this. It doesn't do any good. If two humans agree and the machine disagrees, the machine is going to win in the long run. It also involves looking into the future, trying to find what are the current as well as the future needs of the organization. I don't know about you guys. I can't predict the future. If I could, I would definitely be doing something else. But at least for the moment, we can't predict the future. But what we can do is say, is it possible that at some point in the future this requirement might be important to incorporate into our systems? And if the answer to that is yes, data managers who are qualified in this area can produce models that are more adaptable or flexible and less risky to the organization in order to do that. 
And finally, I've already said many organizations have this thing of death by a thousand cuts, where we want to make things more efficient and effective even if it's a tiny bit. I had one organization that had a query that ran more than a billion times in a day. Well, if I save a second out of a billion times, that adds up to a lot of seconds, a lot of machine time, and in fact a lot of hardware that was redundant and did not need to be worked in there. We try to do data in a way that will leverage data to support other organizational activities. And we'll spend just a minute here on this thing in the bottom corner because it's a very good illustration of what we mean by leveraging data. We have some data, and we also have some technologies that we apply now. The first version of this is going to be a lever. We'll just pretend the thing on the left, the circles, by the way, for those of you that are young, that's the way we used to represent a hard disk in our flow charts. And of course you wouldn't know that because the hard disks don't rotate anymore, but nevertheless it's still a challenge around that whole thing. So to move this thing, if we're going to move it, we have a technology. In fact there's two of them. There's a lever, which is the lever that people are pulling on, and a fulcrum, which is the piece that you put under the lever to give you the leverage. You could move that thing by sticking the lever underneath it and lifting up. If you like hernias and back injuries, it's a great way to go ruin the rest of your life. But instead what we have developed is a process. The process is people try to apply pressure to the lever by pulling it down, which is a lot easier than trying to lift it up. And the technology of the fulcrum and the lever combined will help us to move this. So we have people, process, and technologies in order to do this. Notice I've also incorporated ROT on this diagram.
If you reduce the ROT that you have, the need for leverage is considerably less. So let's take a different definition of data management. It came from an article we published in 2007. If you're interested in the article, it talks about some of these numbers that I've been giving you here around all this. It's understanding the current and future data needs of an organization and making that data effective and efficient to support the business activities. That's generally something people can understand. So I'm going to just go back for a minute and contrast that with the Wikipedia definition. Now, don't worry, we could change Wikipedia, but we have to get agreement on that. So again, the development and execution of architectures, policies, procedures, and practices that properly manage the full data lifecycle of an enterprise, or understanding the current and future needs of an enterprise and making that data effective and efficient in supporting the business activities. I like the latter definition, not because I came up with it. I actually got it from Burt Parker, who's the second author or third author on this article. But I think it's a much better definition. So again, we don't teach knowledge workers about data. And yet what's the definition of a knowledge worker? Somebody who uses data. That's a huge source of untapped productivity in our organizations. And what do we teach them about this? We give them one course on how to build a new database. And the problem is that most people don't understand the role that data plays in here. So I already mentioned the hammer and the nail problem as well. Shannon mentioned at the top, we're doing this in conjunction with DAMA International. I'm the current president of DAMA International. And this is one of our proudest accomplishments. Back in 2009, we published the very first definition of what it means to do data management. Now I will say that this piece is the first step towards a good definition.
It is not the ultimate definition. In particular, this diagram is missing two things that are critically important to data people: optionality and dependencies. For example, somebody could look at this chart and say, oh, your chart says I must do data warehousing. No, actually, the chart says data warehousing is a part of data management. And if it's useful, you should apply it. But it's not obvious from this chart. So we've been racking our brains trying to come up with a better way of showing that it's probably a good idea to start off with governance. At least we got that part in the center. And then maybe working our way out. In fact, when you look at data management in general, trying to do this by silos is really a bad idea. What we find works better is taking at least three of these pieces together in order to come up with any specific business solution. So again, many organizations will just buy a tool to do data quality, right? Well, if you don't have governance around that tool, it's very unlikely that the tool is going to make a significant difference to the business. Now, I mentioned this was 2009. We've updated that. This is now the new version. The DMBOK version two, which is out and available for sale on amazon.com, plug for DAMA and Amazon and Technics Publications, our wonderful publishing partner, Steve Hoberman, there. And the difference here, you can see we simplified the chart slightly. And the one thing we added was another pie wedge, thinking that data integration and interoperability was important enough to be incorporated into all of this. So when I was going to school, I was taught that data was database design and operation. We got a little smarter than that. We moved on from that. And maybe it has to do with requirements analysis and data modeling and things like that. And in fact, during the '90s, we actually got up to saying stewardship and data usage and things like that. And that brings us to about 2000. 
We've incorporated a lot more into this. All of these are expanding the scope of data management. But right now, the standard college and university curriculum is woefully inadequate for preparing our knowledge workers for the 21st century, which is a huge, huge problem. Let's go a little further now and talk data strategy in here. So data strategy, again, we're going to talk specifically about structural pieces. Now, we do a lot of consulting work around data strategy and helping the various organizations that we work with do data strategy. And the top three data strategies that I get are, we're going to do data science, we're going to do big data, or we're going to do analytics. Well, the problem is those three terms are meaningless. And I'm sorry if I blew your speakers out by doing that, but it irritates me when people use words that mean anything, which means they really mean nothing. Data science is a vacuous term. Eric Siegel, the guy that's credited with inventing it, said no. Calling somebody a data scientist is like calling somebody a book librarian. It doesn't mean anything. Calling something big data is ridiculous. It's just so silly. There are things you can do with data that are new, but calling them big, you might as well call them green, or blue, or soft. It just doesn't make any sense at all. And nobody knows what the word analytics means. It means data analysis. So if your strategy is to do data analysis, I wonder what you were doing before. Other top strategies that I've encountered in the last 30 years are, SAP is going to be our platform. Now, SAP is a fine platform. Don't get me wrong. I'm not picking on SAP, Microsoft, Google, or Amazon Web Services, but they are technology pieces. And remember, going back to our people, process, and technology, that only fills the technology piece. And each of these platforms has its own advantages and disadvantages in there, but none of them in and of themselves is a strategy. 
Another really important part of strategy is that most people think strategy works like this. We have an organizational strategy. We have an IT strategy, and therefore the data strategy comes out of it as well. And I'm sorry, but again, this is so incorrect that it is just cringe-worthy. The proper way to do this is to look at IT strategy as being parallel to your data strategy. And the data strategy actually should guide your IT. Yes, of course, there's a good feedback loop in there in order to do this. But nine out of 10 organizations that we work with did not start off by doing this originally, and that becomes a huge, huge problem for them. So let's talk about, for a minute, just what is strategy? There's a great talk out there, a TED talk out there. I've got the URL up there on the screen for you. Simon Sinek does a really good job of saying, you know, we're pretty good as human beings at describing what we do. We're less competent at describing how we do it. Try to do a flowchart for making a peanut butter and jelly. Right? And we suck at doing the why. We just, it's not something that's part of it. Imagine, again, it's not what you do, but it's why you do it. All right? Martin Luther King's speech, I have a dream speech. Right? You think we would have had the same impact if he said, ladies and gentlemen, I have a plan? No. No, it's the passion. That's what it is. Strategy tells you why you do things. Now, my favorite definition of strategy is, one, that comes from the military. It's a pattern in a stream of decisions. That is something I can use and operationalize. And let me give you three very quick examples on how that works. First one is Napoleon. And you're thinking, wait a minute, I came here to get some technical advice. Why the heck are we talking about Napoleon? Well, this is absolutely literally textbook definition of strategy. So the little story here is that Napoleon was facing two armies, the red one and the black one. 
The red one is the British and the black one is the Prussians. And they were bigger than he was. So he had to win. How do you win when the competition is bigger than you? The answer, of course, we all know, is divide and conquer. So let's take a look at the battlefield that Napoleon was looking at. First of all, you've heard the phrase an army marches on its stomach. What that means is if the army has to run away or change anything, they're going to run towards their food, not away from it. Just a human thing. So the British army, Napoleon noticed, was supplied out of Ostend on the coast of Belgium up there. And the Prussian army was supplied out of Liège, which meant that if Napoleon was able to hit them in exactly the right place, and it's not just hit them, but hit them just so, think of it like a pool shot, right? You're trying to get the two balls to jump into the corner pockets. If I hit them right, they will retreat. And if they retreat, I know that the red are going to go to Ostend and the black are going to go to Liège. Now that's only part one of the strategy. This is a very complicated strategy. You're thinking to yourself, how complicated can this be? Well, remember, we're sitting here in nice comfort. The soldiers out there on the battlefield were going to die. They had people shooting at them. So they had to remember a couple of things. One, hit them exactly at that position. Again, you can see the pool shot that you're trying to get there. I need to hit them so that both armies will move back in the proper direction, and when they move back, they will separate. When they separate, part two of the strategy comes into play. First, go after the Prussians, and then go after the British. If a soldier on the battlefield gets that wrong and tries to go after the British, that soldier is going to be killed because that soldier will be the only one who's going after the British, whereas everybody else is going after the Prussians. 
So again, our strategy: hit them exactly right there so that they split apart, and then first we go after the Prussians and then second we go after the British. That is a complicated strategy for a soldier under live ammunition conditions. Take another definition of strategy. It's a pretty simple one. Wayne Gretzky, everybody knows Wayne Gretzky. Wayne Gretzky's definition of strategy was very simple. He skates to where he thinks the puck will be. You might say, well, that's not so hard. It's not so hard, although being Wayne Gretzky was clearly not something everybody else could do, and they all tried. So Gretzky said, look, I could follow the puck around, but the puck is really fast and it's really small, and I'm a person and I've got to chase it. If I go to where I think the puck will be, the puck might show up there sometimes, and I can become the greatest scorer in hockey history. Finally, I'm going to give you Walmart's business strategy. Again, this is not a secret. Walmart had this for many, many years. It's very simple. Most of you know it already. Every day, low price. Now, why is that important? Well, everybody at Walmart understands this mantra. If you're a supplier, a contractor, an associate, a leader, even somebody who writes about Walmart, you understand that every decision is looked at from the idea of, if I solve this for the lowest price to the customer, I probably won't get in trouble with Walmart. So three very simple definitions of strategy. Anybody that shows you a 100-page strategy, there is not a chance that anybody is going to pay any attention to it at all. Again, one of my rules is I take the organizational strategy, and I ask people how many of them even know what the organizational strategy is, but I can tell you if the data strategy weighs more than the organizational strategy, it ain't going to work. Let me give you a very specific example of this. 
In most organizations, when they talk about strategy, they don't realize how simple strategy needs to be. Michael Porter did a great job of articulating this in the 90s when he did his pioneering work at Harvard on this. There are only two dimensions of strategy. One is to improve your existing operations. The second one is to be innovative. There are no others. So what that means is if we have no formal strategy, any answer might be correct. That's not really a good way to run your organization, so we'll skip that particular alternative. But let's hold out Walmart, as most people do, as the paragon of organizational efficiency and effectiveness. Walmart has got some great people who have been able to optimize over the years and know how to make things efficient and effective. That's wonderful, and it works very, very well for Walmart. We'll pretend Apple is an innovative company as well. What they do up in Apple is they try to create innovation. By the way, what was one of Apple's great innovations? Well, they have one thing that they liked about their phones. There was only one button. If you hit one button, you couldn't hit the wrong one. But what was Apple's big strategy for years and years? Get down to zero buttons. Again, you can see strategy can be really, really hard. Now, I want you to just imagine that you take the people at Apple who are really innovative and trying to get down to zero buttons and tell them, no, no, no, you've got to be efficient and effective. Their heads will explode. Same thing if you take the Walmart people who are really good at being efficient and effective and say, now I want you to be innovative and they go, what? I don't get that. Of course, the other part of it I get is that many people say, well, we're a really smart organization. We can do both of these things at the same time. And once again, I say, no, that's really not a good idea. 
Instead, what I'd really prefer to do is to say, take the savings that you get from increased efficiency and effectiveness and run them back into your organization to fund the innovation pieces that you do. I'm just going to go back over that slide real quick here. Hang on. There we go. Again, don't try to do both. Don't do neither. The answer's somewhere in the middle, but pick one. Take the money that you save from effectiveness and efficiency and use that to fund your innovation pieces. So hopefully you get the sense now. Strategy is a pattern in a stream of decisions. And this is critical because most organizations don't quite get the difference from what's going on in IT. If we try to build data as part of an IT project, IT is very, very good at implementing new capabilities. They can come up with new capabilities in a very, very cost effective fashion. IT has gotten really good at that. We've practiced a lot, we've learned, but data is not the same thing. And this is something that we have to take into account. Data evolves over time. It is not a matter of creation. Data evolution needs to be separated from, made external to, and precede the system development activities. If you don't do this, you can only get so far in your project. Data, of course, is not a project. It has to be managed at a programmatic level. We're doing one more piece on this thing here. Again, the key here is that if you understand foundational practices, you can then build on them. And I'm showing you a house. This is actually my house in Montpelier, Virginia. I'm what's called a horse husband, which meant that when I was going to invite my future wife to come share the rest of my life with me, right? You know, all that sort of good stuff. I don't know, Cassie's not listening at this point. The barn was part of the deal. Now, the bank, interestingly enough, came along and said, great, we're going to give you some money to build this. 
And they gave me exactly this much money. What did they do? Well, the bank's pretty smart. They said, before we let you build the barn, we want to make sure that your foundation is of sufficient strength. If you do not have a good foundation, you cannot have a good barn. So in Hanover County, Virginia, where I live, somebody came out from the county and inspected this and said, yes, you have passed your foundation inspection. Now you can proceed with the project. Think about it the opposite way. If I built the barn and put it on a poor foundation, from the bank's perspective, I'm probably going to spend more money on vet bills than I am on paying them back on their loan. They wanted their loan to get paid back, so it's in their interest and my interest to make sure the barn is ready to do this. However, there is no IT equivalent of this. And this is a huge, huge problem. All of a sudden, we somehow get to Maslow's hierarchy of needs. Most people remember Maslow from high school. Maslow's big insight was if our food, clothing, and shelter needs are unmet, then we will never be safe. And if we are never safe, then we will never be part of something that is bigger. If we're not part of something that's bigger than ourselves, then we will never have self-esteem. That is the definition of self-esteem. You recognize yourself among parts of the larger piece. And without that we will never get to what everybody wants to get to, which is self-actualization. Well, they call this flow now in terms of, you know, pop psychology. But it all goes back to Maslow on this. And it turns out data is exactly the same thing. There's all this wonderful stuff that people want to do with data, master data management, data mining. If I was going to update this slide, which I purposely haven't done, because these buzzwords change so fast, I would add two terms to this right now, Bitcoin and blockchain. Woo-hoo! They're going to solve all our problems today, right? 
You just put blockchain in one of your presentations, and you've got a million people showing up for it. Well, blockchain and MDM and big data, these are all good technologies. But remember, we need people, process, and technologies in order to make this work correctly. And we have the same exact situation that we had with my barn. There are five data management foundational practices that must be practiced at a good level in order to do this. Now, I'll show you what the levels are in just a minute. But these are capabilities. These capabilities are important for your organization to learn how to do. Remember, you've got dozens, hundreds, thousands of knowledge workers out there who are trying to manage data on their own without guidance. And they're doing the best job they can, and it's working well at the workgroup level. But as far as your organization goes, everybody pretty much isn't very good at this. Now, I always get calls at Data Blueprint, though, where people will say to us, well, Peter, I heard you say that, but I need it done by Friday. Can you do it any faster for me? And the answer is yes, we can speed it up. But if we speed it up, it will take longer, it will cost more, it will deliver less, and it will present greater risk to your organization than if you did it the proper way, which is build the foundation first and then put things on top of that foundation. Last piece of all this, we need to understand, too, the structure around this. So those five areas have now turned into five little balls. Thank you, CMMI Institute. This is research that we funded out of the Defense Department a long time ago that says that these foundational practices around data management are managing the data coherently, making sure that we have a professional asset class of managers that can do this well, and making sure our data is fit for purpose, using the right technology and the right processes in order to do this. Each of these can be evaluated on a five-point scale. 
I'm going to give it to you very quickly. We did this in another webinar. Remember, these things are all archived, so you can go back and look it up. Excuse me. You get one point for having a pulse. I'm not sure I had a pulse there for just a second, but I'm all right now. You get two points if what you do is repeatable. So somebody who's good at it, maybe we hand that particular work over to Lisa or John. If there's any documentation at all, you get three points. If there's any measurement around the processes that are done at all, you get four points. And if you improve your existing piece, you get five points. I just gave you a whole webinar in two seconds there. You can go back and look at that other one or ask some questions later on. But what this implies is that our architecture for data management is dependent on the weakest link. A foundation can only be as strong as the weakest link. I showed you all those bricks on my barn. Well, if one of those bricks is made of marshmallow, first rain, it's going to wash away, and we're not going to have anything that's very useful. Let me give you a more practical example here, though. Say we've evaluated an organization's practices at level three for most of those, which means they are not great, but they're good. They have documented processes around data quality, governance, platform, and operations. But their strategy is at a low level, a one. What this means is that they can put a million or a billion dollars into data quality, and nothing will get better without that strategy. Your foundation is only as strong as the weakest link. And this is really what data strategy is all about. This is what I mentioned earlier, the theory of constraints. Eliyahu Goldratt wrote this book a long time ago. We use it in classes still today, even though it's old. What it says is, just what I said before, the practice is only as strong as the weakest link. 
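The scoring example above can be sketched in a few lines of Python. This is only an illustration of the weakest-link idea under stated assumptions: the level names in `Maturity`, the helper functions, and the dictionary keys are hypothetical labels made up here, not terminology from the CMMI Institute material or the slides. Only the scores (strategy at one, the other four practices at three) come from the talk.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """Five-point scale as described in the talk; the names are illustrative."""
    HAS_A_PULSE = 1   # the practice exists at all
    REPEATABLE = 2    # someone can do it again
    DOCUMENTED = 3    # there is some documentation
    MEASURED = 4      # there is some measurement of the process
    IMPROVING = 5     # the existing practice is being improved


def foundation_strength(scores):
    """The foundation is only as strong as its lowest-scoring practice."""
    return min(scores.values())


def weakest_link(scores):
    """Return the (practice, level) pair that constrains everything else."""
    return min(scores.items(), key=lambda item: item[1])


# The example from the talk: four practices at level three, strategy at one.
scores = {
    "strategy": Maturity.HAS_A_PULSE,
    "quality": Maturity.DOCUMENTED,
    "governance": Maturity.DOCUMENTED,
    "platform": Maturity.DOCUMENTED,
    "operations": Maturity.DOCUMENTED,
}

practice, level = weakest_link(scores)
print(f"Constraint: {practice} at level {int(level)}")

# Pouring money into data quality alone does not raise the foundation.
after_spend = dict(scores, quality=Maturity.IMPROVING)
print(int(foundation_strength(scores)), int(foundation_strength(after_spend)))
```

Taking the minimum rather than an average is the design choice that matters here: an average would reward the investment in data quality, while the minimum stays at one until the strategy practice itself improves.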
So if you look in the left-hand corner of this diagram, the person there at number 10, right, work in progress, has a lot of work in progress, and two people behind are just waiting around for something to do, because it's a sequential process. The theory of constraints is very simple. It says that your entire chain of things that are happening, some people call it a value chain, can only be as strong as the weakest link, as I showed you on the previous slide. But worse than that, if I don't fix that link, then anything else that I'm doing is going to be of lesser value. So the theory of constraints operates very straightforwardly, like this. Oh, gosh. Yeah, and if you remember this, that's why I put the little clues on there. There's actually a movie on the theory of constraints. You can go buy it. That's Alex Rogo from our history here, and he worked at UniCo. Okay, many people remember that. Anyway, theory of constraints. Identify what is keeping you from using data in your organization in the best, most efficient format. Number two, when I say exploit the constraint, what that really means is find out if you can quickly and easily fix the problem. We call it low-hanging fruit in many instances. Then look at everything else that's going on when you made that change and see if anything else is affected there. If it is, we still need to elevate that constraint. If we haven't fixed it, we need to repeat the process until we go back and fix it. So what does this mean from a data perspective? Well, let's twirl it for a minute there and then come back. There we go. Again, what you're doing is improving your data. So your analysis of your organization says that this thing is the most important thing blocking your organization from successfully employing data in support of the organizational strategy. Try to fix it without restructuring. Correct it operationally, in other words, is what we use as a shorthand on that. 
Make sure that all your data evolution activities are focusing on one specific objective. Again, I can't tell you how many organizations I go into where they've got master data management going over here, and they've got data quality going over here, and they've got something else going on over here, and oh, by the way, let's not forget document and content management. Man, it's too much. People can't do that. Elevate that constraint. Make sure that you restructure. If you can't correct it operationally, then you have to correct it from a structure perspective, and keep going until the data better supports the strategy. If we don't get that cycle going in there, then nothing else can happen, and now you can start to get into what we like to do in data strategy, which is get it down to a cycle where we do lather, rinse, and repeat. So we're almost at the end here. Let's take a look at how some of these things work together. And I mentioned at the very beginning, I started to use the words in action, but then I'm going to use the words in concert. So hopefully you've seen already that business decision makers are simply not knowledgeable about data. It's not their fault. We haven't taught them. Unfortunately, technology people are also not data knowledgeable. Even those that are going through data science programs are not taught anything about data management. They are taught all kinds of algorithms, but they are not taught about data. So we have a combination of business decision makers and technical decision makers making bad data decisions. And those bad data decisions result in poor treatment of organizational assets and poor quality data, which leads to poor organizational outcomes. But of course the only outcome that anybody sees is the organizational outcome. They don't understand that data is death by a thousand cuts. I can take any IT budget and cut 20 to 40% out of the total IT budget by improving your data management practices. I say I. 
I say that meaning we've got a whole team here that can do that sort of thing. Let's take a look at strategy in context then. If we look at what's going on, we have an organizational strategy and the data should be supporting that organizational strategy. If it's not supporting the organizational strategy, management has a reasonable question, what are we doing then? So the data strategy is there, and data strategy tells data management what the data assets need to do to support the organizational strategy. And data management then gives us some feedback and says how well that is occurring. Now remember, in Peter's world, IT projects are literally subordinated to data management activities. There aren't many organizations that are willing to take that step. It's a huge change in organizations, but it is a very necessary one. And then we can fill out the rest of this by adding a couple of feedback loops and other things on here to take a look at it. I wouldn't show anybody the whole chart. I'd just show them the data strategy and organizational strategy piece in there. So we're getting close to the end. Four minutes left here and I want to go through a couple of takeaways. Again, this discipline, data management, data strategy, has not had 8,000 years to formalize its practices. In the accounting world, we talk about generally accepted accounting principles. And let me give you a quick anecdote here on how important that is. Many of you remember the Equifax data breach last year where they lost all of those stupid little questions that we answer when we go sign on to websites. By the way, that was one of Equifax's main lines of business. All of those questions are now in the hands of hackers, so they know who your high school teacher was and what your favorite color is and all those sorts of things. Change those things right away if you haven't done that. Generally accepted practices are important. 
And it turns out when Equifax got attention from their data breach, the news media also uncovered the fact that they were one of about 20% of the companies in the United States that didn't practice generally accepted accounting principles. And consequently, their stock took an additional hit, not just for being knuckleheads about losing all of our data, but also for probably not following correct accounting practices either. Second, this is hard. You can say Peter said it. Your data is a mess. Excellent timing there, John. That was good, right? You can make that crack there. All right. Your data is a mess. I'm sorry. Your data is a mess. And I don't mean your data. You as a knowledge worker probably have your data pretty good. And remember, your work group probably has a pretty good set of data that they use, too. But organizationally, your data is a mess. I've worked with well over 300 organizations in the 33 years I've been doing this and there are only a handful of them that do not fall into that category. Your folks don't know how to use it or improve it effectively because the college and university community, which is where I'm from, has not done anything. Not just not a good job. There is no training around this at all. That is an incredibly bad state of affairs. And one of my hopes is, those of you that are listening, that you would take some time and get involved with your local college and university and just say, hey, not we need more data in there, but we need to have some courses that help people understand how data is actually used. It's likely that your data requires a new business data program. And I say that on the business side. It's absolutely critical that you get this out of IT because IT is a project-oriented discipline, as it should be. If you try to run data as a project, you can only achieve limited results, hence the need for a data program to exist. And your data strategy and your data management are major components of that program. 
But in concert, they must focus on three things in particular. First of all, improving your organizational data. What data do you share across the organization? Now, we're not advocating boil the ocean here, let's make everything better, but let's find out what things, if we make them better, will have the most impact on the organizational mission or strategy. Second, you have to recognize that your people do not know how to use data as they come out of colleges and universities. What they have done, though, is they've learned on their own how to use your data. So, again, I'll tell you a story here real quick where I was working in one organization and there's a guy who's got a SQL Server implementation under his desk. And he says, you're not going to tell IT that I've got this SQL Server implementation under my desk because it's an unauthorized piece. I don't have to tell them. I guarantee they know about it already, but they're letting you use it because they know you know how to do this stuff really, really well. And he said, yeah, the only problem is I'm going to retire in two years after 35 years on the job. Who is going to take this on as a next task for them? They didn't have an answer to that question. Only when you have better data and your people know how to use it better can you then improve how people use data to better support strategy. This can only be accomplished using an iterative, incremental approach, focusing on one aspect at a time and applying what we call formal transformational methods. So this iterative process is something that the entire thing lends itself to very, very well. And we're back at the top. So let me just summarize quickly. Again, important data properties. We have a very poor educational aspect on this and there's confusion as to whether data belongs in IT or the business. Data management, I've given you some definitions. It's relatively new. The state of the practice is that most data in most organizations 
is redundant, obsolete or trivial and not well managed. We have some functional definitions that we've put in place. These are coming to be better. Data strategy needs a structural approach. It needs to be simple enough that everybody can follow it and it needs to build on foundational prerequisites. I like the theory of constraints. There may be other ways of doing this. But now it's time for questions and answers so that everybody can start to work on this. We've got some upcoming events that we're going to have in here. Again, the August webinar is Data Stewardship. I mentioned the Database Now conference is going to be online. A couple more webinars coming up and it's now time for Q&A. Hi, Shannon. You still there? Let's dive right in. Just to answer the most commonly asked questions, I will be sending a follow-up email by end of day Thursday for this webinar with links to the slides and links to the recording of the session as well as anything else requested throughout. So lots of good questions coming in already, Peter. If you have questions, feel free to submit them in the Q&A section in the bottom right-hand corner of the screen. So what's meant by clean data? From an analytics perspective, is it possible that data that is not clean can be meaningful? Absolutely. And let's be real careful on that. Most of the academic definitions of clean data talk about all data being perfect. The definition that we use here is data that is fit for purpose. If the data is good enough to tell us what's going to happen, or show movement in a direction, or at least eliminate some bad choices, that can still be useful. So be very, very wary of organizations, people, projects and things that say we're going to clean up all our data. What can be useful, for example? 
We work with a number of health care systems and one of the things, funny enough, health care systems have trouble keeping track of is where the doctors are and which doctors have what privileges in what parts of the organization. You wouldn't think an organization that has hundreds of doctors would have a major problem keeping track of this, but believe it or not, all of them do. So the question is, does the doctor have privileges at the organization? Well, if that's all you're trying to find out, the doctor is either qualified or not. It doesn't mean we have the doctor's name correct. It doesn't mean we have the doctor's address, the age, whether they are qualified to perform the surgery that they're performing, you know, whatever it is, there's not a lot in there. So data quality is making data fit for purpose. Great question. Thank you. Is a dedicated data management team a must? Yes. Short and sweet, love it. Think about it for a minute. I mean, I know you were waiting for that. So we're asking people to do a lot of this stuff. We put the second version of this up, which is the DMBOK. If we've got all these things going on or maybe even just half of these things going on, where's the coordination? How do we make sure that they're all rowing in the same direction? Data leadership is critical to this. And if we don't have that data leadership, it is going to be very difficult for everybody trying to get on the same page. Now, we call that a bottom-up exercise. And there have been a lot of organizations where they're saying, we can't get management to pay attention to this. You know, we really need them to invest in this at a programmatic level. Well, that's sort of what I've dedicated my, you know, whole career to, to try to get people to understand the importance of data management programs in order to do this. But all of these things are different and they have to be coordinated. If we don't have leadership, any road will take us there. 
If we don't have a strategy, right, any plan will achieve the results that we're trying to get to. So, Peter, along the way, it seems to be that you were saying top-down focus on data is better than bottom-up. Is that true? I think it is, but we're often led into bottom-up, due to fear that other users won't relate to top-down or won't recognize their world. Thoughts on that? A lot of organizations have trouble with this. So, first of all, there's no question that the organizational strategy is what must drive everything that happens in IT, in data, and in the business. So, that's a top-down piece. However, I've worked with more organizations that have been working on this bottom-up approach than I have with the top-down. Top-down is one of the things... Larry English was a friend for many years, and Larry had a rule when he was consulting. If he couldn't walk into the CEO's office and talk to the CEO about data quality, he wasn't interested in working for your organization. That's a very high bar. I've never been able to achieve a similar kind of result, and Larry actually had challenges around that, but it was a good bar for him. He said, if the CEO's not interested in talking to me, then I'm not interested in helping out Company X, whatever it is. I used to keep a list of companies that would not participate with him. So, asking for another program in today's business environment is really tough. The only possible justification for implementing a program is that the value the program yields is higher than the investment. You've got to have a return on investment. Now, what I liken this to is an HR program. I tell people you will no longer need a data program when you no longer need an HR program, and most organizations decide that they're going to keep an HR program as long as their doors are going to be open. Top-down is ideal if you can get that kind of support. It is rare.
So, most of the time what we see is a sort of bottom-up or a group effort. Now, the last book that Shannon mentioned to you guys was not called the CDO and Data Strategy, and the reason for that is because I don't really like the term chief data officer. Chief data officer implies that that individual has to sit at the top of the organization. I prefer the term enterprise data executive, and what I tell people is to take and make themselves a sign and say I am the enterprise data executive for my organization and put it on your desk, your cubicle, whatever it is that you're working from. And eventually somebody will come along and say, Peter, who told you you could be in charge of all the data for this organization? In which case I say, well, if not me, then whom? And hand the sign to them and say, now you go find somebody else who wants to do this job and they can have it. And most of the time people will start to pay attention to it. So enterprise data executive implies that the initiative can be done at the workgroup level, at higher than the workgroup level, at the division level, but it doesn't have to be actually a chief data officer type function. I'm happy that we have 4,000 chief data officers worldwide now. You can see out of the population of 8 billion people that's a relatively small number of individuals that we have, but it's many more than we had 10 years ago and that's progress. I'm sure there's other questions around how to get started on a bottom-up versus a top-down, but I've seen equal amounts of success in both initiatives. I've seen a lot of groups that come forward. They build a business case. They drive this up and they go to management and they say, look, if we do this, things get better. And if we do this and things get better, then you should give us more money so we can do more of the things that we do so things will get even better still. That's the definition of investment in your business. And I've seen a lot of that be successful. 
I've seen other, more limited success where people try to do it top-down. The real hard part with top-down is you've got to have executive commitment. And if executives aren't willing to commit to that, it becomes a very, very big challenge for the organization. Again, great question. Thank you for that. So going back to talking about data management teams, we get this question a lot, Peter. What is the right size for a central data management team? Say there's 200 IT employees in a healthcare company. Oh, I love questions like this. So I did one of these with Walmart. And again, it was a fun one. Walmart has a workforce that's well over a million people worldwide. That's a pretty darn big organization. And I pointed out to them that they have about 20,000 HR managers to maintain that million-person workforce. 20,000 managers to manage a million people. That's a pretty good ratio from efficiency and effectiveness perspectives and all that sort of thing. And then I turned around and pointed to them and said, you've got John watching all your data. Just doesn't seem like the right number, does it? And they went, whoa, that's really interesting. There's another point you can come to, too. Remember, HR people are there to manage the workforce of your organization. Even in IT, if you're stuck in IT and you need something to do, you have a network group somewhere in IT that keeps track of wires and wireless points and all sorts of things that people need to get access to. In the networking group, somebody keeps track of the metadata about where those things are. And one way to see what the size of your group should be is to go and take a look at the size of your networking group within IT and say, how many people does it take to keep track of the network connection points? There is somebody whose job that is, right? Because no organization is going to say, oh, yeah, we've got this wireless data point that goes over here. And I don't know who signs onto it or what things they can get to, right?
Security people would be going, ah, it doesn't work. So find out how many people are in there, and that will give you at least a measure to say, you know, shouldn't we have at least that many people, percentage-wise, keeping track of where our data is? Because after all, it is our sole non-depletable, non-degrading, durable strategic asset. Yeah, great question. How to change the focus of IT that manages data on a transactional basis when research needs it in a format that supports analytics? An example of a problem is when the location on a record is repurposed. Regularly, the data element shifts on the layout. Fantastic question. Again, this is the real challenge that we have around these big data, data science, analytics type questions here. So people say, oh, okay, well, great. If I just had all this data, I could go off and do something really cool with it, right? Well, the problem is data that's organized for transactional purposes is optimized for speed. Speed is an appropriate way to optimize transactional data. But when I need to go look at it from a different perspective, it becomes problematic. So let me tell you a quick little story about a publishing company that we worked for. We're going to pretend this publishing company publishes Harry Potter. Now, this publishing company had 46 different enterprise solutions. You might ask the question, I thought enterprise solution meant you had one. Well, this company had grown by acquisition, and all the publishing companies that they bought had their own internal systems. So they had 46 systems of record in this organization. But if they wanted to know how many Harry Potter books were sold in each country worldwide, they had to go into 46 different systems and ask the same basic query of each of those systems, take those results and put them in another system, typically a spreadsheet, and add them up, in which case all that effort produced the answer to exactly one problem.
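As a rough illustration of the reorganization being described here, the sketch below maps extracts from several systems of record onto one common layout once, so that any title, year, or country question becomes a single query. It is only a minimal sketch: the three toy systems, their field names, the sales figures, and the helper names `consolidate` and `units_by_country` are all invented for illustration and are not taken from the publisher in the story.

```python
from collections import defaultdict

# Hypothetical extracts from three systems of record (the story has 46).
# Each system names the same facts differently, which is the core problem.
SOURCES = {
    "A": [{"title": "Harry Potter", "country": "US", "yr": 2016, "units": 120},
          {"title": "Harry Potter", "country": "US", "yr": 2017, "units": 150}],
    "B": [{"book": "Harry Potter", "ctry": "UK", "year": 2017, "sold": 90}],
    "C": [{"name": "Harry Potter", "nation": "US", "sale_year": 2017, "qty": 30}],
}

# One mapping per system: common field name -> that system's field name.
FIELD_MAPS = {
    "A": {"title": "title", "country": "country", "year": "yr", "units": "units"},
    "B": {"title": "book", "country": "ctry", "year": "year", "units": "sold"},
    "C": {"title": "name", "country": "nation", "year": "sale_year", "units": "qty"},
}


def consolidate(sources, field_maps):
    """Rewrite every record into the common layout, done once for all systems."""
    rows = []
    for system_id, records in sources.items():
        mapping = field_maps[system_id]
        for record in records:
            rows.append({common: record[local] for common, local in mapping.items()})
    return rows


def units_by_country(rows, title, year):
    """Total units sold per country for one title and one year."""
    totals = defaultdict(int)
    for row in rows:
        if row["title"] == title and row["year"] == year:
            totals[row["country"]] += row["units"]
    return dict(totals)


ROWS = consolidate(SOURCES, FIELD_MAPS)
```

With the consolidated rows in hand, asking about a different year is a change of one argument, for example `units_by_country(ROWS, "Harry Potter", 2016)`, rather than another pass through every system.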
And we had an instance in this publishing organization where they did a bunch of work, they calculated the amount of time it took them to answer this question, and then they found out they'd asked for the wrong year. So, oh, I didn't want you to do that for 2016. I wanted you to do it for 2017. Well, guess what? Nothing that they've done up to that point is of use. They have to go back and start all over again. The data has to be reorganized in order to make it useful for an analytic purpose. And again, the knowledge of how it needs to be reorganized is missing from everyone except the people who've been doing this by themselves. But it's on-the-job learning. It's not part of the academic curriculum. Sorry, I hate pounding on the table there, but it is. So, I'm so incensed about this. It's just unbelievable. I hope that answered your question. You're not passionate about data, are you, Peter? Not at all, no. So, how does business strategy affect data strategy, if at all? Fantastic question. So, the only purpose of a data strategy is to support the organizational strategy. And when we say business strategy, remember this works equally well for the nonprofit sector if you substitute the word mission. So, I see all kinds of data strategies. People always say, you wrote a book on data strategy, let me show you mine. Well, if it's more than 10 pages, I don't even bother to read it. It's like, is your organizational strategy there? Again, the organizational strategy at the Walmart level is everyday low price. That's pretty simple. And the data strategy, well, as you can imagine, is making sure the data is available at a low price as well in order to do this. So, the data takes less time, less knowledge worker capabilities to do this. The only purpose of a data strategy is to support the organizational strategy. If it's doing something else, you could legitimately ask the question, why? What are we trying to do? So, Peter, there's an additional add-on to that.
You know, is organizational strategy then the same as business strategy? Because this questioner does see it as a little bit different. Oh, no, I'm sorry. I equate the two terms. So, the reason I use organizational is because if I say business, then the folks that are in the nonprofit sectors look at it and say, well, that's a business strategy. It doesn't really count as an organizational strategy. I equate them: the strategy of an organization or a business is what is their purpose, right? What are they there for? Apple will tell you over and over again the purpose of their company is to sell great products to people that want to buy them. Right? Well, that's a very good strategy for Apple. It works out really well. The strategy for Samsung is a very different strategy. If you go to Samsung's website, you'll see they have a completely different way of approaching the business, if you will. The data strategy has to support it. I'm using the words organizational strategy and business strategy interchangeably there. Thanks for pointing that out. So, what have you found to be the most effective way to engage senior management in supporting data governance and data strategy? Great question. I have a very simple question that I like to ask. I don't have it in this particular slide deck, but I'll go back to my, you know, data is your sole non-depletable, non-degrading, durable, strategic asset, right? And there we go. I'm just going to put that up there. I'm not seeing it yet. Yeah. Probably not. I see the header for Keynote. There it is. So, the key here is... Can you ask the question again? Sure. And I'm going to add on to that. So, what have you found the most effective way to engage senior management in supporting data governance and strategy? And there's a follow-up question or an additional question that I should say that probably you can address at the same time. How do we get the business involved? Good.
So, the real question is, if you take data as an asset, then your assets should be managed. And would you really want your sole non-depletable, non-degrading, durable, strategic asset to be managed without any governance? And the answer is usually, oh, my goodness, no. There's no way I want that to happen. So, we're going to keep that particular piece moving on. So, again, the opposite question is really the way to turn it on. Would you want your tanks managed incorrectly? Would you want your vehicle fleet not managed? Do you have rules around what people can do when they borrow or check out equipment, when they utilize equipment? And the answer to that is, of course, there's governance around all those things. Why wouldn't you want your only asset that is non-depletable, non-degrading, and durable in nature to be managed as well? Second part? Yeah, how do you get the business involved? Ah, great. Okay. So, it goes back to... and I'm sorry, we've got a little construction going on here. I was trying to make sure we didn't get some banging going on. See, the real key is that, first of all, getting a couple of quotes from your IT folks is a really good thing. Don't, you know, put names associated with this, but people are amazed when we tell them your IT department has a lot to do. There are many, many things that they're taking care of. Do you think anybody in IT sits down and takes a look at anything inside your database and says, wow, I think I'll do some data quality work on, you know, the personnel database today? No, it does not happen. IT is slammed. They have tons of things that they're trying to do. And they think 100% that it's a business problem. But, of course, the business people look around and say, there's somebody with the title Chief Information Officer. What else would that individual be doing? And the answer is a lot of things, but not around data. So, it's very, very difficult to make people see that there's a gap in there.
But one of the things that you can do is a little experiment and exercise. This is an exercise I got from Tom Redman. Tom has done some webinars for DataVersity in the past, really good friend. And he says, just on a Friday afternoon, you know, he says, mix a golden beverage, no problem with that. You know, sit down and pull the customer records for your top 10 customers and just kind of walk through them from the business side and say, does this information look correct? Now, we've never found this to not achieve really good results. He and I have done it a number of different times. You spend two hours on a Friday afternoon looking at your top 10 customers. You will find all kinds of data quality errors. And all of a sudden the business starts to go, oh, oh, the address is wrong. So when I mail them the check, it doesn't arrive. And they get mad at me, but I mailed them the check so they shouldn't be mad at me. You know, little details like that. In fact, let me give you another example that we did recently. We're working with a company out in the Midwest, a logistics company, a really, really fine organization. And we're there doing something else. But in the process of doing that, I walked by sort of a big bullpen room and there were, you know, 100 people and they're running around like crazy, moving things, papers flying around. And just being a, you know, nosy kind of person, I said, what are they doing in there? And oh, I got ahold of the manager and he comes over and says, oh, I just want to show you here. He said, we've got all our data in a mainframe and it's really terrible. So when the bills are put together, the customer's name is wrong, the service provided is wrong, the amount we're charging them for the service, the date the service was provided, the type of service, it's all wrong. So these guys are running around here like crazy, making sure the bills go out and don't embarrass us.
I said, oh, what's the average amount of time? He said, well, it delays the bills by about a month, but you know, we're okay. I mean, at least we're not being embarrassed. And I said, you know, we can fix that, right? Good data management practices will help to improve that and you won't need the 100 people in the room. He says, what do you mean? I just got a raise. I just had the best quarter of the best year I've ever had. I'm thinking I'm going to have 200 people in this room because that's my measure of success. Now, hopefully most of you are cringing at this point and going, oh my, somebody missed the point here. Well, I walked down the hall to the CFO's office and said, would you like to improve cash flow on $9 billion annually by 30 days? And the answer right back from the CFO was, if you can do it for less than $800 million in a month, you've got a positive return on investment. By all means, let's have a conversation around that. And of course we did. So you've got to put it in terms that the business understands. They're not going to understand normalization, optimization. They're not even probably going to understand what a data warehouse is except it's a thingy and it's got some data in it. But beyond that, they are not going to understand the function. So you've got to translate this into terms that the business can understand. If they understand those terms and they understand that data is causing the things that are going wrong in the organization, then you have a much better approach. And Shannon, I think we're going to address this when we get to the monetizing talk. I've got some techniques that we're going to do on that. Indeed. I have no idea where that is. That must be the November talk we're going to do. But anyway. So let's talk about the jobs within data management. So Enterprise Data Executive, do you really see this working in an organization? Yes.
And the organizations that are doing this... Todd Harbor and I wrote the Enterprise Data Executive book. And one of the things we observed in the year or so that we took to do it, plus a couple of years of research on it, is that the tenure of most chief data officers is about a year. And that's not good right at the moment. Now, CIOs started out exactly the same way in the 90s. Nobody knew what a CIO was. And they would try it and it wouldn't work. And they'd say, okay, well, we tried it and it sucked. It didn't work. So therefore, right? As opposed to looking at this as a growing, immature profession. And again, I don't mean to criticize this when I say immature. Literally there are 8,000 years of accounting tradition. We don't have that in this area. So we put into the book kind of a little mantra that says, the first CDO is probably not going to be successful. Because the things that you need to change in your organization are so wrenching that the individual who's in charge is likely to use up a lot of intellectual and political capital. And when they use up that capital, they then aren't credible anymore. So in many organizations, we're saying the first CDO should have a different role from the second CDO, which should be different from the third CDO. Each one will become more mature, moving through the phases of CDO-ishness in a much more rapid fashion on that. I think that answered the question. If not, again, push back and let me know. I think so. And CDO-ishness, I like that word. I'm going to have to use that. We'll do a conference on that. So continuing on roles and responsibilities, what are the roles, what are the differences between an architect and a DBA? Great question. And actually, we got one of these. It came from an IT manager in a small city in California. He says, gosh, you know, I understand the wheel and I understand that we should do these things, but nobody can tell me what these people do.
So just picking those two, the DBA, if you will, is the next slot over to the right. So the data architect is the one at the top and the data modeling and design is where the DBA goes. And I'm going to actually shift to another slide here to answer that a little bit more directly. On this chart here, the DBA would be responsible for one or more of the databases that are here. In other words, their focus is looking at the details here and down into the programming level. So the green database may have a high-performance requirement that requires it to respond much more quickly than the gray database. And the orange database may be only used once a month or something like that. In order to do all of these things well, you need to have somebody take an overall architectural perspective. And this is the data architect's perspective, is to make sure all of those pieces can be taken care of. So we have the data flowing through everything, as I showed you previously on that slide. So let's put that up. Architect is going to have a much broader perspective. The DBA is going to be down tuning the database, trying to make sure it does what it's supposed to do to answer the specific business needs. The architect is going to be working on things that are at a much higher level of conceptualization. And frankly, this works exactly the same way in the housing market. You have folks that are an architect that tell you what type of house that you want. Do you want a nice bungalow? Do you want a really modern house? Whatever it is that you want. The architect is not going to be concerned with implementation details. That's where the DBA gets to. But the architect is going to say, you know, if you've got three kids, you probably want to have at least four bedrooms so that you can have a study as well, or whatever it is that you're trying to build. 
Or if you've got a lot that's 150 feet wide, you probably only want to put the house in about 75 feet of it to give you a good sense of proportion relative to the rest of the world. A lot of aesthetics on these types of things, although aesthetics are not terribly helpful when you're trying to convince management to invest in this stuff. All right. We just have, you know, a couple of minutes left. Anything else you want to add in terms of roles? I just see roles growing as companies continue to focus on their data more and more, and as technologies evolve to incorporate more and more data and real-time analytics and so on and so forth. I went to a conference recently where that was a topic, data scientist versus data engineer. So where do you see it? Yeah. Well, the real problem, again, with data science is that they are doing a great job learning how to apply algorithms to data. But the problem is that algorithms don't work if data isn't in the right format. So we had two parts of it. One was the quality piece we talked about as well. Again, another quality exercise that Tom Redman taught me was that if you look at the dates of birth in a data set and they all happen to be born on the first day of the month, you know, that's probably not correct, right? Or if you should have a reasonably balanced data set and they all turn out to be male and there's no females in your data set, you know, there's a data quality problem. But the other part of it was the organization, and one of the questioners asked this as well. Data that is optimized for transactions is not necessarily going to be supportive of the analytical processes that people need to apply to it. And how does the data architect fit into the data strategy? So if you are trying to do a data strategy, the architect should be the person most knowledgeable about the ability to translate the organizational strategy into its various components.
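Going back for a moment to the two sanity checks credited to Tom Redman a few sentences ago (dates of birth piling up on the first of the month, and a data set that should be balanced turning out to be all one value), here is a minimal sketch of how they might be automated. The record layout (`dob`, `sex`), the function names, and both thresholds are assumptions chosen purely for illustration; nothing in the talk specifies them.

```python
from collections import Counter
from datetime import date


def share_on_first_of_month(birth_dates):
    """Fraction of dates that fall on day 1; roughly 12/365 would be expected."""
    if not birth_dates:
        return 0.0
    return sum(1 for d in birth_dates if d.day == 1) / len(birth_dates)


def dominant_share(values):
    """Fraction of records taken up by the single most common value."""
    if not values:
        return 0.0
    most_common_count = Counter(values).most_common(1)[0][1]
    return most_common_count / len(values)


def quality_flags(records, first_day_threshold=0.5, dominance_threshold=0.95):
    """Return human-readable warnings; both thresholds are illustrative guesses."""
    flags = []
    if share_on_first_of_month([r["dob"] for r in records]) > first_day_threshold:
        flags.append("dates of birth cluster on the first of the month")
    if dominant_share([r["sex"] for r in records]) > dominance_threshold:
        flags.append("one value dominates a field expected to be balanced")
    return flags


# Hypothetical records: a suspicious set and a plausible one.
SUSPICIOUS = [{"dob": date(1970, m, 1), "sex": "M"} for m in range(1, 5)]
PLAUSIBLE = [
    {"dob": date(1970, 3, 14), "sex": "F"},
    {"dob": date(1982, 7, 1), "sex": "M"},
    {"dob": date(1990, 11, 23), "sex": "F"},
    {"dob": date(1965, 1, 9), "sex": "M"},
]
```

Calling `quality_flags(SUSPICIOUS)` returns both warnings, while `quality_flags(PLAUSIBLE)` returns an empty list. A flag here is a prompt for someone from the business side to look at the records, in the spirit of fit for purpose, not proof that the data is wrong.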
There's a people component, there's a process component, there's a technology component minimally. There are many other components that can be included in there. The architect is going to know the characteristics of the data and say, I can deliver this data to you very rapidly if you have these requirements. But if you have this requirement, that will take us longer to deliver. And again, we get into the flexible, adaptable, less risky types of approaches. All right, Peter. Well, those are the questions we have and that does bring us to the bottom of the hour there. So thank you. Just a reminder to everybody, I will send a follow-up email by end of day Thursday for this webinar to all registrants with links to the slides and links to the recordings. There's a question about additional places to find formal transformational processes. We'll get that out there as well. There are also links to the history and the archived recordings as well, as Peter mentioned earlier. And as Peter is showing right now, you've got the upcoming events. We hope to see you in August for getting started with Data Stewardship, another hot topic. Peter, thank you so much for this great presentation and thanks to our attendees for being so engaged in everything we do and all the great questions. We hope to see you all next month. Peter, thank you. Thank you, all of you, and thank you, Shannon. As always, a pleasure to work with you all these years. Likewise. Thanks all. Take care.