Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager for DataVersity. We'd like to thank you for joining today's DataVersity webinar, Expressing Data Improvements as Business Outcomes. It is the latest installment in a monthly series called DataEd Online with Dr. Peter Aiken. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. We will be collecting questions via the Q&A in the bottom right-hand corner of your screen, or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DataEd. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom middle for that feature. To continue the conversation and networking after the webinar, just go to community.dataversity.net. And to answer the most commonly asked questions, as always, we will send a follow-up email to all registrants within two business days containing links to the slides. And yes, we are recording, and we will likewise send a link to the recording of this session, as well as any additional information requested throughout the webinar. Now let me introduce our speaker for today, Dr. Peter Aiken. Peter is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide. He has more than 30 years of experience and has received many awards for his outstanding contributions to the profession. He has written dozens of articles and 11 books. The most recent is Your Data Strategy. Peter has experience with more than 500 data management practices in 20 countries and is consistently named as a top data management expert. Some of the most important and largest organizations in the world have sought out his expertise. Peter has spent multi-year immersions with groups as diverse as the U.S. 
Department of Defense, Deutsche Bank, Nokia, Wells Fargo, and the Commonwealth of Virginia and Walmart. And with that, let me turn everything over to Peter to get today's webinar started. Peter, hello and welcome. And thank you, Shannon. It's so good to be with everybody today. We've got a fantastically hot day here in Central Virginia, but I suppose that's not a whole lot different from anywhere else it is in the country at this point. Again, good to have everybody with you. I was talking to Shannon right before we got started here, and I said to her that I thought this title was kind of academic in nature. I probably had my academic hat on when I thought it up, and I was asking her, did she think that this would be a better title? How to get your data initiatives funded repeatedly? And she said no. And so I said, okay, that's what we will keep it as. I say that because Shannon has her voice into everything that's happening in the community out here. And so as you guys want these different events and want different topics and things, I all mean set it in to Shannon. She will absolutely get you straight and try to get you some sort of a program up and running on these things. So don't worry, Shannon, I'm not signing up for a bunch of custom work here. Anyway, our topic today is expressing data improvements as business outcomes. And that's really a challenge for. So I'm going to start off with talking about business cases. And business cases generally involve the concept of leveraging. But we've got to have leveraging exist for a purpose. So leveraging in order to and you as the data experts on this call are going to have to be the ones to fill in that particular blank. This refocuses the request around business outcomes. And each business case that you put together generally has to have at least a dual purpose. 
One, you've got to make sure that the case fixes the problem, whatever it is that you do, you don't go to your boss and say, hey, if I do this, it'll fix that problem. And then hopefully it fixes the problem. But secondly, each case also has to illustrate why a programmatic approach to these kinds of problems is preferable to the individual one off fixes that we've been trying to deal with for the last 30 years. Now, the question comes up. What is it that we're trying to get better at? And I have some data that I'll share with you, which shows that data challenges are the root cause of most business and it failures. I'll give you a two part example, having to do with healthcare.gov. Some of you may remember that from a couple of years back. It actually has some very interesting data pieces to it as well. And we'll talk about the limits of technology based approaches period. Then we'll move into a little section here where we talk about why we need to make sure that these data outcomes, these data improvements are not labeled as IT projects. And the first reason is because they're not admirable results so far. The IT community has not done a great job of delivering with value on time, et cetera, et cetera. We'll talk about sequencing too because it is important to make sure that some things are done before other things. And that's part of the reason that most of our organizations are in the shape that they're in is because we haven't done those things for a while. Just imagine not ever maintaining your automobile, for example, it would not work out there. And we'll look at a data leadership agenda in particular as an example of how that can be changed around on this. And the real question of how we get better is a combination of leadership programs, math, engineering, architecture, storytelling, and practice. And we'll finish up with some takeaways about an hour from now. 
And more importantly, look for the half hour Q&A that we're going to talk through as some of you guys go in and try to poke holes in this or give me better suggestions, which has always been the case. Again, stop here, but the way this community has moved forward is because you all make suggestions, we listen to your suggestions, and then we try to respond to whatever it is. So you'll see some of the diagrams in this presentation, our diagrams that other people have contributed, and I've tried to make notes on it there. So let's jump in and get started. And the first thing that we essentially want to do is to help everybody to understand that better organized data is more valuable than unorganized data. And if somebody has problem with that statement, take a book, take the spine off the back of the book, mix the pages up and hand them to somebody and see what a good experience they have reading that book. Again, better organized data is clearly more valuable than data that is not better organized. And poor data management practices, therefore, are causing organizations time, money, and effort. And this is even more incredible when you understand that 80% of all of the data in your organization is rot. Now, of course, then you say the question, what is rot? Well, rot is TLA or a three letter acronym. And the key here is to look at it as data that is redundant, obsolete, or trivial. My wife likes to correct me on this and says it's really redundant, incomplete, obsolete, or trivial. She's correct. But I think it's more fun calling your data rot than it is a riot around all this. So the question is, which data do you eliminate? And of course, you can't eliminate it unless you know what are the characteristics of that data, who are the users, how is it used, and whether there are overlaps between this data and some other data that may be closely related to it. Let's talk a little bit about leveraging. 
And the key for leveraging is to understand that this is an engineering technique. Unfortunately, we simply do not teach engineering techniques in IT school or business school or wherever else it is that you happen to learn your data skills. So with leverage, again, the idea is that a human using the proper techniques can lift a bulk that weighs much more than the human. If we look at that now in the context of data leveraging, what we're looking at on the left-hand side of the screen is organizational data. 80% of it is rot, but nevertheless the good stuff is still in there, so we have to pay attention to it. Then, of course, we want to apply technology to this, and technology and data are closely related concepts. In this diagram I'm simply showing you that the lever is the technology. Now, when I say the lever is the technology, most of you know there's another component, which we'll get to in just a minute, but you could move a larger object by putting this lever up against it and prying upwards. It's of course more efficient, and you get more leverage, if you employ the technology correctly, which is to use a lever and a fulcrum so that you can actually do the lifting. You'll find that's much more productive than doing it the other way. In the data context, if we take the metaphor slightly forward, the people part of this equation are the knowledge workers, and they are often supplemented by data professionals in your organization. One of the cool things about working with over 500 different organizations is that we can always walk into an organization and find business people who really like the data. They're just looking for other people who want to talk to them about the data stuff. So our knowledge workers are a key component to this. Of course our process, again, is very simple.
We use the lever and the fulcrum to get the most lift on our organizational data, and that people, process, and technology triad is the three-legged stool that we'd like you to build all of your data projects on. We'd like that, of course, to be guided by strategy. That's a different topic, but it's still a cool graphic. And notice here I've shrunk the organizational data bubble by just a little bit, because reducing your rot also increases your data leverage in your organization. Data leverage is a multi-use concept. It allows organizations to better manage their data both within the organization and with their organizational data exchange partners, and all of those exchanges can be done in support of the mission. So the leverage here is obtained by the implementation of data-centric skills, technologies, and human processes, and it focuses on the non-rot data, assuming that you've been able to identify some of that first. The bigger the organization, the greater the potential leverage. Now, what you see here is that when you treat data in a more asset-like way, two things happen at the same time. First of all, it lowers your organizational IT costs. I have yet to find an organization that would not like to take 20 to 40% out of its organizational IT budget, and doing this through data is actually a really, really powerful technique. But probably even more important, and really the subject of the next book that my colleague Todd and I are working on right now and that I'm going to push out through Shannon, is that it increases your organizational knowledge worker productivity. Think back to our metaphor with the book and the pages. If your knowledge workers are getting paid to sit around and put the pages of a book together, and they're each putting their own copy together, it's clearly more effective for the organization to do it one time, make copies of it, and hand it to everybody than it is for everybody to sit around and become a bookbinder in the process.
In fact, it's so important that I devote an entire slide to this. Our friend and collaborator Randy Bean, who runs the CIO network out of his company (I forget what it's called, but anyway), has done this survey a couple of years in a row. His question was: what are the most important challenges that you're dealing with? It turns out that technology in 2019 was only 5%, and that the people and process aspects of data are the places where organizations are having the biggest challenge. The reason is that we've let IT continually focus in on just the technical pieces of this. So again, look at the potential leverage that's here. Now, of course, one of the things we ask when we get a result like this and say, oh, great stuff, is: is it sustainable? And yes, in 2020 he asked the same question and found that it was 10% rather than 5%. I'm still happy at the 80/20 split, but clearly it is a lot. Let's take a look at a couple of different ways we might do this. On the left-hand side here I've got Dilbert, of course, for the engineer, and he says something like "clean some data." Well, the PHB, the pointy-haired boss, on the other side does not understand what "clean some data" is. But if Dilbert had instead said "decrease the number of undeliverable targeted marketing ads," that may be something that was on the PHB's to-do list, and so he may be more receptive to it than to just "clean some data." "Reorganize the database"? How about "increase the ability of the sales force to perform their own analysis"? Now we're getting to something that actually makes a difference. "Develop a taxonomy," something many of us would love to do? How about "create a common vocabulary for the entire organization"? I guarantee you this happens in board rooms as well as in operational areas.
I've not found a board room in my life where somebody did not say, oh, yeah, we were talking about tanks, and they were the things that blow things up, and somebody else said, no, no, a tank was what you put oil into. Again, different concepts. "Optimize a query." Again, sounds like a great idea. How about this: "shave one second off a task that literally ran a billion times a day." So if I could save a part of a second a billion times, yeah, it adds up, and the numbers were astounding in this case. Here's my favorite: "reverse engineer the legacy system." How about "understand what is good about the old system so we can formally preserve it, and what was bad so we can improve on it"? That makes a little bit more sense. So these are the types of things that we need to do in order to refocus on outcomes. A little further: I mentioned the dual purpose early on. All right, so let's do some simple math. If X is invested in a Y project, then outcome Z will result, and hopefully the outcome Z is greater than the investment X. Pretty straightforward. Of course, at the beginning of the project, when the parties know the least about each other, all are expected to agree on the meaning of price, timing, and functionality. We know that's very difficult. More to the point, though, let's define X as some resources, so time and money and effort, and define Y as cleaning one set of data, and then Z is that the data will be clean. You can see this is a hard sell for management, who spend their time paying attention to data and everything else that goes on in the organization. "So what?" is a fairly reasonable response from them. So let's recast this just a little bit. If $100 is invested in cleaning one set of data, and I can then show you that that $100 investment produced a $1,000 return, the "so what?" disappears. It's a very obvious thing for people to look at. And yet time after time people still don't get that. And part of the reason is this.
Consider the data programmes that we do. I spelled "programmes" with the British spelling here to distinguish them from programs as in software. Our IT people have gotten good at building things, but data doesn't work like that. Data evolves. One of the wonderful things about living as long as I have, and I'm only 61, so it's not that old, but working at this for 30 years, is that I've gone back and visited companies, including my colleagues at the Defense Department. And guess what? After 30 years, most of them are still managing exactly the same data. Their processes may have changed. But data evolution needs to be separated from, made external to, and precede system development activities. Or, put a little differently, data management and software development must be separated and sequenced, because the two things just don't work well together. Now I'm not sure of any etymology on this particular piece, but it does at least speak to the dichotomy in here: good people trying to do good things and not achieving success. Part of the reason, as I mentioned before, is that data is simply not a project. Data as an asset has a useful life of more than one year. So while you may be looking at project deliverables of two-week sprints and 90-day increments, data evolution has to be measured in years. That's one of the first questions that I raise with management: look, do you realize that the timing, the measurement intervals that we're going to be using for your data effort, are measured in years? Yes, we can do some things quarterly and achieve some good results. But we're certainly not going to fix anything by the end of the week, right? Data evolves, and it's significantly more stable. Consider ready-made data architecture components, those vocabulary items that I was speaking to you about just a minute ago.
These are a prerequisite to any sort of additional development, but especially agile development. And I only pick on agile here because agile is a really popular way of developing higher quality software faster. It's good stuff, but that's not what we're doing with data. And so the only alternative if you're doing data in an agile fashion is to create more additional data silos. There's nothing wrong with that, but it nevertheless can be a problem. I'm pretty sure most of you understand the difference between a program and a project. So I'm just going to list these things up here and briefly note them. This is more of a reference slide for you. You know, programs are ongoing projects and programs are tied to a physical calendar. Program management has governance. Gee, so the governance, right? Programs have a greater scope of financial management and it's an executive leadership. Change management is a clearly an executive leadership. And I get questions from people all the time and say, well, how can I help management understand the need to do data in the same manner? And the answer is you need to tie your data initiatives to your HR program. After all, nobody in your organization simply goes, well, I think we've hired enough people. We're not going to hire any more people and the people that we have will behave well. We're not going to need lawyers. We're not going to need anything. We could shut the HR department down, right? No, you will need a data program in your organization as long as you have an HR program. And if you don't have an HR program, then you can start talking about eliminating your data program. So very important to tie this in management's minds to this. And it's actually quite a bit of parallel hill. John Ladley, he was telling me about this the other day where apparently HR did not exist as a centralized function in organizations until about 70 years ago. So we're at 2013. So mid 1940s, 1950s or so, you'll see the rise of HR departments. 
Guess what? Before HR departments were done individually in each department, maybe even down to each workgroup. And as you might imagine, each workgroup learned different practices, which meant they achieved different results by the application of different methods. And HR was, let's just say, methier than it would have been. All right, so that's the dual purpose. You're trying to help people understand that your business case will fix the problem. No problem with that. But also that you need to do this on a regular basis. And instead of spinning up a workgroup and then having them learn this stuff and then go back to what they were doing before, it's actually more productive for most organizations to maintain a programmatic capability around data. So what is it we're trying to get better at? Let's take a look. First of all, from an implementation perspective, when you start to do data, you sort of have a blank canvas. There's lots of people that are out there doing things. And that's wonderful. And data leadership kind of gets this stuff. And if you follow the guidance that's here, most people say, hey, that means we're going to have to put in some data governance. The only problem, if you will, of data governance is that data improves over time. Unfortunately, some people don't like to find out that it's going to take years for data to be improved. So most organizations, then after they start that and kind of realize it, then they come back in and say, but I need to get some data improved now. And so I want to do something that happens faster. So we make a SWAT team, or we may make a special thing, or we may hire a consultant or we buy technology. We're on technology later on, but we'll come back to that. The key here is, of course, that it's real obvious in an initiative in a data improvement project here that the data was improved as a result of focus. The data also benefited from governance. 
I'm not taking anything away from governance, but this is still nevertheless a challenge for us. Now, as our feedback loops get better and we start to put in place the infrastructure that we really need in order to do data well, which includes storage and community participants and things, we start to get a little better at this process. And again, notice I have two things written on the two boxes on the right: data things happen and organizational things happen. You'll notice the equivalency that I have there is sort of a fading one. We're just not very good at it, and we have to get better. When we make a data thing happen, and in this case the organization gets more money, achieves the mission faster, whatever it is that we're trying to do, we need to celebrate that to show people that something happened in data, and therefore something good happened in the organization. That's, again, good. We'd like to do that. But nobody's going to listen to us celebrate forever. So now we need to go back in and do a little bit more and say, these things happen with data over time, but we're also looking at synergies that can happen. You'll notice two of the X's combined to produce a bold dollar sign, more than we would have gotten out of it if we had tried to do them individually. There's a second one right there; I put it next to the T by "things happen" there. And finally, we may be able to actually figure out that if we can get things together, then we can really start to make some good improvements as we go through that. While this chart shows a process, it still shows our biggest challenge on the right-hand side: we're pretty good at showing how data things happen. What we've got to get better at is showing that when data things happen, organizational things happen as well. Now, let's take it to the next level here. Most of you are very familiar with the DMBOK. And we finally got my new version up.
Thank you, Chris Bradley, for getting me the proper graphics on this. I kept showing the old version because I couldn't find the new version graphics. Well, let's see how this would work. If we're looking at a data program initiative here, in this case we're calling it a strategic data initiative, a data strategy if you want. Most of the things that happen in data, just the same way as we need to have people, process, and technology if we're going to make any progress, are really a combination of three of the DMBOK wedges. The story in this case is one from Micheline Casey, who did the foreword for our enterprise data executive book. I thought it was a really good story, so I love to tell it. She's visiting an executive and the executive says, Micheline, last time you were here, I told you my data was really terrible, right? And so I got a data warehouse and that didn't solve the problem. Of course, we know the answer to this, because most of the time when data warehouses go up, we don't do all of the work that we need to do. So, as I mentioned, a warehouse here would be a combination of the governance, quality, and warehousing initiatives, showing in this case that we're going to have to be good at at least three things to get this to work: first data things happen, and then business things happen. And that works perfectly well. Of course, however, there's a second cycle, because most of the time on the first attempt, organizations that are starting new in this are just not going to be ready. So a second version of this might look like this. We're going to do data warehousing, we're going to do governance, but we're also going to pay attention to metadata in this case. Now notice what's happened here. We've done two cycles already of data governance and data warehousing, whereas we've only done one cycle of metadata management.
And it's pretty clear that our expectations ought to reflect that. Finally, on a third effort at this, maybe we discover at the very end of this little hypothetical story that what we need is not so much metadata; we really need to look at it as a reference and master data problem. Again, we've done data warehousing and data governance three times already, so we are starting to get better at those. Now, as we go through this, it's also really important to understand that we're not going to be great at all five of these basic data management practice areas. These are out of the CMMI Institute. Can we manage data coherently with a strategy? Are all the work groups pointing in the same direction? Do we have a class of professionals that we can now call professional data governance personnel? Are we managing this using the right methods? Are we maintaining the data at the right level of quality, and are we using it with the right technology stack? All of these things are critically important. And of course, you need to have the supporting processes from the organization as well. But one thing that's very critical to understand is that the entire chain is only as strong as its weakest link. I'll show you how we go about doing some baselining around this. You can do this yourself. You get one point if you're just starting out and you have a pulse. You get two points, however, if your process is somewhat repeatable; we call it managed in this case. If my process for dealing with data, for example, was simply "give it to Peter," that's a good process and you would get two points for it. It's probably not a great process, because Peter, of course, if he hits the lottery or something, may decide to go elsewhere, and you may lose all that knowledge. So we eventually ask Peter to write some stuff down, define it, right? And having each of these areas defined is a very, very good way to look at this.
Any documentation at all gets you three points on this. Four points if you get to measuring things. And this is really the key up to this particular piece. We have to use those measurements that we have to determine where we're going to go. And finally, if we start to look at all of the data that we have collectively, then we can start to change the way we do the actual processes of processing data and we'll start to do better. Let me take it one step further, give you some measurement points. Each of these in this hypothetical example here that I'm giving you is rated internally by a three. That means they have defined their processes around these areas. But without a strategy in order to do this, it is majorly problematic. So everybody's trying to do something. But here is a situation where this organization could put a literally million dollars into data quality and it wouldn't help their organization because they don't have a plan for how they're going to deal with this. We get to a couple of examples. Again, I mentioned healthcare.gov. Some of you probably have forgotten about this. Those of you that are too young. This was in the middle of the Obama administration. They were rolling out a new website that was going to help people with this. And it was a really good initiative. Unfortunately, people associated the fact that the website had problems with the program itself. And so that's one of the reasons we have all the turmoil over Obamacare. Now let's look at the post mortem on this. I've already told you it was a challenge. The first challenge from a data perspective was that there were 55 contractors involved in healthcare.gov. Now I don't mean 55 people. I mean 55 companies that were all trying to get this $500 million contract to build this thing. That's just the wrong number under any circumstances. And why is it a data problem? Because these vocabularies, these vocabularies, these contractors were not speaking the same vocabulary. 
And just the process of trying to get 55 organizations to agree on a vocabulary is a major, major challenge. The second piece that came from this was six weeks from the launch, literally. The launch was October 1st, so I want you to imagine they're at about mid-August, right about where we are now in the year, and they have not finalized the requirements. I'll give you a very particular one from a data perspective. It had not been decided six weeks before the site launched whether it was important to get people's zip code information, so we could find out what state they lived in, and therefore what options were available to them, before they signed onto the system. If we did that, it required a different set of data structures than if we let everybody look at everything, but then we risked the situation of somebody from Virginia looking at an option in California and being disappointed that they weren't able to get it. So you get a couple of comments on here. Jim Johnson, who is the Standish Group chairman, says anybody who's written a line of code or built a system from the ground up can't be surprised or even mildly concerned that it didn't work out of the gate. The real news would have been if it actually did work. The very fact that most of it worked was a success in and of itself. Let's look at two more problems though. One of the things that we've stopped teaching in colleges and universities is the actual process of design. It is atrocious. For 30 years, we have not taught people how to design software and large systems. Marty Abbott was one of the people that came in as well. He said it's pretty obvious from the first look that the system hadn't been designed to work right. Any single thing that slowed down would slow everything down. I want you to imagine driving your car and putting on a turn signal, and if the turn signal light wasn't working, the engine stopping.
That's a bad design and we would obviously see that's a bad design. The other piece of this was that the software that was using the data on the actual website was programmed using traditional relational technologies. But somebody on the front end said, hey, we're going to use big data. They weren't using big data. They were using big data technologies. And the interesting thing was part of the group, the group that was doing the software kept saying, here are the queries that we need. And the people on the other side said, we don't know what a query is because we're doing big data. Again, these are just atrocious things that happened here. Let me give you a second example. That was a system where none existed before. Here's a systems take a look is we've got data challenge. In this case, this is a data catalog by a large federal government agency where they had millions of stock keeper units maintained in the catalog. These stock keeper units were the primary keys that we need to do them. But all this information was stored in a comment field, their text field. And the reason was because we previous data in here had been stored in a hierarchical database and they didn't know what to do when they replaced the hierarchical database with Oracle. So they stuck it in the comment field. That's a good idea. Now they were going to move to SAP. So they said, well, great, we'll just do a manual extraction. It's going to take us a long time. We're going to have to hire a bunch of people to do this. And it leaves the data structuring problem completely unresolved. So let's take a look. This is a proprietary, improveable text extraction process that we developed for them where we converted the non tabular data into tabular data. And I'm going to pause on this for just a second because this always gets some attention. There are all kinds of people out there selling snake oil telling you they can convert your unstructured data into structured data. 
If that's the case, I would hand them a glass of water and ask them to turn it into wine for me. The definition of unstructured data is that it has no structure. So what I think most organizations are trying to do is convert non tabular data into something that's a little bit more tabular like in order to do this. Again, words matter. We have to be on the same vocabulary. The other reason I like to do this particular story was that I actually saved the government $5 million. I thought that was a wonderful thing. I'd rather not have the government spend money on that. And it got me to my first person century of work. So I saved literally a person century. Let's take a look in most organizations and not just organizations, but also the way we've socialized our algorithms people, which is what I like to call the data scientists in here. There is no context given. They're simply told to optimize. So let's just take a look at what happens here. Each week of this exercise cost a fixed amount of money. I'm going to say it was $10,000 just for the purposes of adding and making this very simple. So this what I'm showing you is that we've spent $40,000. And at $40,000, we've actually accomplished some things at the end of the first four weeks, we had solved half of the problem. So instead of people sitting and looking at this screen and trying to figure out what to do, we were parsing the data and coming up with algorithms that could show us what were the master data items and what the data structure should look like. We also had learned in these first four weeks that 12% of the data was absolutely rubbish and we could ignore it entirely and that the unmatched items was going to go down as the number of matched items went up. Now the question is, if we solve 55% of the problem right there, should we stop? The answer was probably no, and particularly at $10,000 a week, it was not an expensive process. 
So I'm going to run the clock ahead here now to week 14, and we have now spent $140,000. Still not a lot of money, but nevertheless some money that we know about. And this group said, wow, you have solved 90% of our problem. What we need to do is we've got one more pile of data that we'd like to get, one more type of data out of this piece. And so we spent some time on it. We, 9.06, we went up to 9. The reason those numbers from week 14, 18 go up and down is because we get different information from the subject matter expert. And while that's good and bad, it's still nevertheless problematic on here. So we finally did capture that other piece of data. We were absolutely confident that we could literally not even move 22% of the data. And our solution rate had gone down to 70%. So after 55%. So was it worth that other $100,000 to get that additional percentage? Our customer in this case said yes. This is the original problem. We're going to have to do all that work manually. I don't know about you, but I'd rather do that last chunk manually than the entire chunk manual. And at $10,000 a week, it becomes very easy to do this. Now, again, my guys did not get $10,000 a week to do this. This is two software engineers working full-time. If I could get $10,000 a week for them, we'd, maybe I would hit the lottery as well and go forward. Let's take a look at the baseline. So for this piece to go forward, we just put a line in the sand here. And so there's 2 million national stock numbers or stock keepers units on this. Five minutes to cleanse each one of them. How much work do we work? And this 92.6 person years close to the bottom there at 93, I rounded it up, is where I get my $5 million. So it's 93 people working at, if I can find contractors to work at $60,000 a year, I'm doing great. You'll notice it has DLA in there, so I guess I have to tell you it's the Defense Logistics Agency that's doing this. Let's now look at the revised solution. 
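The baseline arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the item count, minutes per item, and contractor cost come from the talk, but the 1,800 working hours per person-year is an assumption (the talk does not state it; it is chosen because it reproduces the quoted 92.6 person-years), and the function name is hypothetical.

```python
# Baseline sizing for a manual data cleanup, using the figures quoted in the talk.
# ASSUMPTION: 1,800 working hours per person-year. The talk never states this;
# it is used here because it reproduces the quoted 92.6 person-years.
MINUTES_PER_PERSON_YEAR = 1_800 * 60


def cleanup_estimate(items, minutes_per_item, cost_per_person_year):
    """Return (total minutes, person-years, total cost) for a manual cleanup."""
    total_minutes = items * minutes_per_item
    person_years = total_minutes / MINUTES_PER_PERSON_YEAR
    return total_minutes, person_years, person_years * cost_per_person_year


# 2 million stock numbers, 5 minutes each, contractors at $60,000 a year.
baseline_minutes, baseline_years, baseline_cost = cleanup_estimate(
    items=2_000_000, minutes_per_item=5, cost_per_person_year=60_000
)

print(f"{baseline_minutes:,} minutes")
print(f"{baseline_years:.1f} person-years")
print(f"${baseline_cost:,.0f}")
```

Under that assumption the sketch gives 10 million minutes, about 92.6 person-years, and a little over $5.5 million, in line with the rounded $5 million figure. Changing `minutes_per_item` shows how sensitive the whole estimate is to that single input.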
So if you watch the right-hand side of your screen, you'll see the orange numbers come up. 150,000 NSNs on this as opposed to originally 2 million NSNs. Again, I don't know about you. I'd rather work with 150,000 than 2 million any day, which takes our total time to clean down to 750,000 minutes from the 10 million minutes that we started out on. And 7 person years in order to do this, at a cost of only $420,000. So there's my $5 million savings right there. And also, of course, a real question for everybody. Oh, and I'm so sorry. I'm going to go back a slide. The real question for everybody when I get done with these slides is to ask them, what's the most important number on the slide here? And the answer is five minutes. Everybody believes that you can clean any of this data in five minutes. You're doing great. So if we, I don't know, double it. Yeah, exactly. We can't get to that spot. So there's $10 million. We double it. $15 million. Turns out it's about an hour per item to actually do this here. So the number is astronomically larger. So it's always a good idea to have somebody in the crowd actually do that particular piece. So let's take a look at why we need to do this in a way that's distinct from IT. And the first reason is simply because IT is still working on this. These are the Standish Group numbers from 1994 to 2018. You can see there is steady progress. However, still one third of all of our projects are challenged on speed, functionality, or price levels. And I don't know about you guys, but if my dentist came to me and said, I've got a 30% chance of pulling that tooth correctly out of your head, I would say find me another dentist as quickly as I could. Now, the reason I show this other chart here from the CHAOS report is because the only projects that succeed more than they fail are smaller. And I need to show you that because that's what Agile is about, right? Agile is about making things more manageable. 
And if you can do that with Agile, that's great, but you can see that's pointing us in the opposite direction of data programs, which cannot be done in that Agile fashion in order to come up with really good mistakes around it. Standish gives you five cards of a winning hand in order to do this. The project needs to be small. Well, projects should not be allowed to begin until data requirements for the entire project are verified, because if you discover your data requirements are messed up in the middle of your software development, you're going to have a bigger problem. Same thing with their number two piece. The project owner must be highly skilled. Few in IT or management have the requisite data skills and knowledge in order to do this. The process must be Agile. And again, they're making that as an Agile comment, not the Agile piece, but nevertheless, the Agile product is still very, very useful. However, it's a construction technique, and data requires much more planning than construction because we don't tend to build lots of new data sets for people. We tend to use our existing data sets and merge new things in with them. The Agile team must be skilled. Sure, no problem, but nobody who trains on Agile gets data skills from it. And finally, the entire organization must be skilled with high emotional maturity. Obviously, few organizations are. So just like the last slide, this shows that trying to do better IT projects should not be done. I can't see the comments that are coming, but I did catch one that just sort of blipped by me. And the question was, does Agile collect requirements? And the answer is yes. That's why you have the users in the room to do the requirements right there with them on this. All right. So let's talk about enforced sequencing. This is an important aspect of this. It is really critical that you do this. Now, this is the actual place where I'm talking to you from. This is my house in Montpelier, several years back. Right. 
And the several years back piece of this is that I'm a horse husband. And so part of the deal with my wife was we were going to build a barn so we could enjoy our horses. Interestingly, though, these pictures also document the passing of the foundation inspection because I borrowed money to build this barn from a bank. And the bank correctly said, you're going to not, we're going to give you exactly this much money to build the foundation. And before we go any further, you are now going to have to get a foundation inspection to prove that you have a good foundation onto which you will build a good barn. If I built a good barn on a poor foundation, absolutely everybody knows that my horses would come first and the bank would come second in order to do that. Again, and for sequencing here as well, Maslow's hierarchy of needs is kind of like this and I'll tie that together on the next slide. But Maslow, you may remember from school said, you know, if you have food, clothing and shelter needs that are unmet, then you will never be safe. So physiological safety needs to be there before safety can be there. And safety needs to be there before love and belonging because if you're not safe, you can't be part of something that's bigger than yourself. And if you don't have an association with something that's bigger than yourself, it's hard to know how you fit into that picture. That's your self-esteem that comes up. And only when your self-esteem is taken care of can you do what everybody wants you to do, which is self-actualized in order to do this. And I said this is very much like data because everything that we talk about in data and that is public in data and that is advertised in data, all these solutions. If I was going to bother to update this slide, I would see Bitcoin and blockchain in here as well. These are simply technologies. I've already told you these are construction techniques that need to occur. 
They are the self-actualization, however, part of the Maslow hierarchy and represent just the tip of the data iceberg. Those of us that have been in the business for a long time know that if you build those technologies on top of good organizational capabilities, you have a much better chance of actually succeeding in this area. In fact, the second most frequent question that I get, first most frequent question I get on the phone, is, you know, Kim, you help me build a data architecture. Well, you already have one, so we don't need to build you one. You may need to understand what your data architecture is. And same thing here. Can you do this faster? Yes, absolutely. I can speed up. And if I speed up, it will take longer. It will cost you more. It will deliver less. And it will present greater risk to your organization than if you didn't. So these are two forced sequence. This is an unenforced sequence. The unenforced sequence is that people like to do it, but they tend to buy technology before they have capabilities. So we need to make sure we get those two pieces working correctly. One last example here on this, which is that most organizations start out, they don't have a formal focus on data. However, if you either improve operations or innovate, and by the way, those are the only two data strategies that do exist. Make your existing stuff faster or do something new. And then just a matter of what those details, of course, we don't want to do this. We don't want people to not focus on their data. So we might look and say, goodness, in the first quadrant to the right, we may have, I'll do that over again here just to make sure everybody can follow. The first quadrant over here on the right is really an efficiency and effectiveness quadrant. So everybody understands that Walmart is widely known for being efficient and effective in that area. So that's a focus on improving operations. 
If we pretend that Apple innovates on here, then Apple comes up into that innovation play. And I just want you to just stop and think for a minute here. I want you to take Johnny Ive, who's the erudite British guy that used to come on and talk about the new iPhone. Oh, it's going to be so beautiful. And I can't even begin to make an imitation of him. He's wonderful at it, but talking to be cheap, right, which is really what Walmart does. And it's like, no, that's a disconnect. It does not work. And similarly, I want you to take the Walmart guys who are really good at efficiency and effectiveness and tell them to be creative. And it's not in their DNA. Now, of course, trying to do both at once is even more likely to end up in failure than not in failure. Only one in 10 organizations is even trying to do this. So if your organization is even thinking about it, you're in a very rarefied group. And the key here, obviously, for most organizations is that you can start off by doing increased effectiveness and efficiency and use the money that you save from that to fund your super data initiatives from there. Let's take a look at a data leadership agenda. Most of them are pretty straightforward. In our book, we did three pieces. The idea of a data inventory, developing a first version of a strategy and starting to do some monetization work around that. The data inventory in and of itself is a big problem though. First of all, a CEO talking to a new chief data officer or data leader might say, can you finish it for me by Friday? Well, guess what? We're going to set a data. It's not a project. And I know of no organizations in the entire history of the world who have ever completed a data inventory. So it's critical that you reframe the question here to say how rapidly can we achieve the required capabilities? What sort of preexisting classification frameworks can we use to jumpstart the process? And how often do we have to reassess those claims as we go along the process? 
I have included it in here. This is the safest thing for you guys to do as far as the data inventory goes. It is not part of our topic, but you have that as a slide to take away on this. Let's start now to look at how we're going to get better at this. The real key is starting off with leadership. And I hate to say this, but most CDOs don't work out so well the first time. What you see in general around this is that you need an organizational change agent. And so we've seen lots and lots of organizations that have tried to do this. They want to come in and talk and they don't have the necessary persuasion skills to help people to understand what it is that they really should be doing in order to do this. And it involves much more change management. Again, remember, go back: five or 10% is technology. 80 to 90% is people and process issues, and that is the key. So we tend to say that the first chief data officer in most organizations is going to take one for the team. They're going to do some things and get stomped on really badly. And then the second person is going to come in and say, whew, I'm glad you did that for me, because this will actually be much better. Now, as we're paving the way, now we can start to do what I like to focus on, which are what we call lighthouse projects. And again, many of you use this term as well. There are many things that need to be improved in our data environment. In fact, if you stop and think about it, our teaching in these areas, even at the DataVersity level, is challenging. Because I like to say most organizations need to get back to zero before they can start making progress. The problem is we've taught everybody how to make progress starting at zero and going forward. We have not done as much as we need to in order to help people get the rest of that process done to get us back to zero. Well, this is one of the best ways to do it. It's a focus on three things. 
What is going to help some tangible aspect of our organizational strategy? What are the existing baseline measurements that you're using to make sure things are happening in the right way in your organization? I've worked with companies that have worked on one strategic measure. I've worked on some that work on a couple and I've worked with some that actually like to maintain thousands of them. Thousands is much more difficult, but pick something here that is going to clearly show an improvement in the organizational strategy. Also then, look at the opportunities to improve data that's being used by the business. And when I say that, of course, the business is using all your data. But let's find something that's going to make a difference to some business people. That's going to help business people understand a little bit more about why doing what we're doing will actually make their lives better because we're tied it to these strategic objectives. And the third place for this is to look over your team. What sort of data skills does your team have right at the moment and where do you need more work? I wouldn't say like we used to do in the old days, write it in the language that you've never programmed in, you know, because that's fun. In the data world, we actually do want to get practiced on some of those other pie wedges that we talked about. And if you look carefully, you'll find a really bright spot right there in the center that will help you to show you what you need to do to start working in this area. Now, let's do a little bit of math on this. Again, very, very easy. I've been working for Virginia Commonwealth University on and off since 1977. So you can imagine how long a time that is. And let's just take a number and say that it takes about $5 million to run a faculty member through a 30 year career. Right. So that's a good investment and I'm real pleased with it. 
I'm able to show at VCU that my grants and funded research and student project salary is well over $20 million. So it's fairly easy for them to look and say, huh, okay, I guess we're positive with Peter on this. And in my collaborations, which haven't cost the university anything at all. I've got a documented $1.5 billion in savings in this area. So that's why people come to us is because we can help them to save money. But the problem is having not taught them engineering and architecture related types of skills. It's an issue. So architecture is used to create and build systems that are too complex to be done by engineering analysis only and that technical details in an architecture diagram are the exception rather than the rule. The engineering, of course, develops the technical designs and builds the things that we need to have so that we can then employ things like manufacturers, building contractors, et cetera, et cetera in the process. However, lacking these concepts makes it very difficult for many managers to understand what we're talking about. So part of the education process is just showing them these little skills. And my favorite example on this is I've got a blender. And if I've got a blender, it works really well for a family, maybe of Thanksgiving dinner or a big breakfast or something like that. But you're certainly not going to feed 40 troops on it. You're certainly not going to do day after day with that same blender. You need a different kind of engineering skills. I've moved into sort of a military context on this. Let me tell you one of the more important aspects of what we're doing is storytelling. And you need to be able to be good at storytelling because storytelling is how people understand what you're doing. So here's a story about military suicide. Now, if you're not aware of this, unfortunately, even in today's environment, more military personnel are suffering harm from themselves than they are from the bad guys. 
That is not a good state of affairs. And I happened to be working at the Pentagon when this was coming down. So my company had a contract. So they said, it's your problem now, Peter. So we started working with what I called my Council of Colonels. And the Council of Colonels, we ended up developing this 30 by 30 matrix. What we were trying to do is to say, you know, Colonel X who has this data that's been done for this purpose. So you could put a check mark in row 12 column 10 and a check mark in row 18 column 25. And that's how you can legally use my data to save soldiers lives. Well, I had a chip that I could ask the Secretary of the Army to come to one of these meetings and he did. And when he came to the meeting, he said very simply, I know you guys are trying to do the right thing and remain compliant with all the guidance that we've given. How about if we call this all my data? And I'm authorizing all of you to use my data to save my soldiers lives. Is there any question? And of course there wasn't. You don't get very many opportunities to do something like that, but it was a huge important step. And I'm going to tell you all another story too, which is that unfortunately I've told that story to more than 100 corporate CEOs and not a single one of them has the guts to do that. It's really quite pathetic. I, in fact, even wrote this up in a book, but, you know, saying that somebody owns this data and somebody owns this data and we have to be careful for them. I've seen caused time and delays to absolutely no end in in this scenario. So, absolutely critical that you've got the ability to tell stories and I've told that story and I've told everybody else to continue to tell that story. And I know it gets around because I've gone to places and people say, oh yeah, I heard that story for you in order to do this. Did we save soldiers lives? Yes. 
And most importantly, we focused on the correct goal, which is saving soldiers' lives as opposed to trying to remain compliant in here on this. Let me change to another component of this as well. So, you may or may not be able to hear the music in the background here, but believe it or not, it's Bruce Springsteen. And the story that goes with this is that Bruce Springsteen was flying his band into Australia and he wanted to do something that was very nice for the Australians. And one of the things he said was, well, why don't we play a song that's an Australian song? Again, I don't think you can hear this, but. So the key for this: this song, Stayin' Alive, is the song that I hated the most in 1977 when I graduated from high school. I couldn't stand this song because this song represented disco, and I'm sure you're saying, what does this have to do with data? Well, in order to do data properly, you need two things. You need the good material. Yes, absolutely. By the way, Stayin' Alive was not the first song that the Bee Gees had written. It was actually about the 140th song that the Bee Gees had written. So they've had lots of time to practice. Sounds good, right? And that's really the second part of this. You've got to be able to practice, because if you start out and try to write the world's best pop song on the very first try, I'm sorry, but it's going to end up in failure. The only way you get to Carnegie Hall is by practicing, practicing, practicing. So let's try to pull it all together with a takeaway, and I'm going to use a specific takeaway. This is out of the monetizing book and was contributed by somebody who actually wanted to contribute this and was able to get permission from their bank to tell the story. So here's the story. Please add a new field. The new field E is a combination of A plus B divided by C. All right, no problem. Now A is sourced from one of six systems. 
B is another customer record sourced from one of the six systems and C is data provided by a vendor. Okay, so we've got some pretty good things. Let's take a look at the data challenges. Some of the loans were missing from field A. Some of the other stuff was missing from field B and some of the stuff was missing from field C. So when they went in to try to populate the system and do all of this work, they created routines to grab the data from the disparate loan systems. No problem. And we're finally ready to go with user acceptance testing. I think Kevin's been talking about that one as well. So the process manager, in this case Linda, who's the person that did this, she says, how many loans were we able to calculate that new field on? And the response was 43%. That just doesn't sound like the right number, does it? Why only 43%? Well, we still have missing and bad data. We've resolved everything we could, but we don't have the required fields. We know it sounds low, but guess what? All the requirements are satisfied. And Linda, of course, as the business owner, says that may be true that all the requirements are satisfied, but if you're only going to populate that field for 43%, I think we're undercounting here. And so she authorized an extension on the project in order to get going and provide detailed metrics, which is not typical of the developers. Again, it's a data problem now, not a software problem. And they started to work on this with Linda, who, like I said, did not agree that 43% was actually acceptable. So when they used alternate fields when A, B or C were blank, they discovered different values. They were able to take that number from 43% to 88%, more than doubling that particular piece. They used every scenario that was unable to account for it. And when you go back and look at what actually happened, well, we're measuring success differently. 
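The idea in this story, filling field E from alternate fields when A, B, or C is blank and then measuring how many loans get a value, can be sketched roughly as follows. Everything specific here is an assumption for illustration: the formula is read as (A + B) / C, the `_alt` field names and the four sample loans are made up, and the resulting percentages are not the bank's 43% and 88%.

```python
from typing import Optional


def first_present(*candidates):
    """Return the first candidate that is not None, or None if all are missing."""
    for value in candidates:
        if value is not None:
            return value
    return None


def compute_e(loan: dict, use_alternates: bool) -> Optional[float]:
    """Compute E = (A + B) / C, optionally falling back to alternate fields.

    ASSUMPTION: the talk's "A plus B divided by C" is read as (A + B) / C.
    The "*_alt" keys are hypothetical names for the alternate source fields.
    """
    a, b, c = loan.get("a"), loan.get("b"), loan.get("c")
    if use_alternates:
        a = first_present(a, loan.get("a_alt"))
        b = first_present(b, loan.get("b_alt"))
        c = first_present(c, loan.get("c_alt"))
    if a is None or b is None or c is None or c == 0:
        return None  # E cannot be populated for this loan
    return (a + b) / c


def coverage(loans: list, use_alternates: bool) -> float:
    """Fraction of loans for which E can be populated."""
    populated = sum(1 for loan in loans if compute_e(loan, use_alternates) is not None)
    return populated / len(loans)


# Made-up sample records, only to show the before/after measurement.
loans = [
    {"a": 100.0, "b": 20.0, "c": 4.0},
    {"a": None, "a_alt": 90.0, "b": 10.0, "c": 5.0},
    {"a": 50.0, "b": None, "b_alt": 5.0, "c": None, "c_alt": 11.0},
    {"a": None, "b": 1.0, "c": 2.0},
]

print(f"required fields only: {coverage(loans, use_alternates=False):.0%}")
print(f"with alternate fields: {coverage(loans, use_alternates=True):.0%}")
```

The point of the sketch is the measurement, not the formula: the same routine can satisfy every written requirement and still leave most records unpopulated, and that only becomes visible when coverage of the data itself is reported.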
Right, what we're doing here is the same project, same process, but in this case, different measures for success. By asking if our data was correct, they were valuing the data more than they valued on time and within budget. And by valuing the data correct, it was more than process. By auditing the data rather than auditing the project documents, this was worth $50 million annually. And Linda is super happy to chat with you guys about these numbers. It is a wonderful story and just one example of how we can do this. So the last one here was kind of, you know, we're already in the middle of it and we want to just change what we're doing. What we really need to do is take a step backwards here. So let's just take a quick review as we head up to our question and answer session on this. Your business case to express data improvements as a business outcome must show some form of leverage. If in Linda's case, she could double the number of loans that she was calculating this new field for, that gave her a $50 million upside. And I'm pretty sure the project did not cost $50 million in order to do this. You've got to be able to say, we're leveraging this data in order to, and again, you're the ones that can fill in the blanks here because you are the data experts for your local organizations here. And this refocusing around business outcomes has to have enough information in the challenge that says subscriptions are going to go up, expenses are going to go down, whatever it is you're trying to do, and then you have to show them that as a result of the data because this is a two-stage process. You're simultaneously trying to fix the problem, but also trying to convince people that if they approach these types of problems with a more programmatic approach, you can take a team of people who can build skills in this area and get good at it. This is the reason for your data program. And what are we trying to do? Well, again, IT is not known as a success. 
Our numbers show that one-third of all IT projects succeed with full functionality within the cost and time schedules that they were originally taken on for. The sequencing that I'm describing here is mandatory. And the fact that we've been trying to avoid doing the sequencing, trying to figure out user data requirements in the middle of an agile sprint, as somebody was noting in the thread there, absolutely is just crazy. If your data requirements are incorrect on your agile software sprint, you need to reach up, pull the ripcord, stop the bus, whatever analogy you want to use, and move into a different piece that you can code with more correct stuff. Even our CDO leadership, our data leadership agendas, are problematic in this. Again, we've got to have leadership. We've got to have a program. You've got to make the math simple. If you don't make the math simple, nobody's going to get it at all. You've got to get people that understand these engineering and architectural types. But you've got to figure out a way to do it in storytelling, and then you've got to practice the storytelling pieces as we go. So we're right back at the top of the hour here. I have a super offer for you guys. Any of the books that we've talked about here, Data Strategy, CDO, or Monetizing, are 20% off if you go directly to the bookstore that I've got on the loan next. And with that, we'll open it back up for some questions. Shannon. Peter, thank you so much as always for another fantastic presentation. Just to answer the most commonly asked questions, just a note, I will be sending a follow-up email for this webinar by end of day Thursday with links to the slides and the recording and anything else requested. And then if you have questions for Peter, feel free to submit them in the bottom right-hand corner in the Q&A portion. 
So diving in here, Peter, if you want to quantify, if you want to quantify the benefit, if I spent X amount of dollars to get the amount of benefit, how do you quantify the benefit? So one of the easiest things to measure, and I didn't put this book in the presentation, but I'll go ahead and give a shout out to it. Now, my book for inspiration in this area is a book called How to Measure Anything by a guy named Douglas Hubbard. So even though I'm sitting here flugging my books, I have actually sold way more of Doug Hubbard's books than on my own because he came up with that particular topic. Now, let's just take an example of how that would be the case. I did one example where I was working with an organization that was spending lots of time on clerical tasks, and they challenged me on a clerical task to say, you can't save $10 million in a clerical task. And I said, oh, okay, I love challenges. And since you're paying me and your checks clear, I'll keep trying that challenge, right? So the challenge around that was to look at how many clerks were doing duplicative work in these payroll clerk functions. And we looked at it across as an entire organization across a multi jurisdictional set of constructs in there. So long story short, a lot of people. And in order to do this, we were able to save $10 million by just eliminating a requirement that somebody do duplicative tasks that were there on this. The key is to look for people who are doing things and that they shouldn't need to do. And I'll give you another example in there too. One of my favorite examples of an oopsie in this area was a regional hospital chain whose director had looked at the admissions data and said, goodness, we're doing lots and lots of knee surgery. So, you know, we should invest more in knee surgery to show people that we're going to become the knee surgery capital of the Midwest. 
And then somebody finally pointed out to them that knee surgery was the default admission code in this particular hospital. And while the data said the right thing, the clerks who were inputting the data at the very front were not paying the slightest bit of attention because their goal was speed and not quality in this. So, if we take those clerks and say, do you really want to take the time to check a person into the emergency room? Excuse me, emergency department, we're supposed to call them. Take that person and delay them by 10 minutes while we ask them a bunch of questions. No, in this case it was not. So they did it the other way. I've sort of glossed over a couple of things here, but it is very, very, I don't say easy, but it's easy to get good at it by looking at what people do. So, for example, I can remember one, one other scenario we had where we justified an entire piece because two vp's kept coming into a high level executive meeting and one would say the sky is blue and the other would say the sky is green. You know, and they both believed it and they both had the right data. And so to avoid embarrassment, they were, they were told to have another meeting prior to that meeting and, you know, get the story straight before they came into the second meeting. And we just put the cost of that meeting actually ran into almost $100,000 if it was done every week in this case for a year on this. So, how is it that you go about measuring people's time? Well, we've done time and motion studies forever. And if people are doing something and then they're having to repeat the process where they're doing something they shouldn't do, you can look at these and make some numbers. You don't even have to ask people how much they make. You can say an entry level clerk or as I showed on my spreadsheet in there. In this case, a low level contractor at the Defense Department was $60,000 a year fully loaded on that particular example. Find things. 
If you can observe it, it's countable. This is Douglas Hubbard talking. If it's observable, it's countable. And if you can observe it and count it, you can put a dollar value on it. I'll be happy to take questions on that because that's probably a little bit more detailed than we're going to get into in the presentation, but sure. Hey, thank you for the question. All right. So how to engage if there's a lack of interest and understanding. We hear this question a lot. Fantastic question, right? Why is it that management should give us the time of day? And I'm going to tell a story here from Michael Gorman, who's just one of the better storytellers in this area. He worked for the MITRE Corporation for many years, and he was working on an internal project and he had an executive that he needed to get some time with, and I'll tell you guys what the specific question was. Michael was doing a model and he had to find out whether a project in the MITRE Corporation could be owned by multiple departments or by only one department. Those of you with a data background know that's an extremely important question that needs to be answered before you build any software to try and address that particular problem. And the executive wouldn't give Michael the time of day. I don't concern myself with technical issues, right? Well, guess what? Michael went and found the programmer who had actually built the database and asked the programmer, and she said, yes, of course I can do it, because I knew as soon as I made it so that one project can be owned by only one department, somebody would want an exception. So I built a backdoor into it and created that capability natively in the database itself. So the next time Michael got on the elevator, the executive looked over at him and said, hey, aren't you going to ask me to see you? And he said, no, I found somebody who's more important than you. More important than me?
Who could be more important than me? He said, well, it's the person who's programming the database. He said, well, why is that? He said, because I got the answer. You can have a theory all you want, but whatever's in the code is the actual reality of the situation. Believe me, that executive was interested from that point onwards in what was going on. Now that's a little social engineering technique. The shorter answer to your question is that if you don't tie what you're doing in your organization or in your project to something going on that the executive cares about, absolutely nothing will happen, because executives are busy people. They've got five minutes to look at this and 10 minutes to look at something else. I'm going to pull up this slide that I did on there. Hang on, let me get it so you can see it. Let's see, we need to make that smaller. There we go. On this slide here, the key is the X's: data things that are happening and organizational things that are happening. That's the area we've got to focus on. We have to find what those links are, and good business analysts can find those links if you ask them to do it. But if you don't ask them to do it, they won't. And it's not part of our training. It's not part of anything that we do in data that actually shows these things. Shannon, I hope my screen is showing correctly, right? It is. We see the implementation. I mean, we see that. It's not a big deal. So, again, great question. I hope I answered it. If you don't make it relevant, if you don't connect it to what's happening in the corner office, they're not going to pay attention to it. And quite frankly, they shouldn't. So, it's up to us to learn that one. And I see a lot of great questions that have come through the chat. And if you have additional questions for Peter, feel free to submit them in the Q&A portion as well. It makes it a little easier for me to find.
So, looking through here, Peter, how do we measure how much better organized data is than unorganized data? That's what my stakeholders need. Oh, that's a great question. So, it's probably unlikely that anybody's going to hand people loose pages of a book and ask, what do you like better: putting the book together and then actually reading it, or just reading it? Most people are going to tell you reading. So, the question is, how can organizations put a value on people's time? And the easy answer is, we pay them. Right. So, again, if I'm making $10 an hour and it takes me an hour to do something, if I can do more things in that $10 expenditure, we're going to have a much better experience than if I'm, you know, stuck at the slowest rate in order to do this. Shannon, read the question for me again. I think I forgot an aspect of it. Oh, you didn't. The question was, how do we measure how much better organized data is than unorganized data? The stakeholders really need to understand that. Yeah. There are a couple of things that we can look at. And again, I like to break it up into people, process, and technology. Let's just start with the technology, which is the easy one. So, if I've got data that's poorly organized, it means that my queries may be slow. And if I reorganize the data, I can actually make those queries faster. I did reference that on one piece. There was literally a company I worked at that had a transaction that ran a billion times a day. Any improvement to it added up eventually to something worthwhile. The question is, what are you spending now? Do you guys remember back about 20 years ago when people were trying to bring Macintoshes into Windows organizations and everybody would go, oh, gosh, we can't get our IT people to support a Mac? It's a mixed environment. They'll have to learn two sets of skills.
And interestingly, NASA of all groups went off and did a super academically refereed study that proved beyond a shadow of a doubt that every Macintosh you added to your environment reduced the total cost of ownership of your computing environment by a very significant amount of money. And so organizations, as you probably have noticed, have said, wow, you know, let's start bringing these other pieces in. Well, the only way they could get those numbers is if they got to the total cost of ownership. You'll see that abbreviated as TCO, the total cost of ownership. And if we start to add those things up, we still won't have the right number, but at least we'll have something written down that we can start to argue about. And that's the important part, to start these dialogues going around this, because if we can't convince people that doing something better with data is going to produce better results for us, we're going to continually run into the situation that we have, that IT projects will get priority because data comes out of IT and that's what everybody thinks produces the data. Again, thank you, Shannon. That was nice. Indeed. So scrolling through here, again, for some additional questions. How do you distinguish process and practice? I'm not sure. Yeah, I'm not sure. So in our recent data strategy redux, the quote-unquote golden triangle is people, process, and practice, and then... I'm reading the wrong question here. I do that. Yeah, I'm trying to go through the chat here. So there was a question to everybody, but Peter, you may be able to join me in here as well. What methods do you use to define the business value of doing something? I would say not so much a method, I guess. Although I like the word method better than methodology. The word methodology actually means the study of methods. So if somebody comes to you and says they're going to give you a methodology, you probably have a sleep problem. You'll sleep better.
I'm just going to say, listen, I know that's an oversimplification. But the cool part about being able to go visit as many organizations as I've done in my career is that as soon as you arrive, you start to hear things that tell you what's going on. For example, I arrived at one organization for one of my immersions that you referenced in the introduction, Shannon. This is a company where I'm going to come in and I'm going to be part of the team for a couple of years, working on some very, very interesting stuff, working shoulder to shoulder next to everybody else. And this organization spelled my name wrong when I came in the very first time. I just sort of, you know, thought, wow, they didn't even bother to double-check the name. And when I said, hey, it's not correct, they said, well, that's just tough. It's in the system now and we can't take it back out. I said, oh, so everybody's going to look me up by the wrong name and, you know, blah, blah, blah, blah, blah around this. That was an indication that that organization had some serious master data problems, and guess what, they did indeed. If we're looking at, let's make it very simple, all right, I've got a five-person data team and each of the five people on the data team gets paid $100,000. Okay, just very simple. So that's a $500,000 expense that the organization is going to invest in data. And I feel it is incumbent on the data team to show every year that they have returned at least $500,000 plus back to the organization. Now, the first year they do this, they're going to measure some things and they'll get some things right. They'll get some things wrong. And in the process of doing that, they'll get some feedback, and that feedback will help them to make a better model so that they can eventually make these spreadsheets. And that will eventually start to come along and show people what it is that we're doing.
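The five-person team example can be sketched as a small ledger that compares what the team costs with what it can show it returned. This Python is a minimal illustration, not a prescribed model: the $100,000 per person and the five-person head count come from the talk, while the `Benefit` class, the function names, and every benefit entry in the usage example are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Benefit:
    description: str
    annual_value: float  # dollars per year, measured or estimated


def team_cost(headcount: int, cost_per_person: float) -> float:
    """Annual cost of the data team."""
    return headcount * cost_per_person


def net_return(benefits: list[Benefit], cost: float) -> float:
    """Documented benefits minus team cost; the goal is to stay above zero."""
    return sum(b.annual_value for b in benefits) - cost


if __name__ == "__main__":
    cost = team_cost(5, 100_000)  # the $500,000 example from the talk
    # Hypothetical entries; real ones would come from measured observations.
    benefits = [
        Benefit("Hypothetical: duplicate data entry removed", 320_000),
        Benefit("Hypothetical: weekly reconciliation meeting retired", 95_000),
        Benefit("Hypothetical: high-volume query made faster", 140_000),
    ]
    print(cost, net_return(benefits, cost))
```

The first year's entries will be rough. Writing them down is what lets the feedback improve the model the following year.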
So, I mean, again, a query that runs a billion times in a day, that's a very fast query and it runs an awful lot. Anything that we do to speed that particular process up adds up across all of those runs. Let me add one more piece to this. Doug Hubbard's book was certainly one inspiration for this. Another one is Eliyahu Goldratt's The Goal. Those of you that have read anything on the data strategy area know that we base the data strategy book around the theory of constraints, which is very simply that there's one thing in your system that is blocking achievement of strategic objectives. Find it and fix it and move on to the next one. And again, in two seconds there, I've just given you the essence of what we try to do with these things. And a comment came in while you were in the middle of your answer there. So, you know, the comment is, exactly, training to know the business. And should it be? What are ways for data and IT staff to discover and learn organizational strategy? Ooh, so data governance provides a wonderful excuse if you're working on that. One of the things that I require all the people I work with as data stewards to know is what is upstream from them and what is downstream from them. So most of the time as a data steward you learn what's in front of you. But if you also learn where it comes from and where it's going, now your data stewards have much broader perspectives around all of these various bits and pieces. And this is where you can come into play and take a look at this. I'll tell you a very specific example. One organization I worked with had an individual who was very specialized in selling a very specific commodity, you know, think of it like sugar or something like that. And he had his own SQL Server under his desk and he said, you're not going to tell anybody, Peter, that I have this, you know, unauthorized system under my desk. And I was able to say, you don't have to worry.
They know it's there and they know that you're a lot cheaper than whatever it would take to replace you in order to do this. And he said, oh, really? And he went out and he priced the system, looked at, you know, like an Oracle suite and things like that and how much training was involved. And he said, I'm asking for a raise. I'm only charging these guys, you know, X and they're clearly getting a lot more than X value out of me, because the way other organizations get the same information is they don't rely quite as much on expertise like mine, and a lot more steps have to be involved in the process. I love it. So, our DMO, data management office, is a new team in our organization. How do we start? What should be our first priority? How important is it to have a data management tool? So I would say the tool is the last thing that you want to do. And the problem again with the tool is it becomes about technology, and also management looks at it and says, I've paid for the tool, therefore the project is done, right? So I would definitely take it back to right here where we started on this an hour ago, which is: find things that the organization needs to do in order to achieve its strategy. From the big set of those things, pick a subset and take the overlap between that subset and the data that's kind of questioned by the organization. You know, they sort of get it, but, I've never really understood this data. You know, you hear a comment like that. Again, you have to find out whether it's one person or pervasive, but find a pervasive issue that's in the business. So the business will say, you know, things didn't used to work very well, but they work better now. That's what you want to hear. And of course, that third area is, you know, as your new data management organization, where do you have the least practice? When I say practice, I'm talking about the areas of the DMBOK wheel.
Again, if you're new to this, there's one thing that people don't get from the DMBOK wheel, something I would like to have had in there. And I'm not smart enough to figure it out. So maybe this is a challenge I can throw out to you guys. The DMBOK wheel here fails in two areas. They both have to do with data properties. It doesn't show optionalities and it doesn't show dependencies. So wouldn't it be nice to say, although the diagram kind of does, data governance is a part of everything, but that's about as far as it goes. So for example, we might say, you know, it'd be really good if you got metadata management down before you try to build a data warehouse, as I did in my little story here when I was telling you these things. So the practice areas are not necessarily things that you should do because they're on the wheel. They should be things that you do because they improve the organization's ability to achieve its strategy. And that's the key criterion. So in DAMA, even though I was a part of the group that put this together, this is perfect hindsight. I'm coming back and critiquing it and saying it's a great effort. But it would have been even better if we could have figured out some way to say, hmm, maybe there are some things that you should do first. Maybe you shouldn't start your data management project on the most difficult of these challenges that we have around here. Instead, think about it, it works the same for music, right? You don't send your kid off and say, hey, you need to be a Suzuki violinist or, you know, a wonderful pianist. You start out by playing Chopsticks. I'm a terrible person to do that. I apologize for breaking all your eardrums on that, but I hope that makes sense. Let's start off small, get practiced, and then start to get better at things rather than trying to start at the top by buying tons and tons of technology, which we see a lot, and then actually having it not work out well.
The way you get to Carnegie Hall is practice, practice, practice. Thank you for the question. Do you have a general calculation on how much budget a company should be spending on analytics today? No. Gartner keeps track of whether those numbers are going up or down, but they don't keep track of the numbers. So if you know somebody that's got access to the Gartner catalog, like an academic or somebody, you might send them an email and say, hey, could you look that up for me? But the actual dollar amounts Gartner does not keep track of. They do keep track of numbers that say, you know, next year we're expecting spending in these areas to rise based on our surveys. Without tools, how do you perform data lineage? In Excel it's really hard to manage. Again, fair question. One of the things we're starting to see is that a number of companies are now starting to offer these software-as-a-service pieces. So I don't know of a profiling company that's out there, although there may be one coming on the market. I'm going to find out in another month. I've got a colleague that's going to give me a briefing on this. But let me just give you an example. Those of you that know anything about me know that I spend a lot of my time talking about reverse engineering. This is something that we just don't do well. You know, again, reverse engineering is this process of looking at the existing systems or the existing capabilities and trying to figure out what are the good things about it that we should preserve and the bad things that we want to improve. And if you don't do a formal analysis on that, it's very difficult to actually have it done. Now, with reverse engineering, people often want to know what their process architecture is.
In fact, I will go further than that and say it's impossible to implement a successful MDM project, master data management project, without fully understanding the organization's business architecture. The reason I'm telling you this is because most people are like, well, how often do I really need to do a reverse engineering of a business process? The answer is not very often. So you're unlikely to invest in that and get good at it. However, a company called, I think it's QR systems in Europe, has a software-as-a-service portal, and you can take the log files from any of your existing systems and put them in this system. And it will go through and tell you, it looks like this step happens here, and it looks like this. Is it going to give you your process model? No, but it gives you a lot of great information about when things come in, what types of things need to be exceptions, what types of things can be handled normally. And we're looking to the profiling industry to do the same kind of thing. I think there are a couple of offerings that are going to come on the market, because you're right. Spreadsheets are awful and unwieldy. And again, if you keep me talking today, I'll tell you the story of a company that almost went out of business because they screwed up their spreadsheets so badly in a legal case. The question is, how can we start to look at these various capabilities that we need to have without bringing them into the Microsoft Office suite of tools? And don't worry, Microsoft will have a profiling tool sooner or later, so they'll say, don't worry, this is how we do it in Azure, right? But good luck with that one. So, another thing that I've talked to companies about is renting these things. So, even though you may have an organization that just doesn't have something that will really require data, you still need to clean up some data.
That's just, again, that getting to zero rather than starting at zero kind of a thing. I'm starting to see some companies that are coming in and saying, yeah, we're not having much luck, so maybe we'll try to rent some of these capabilities. And the ability to come in with a very nifty little package that's on somebody's laptop that can do some things has proven to be very enticing. So there are a couple of companies that are working on that. But you're right, while a spreadsheet is not ideal to deliver this, it turns out databases are actually pretty good. So let's just take a metadata repository, which is one of my favorite things to help organizations with. If you're building a metadata repository, yes, you could go out and spend lots of money, and there are some very fine products on the market. But what I say is, don't. Instead, build your own. You don't need much more than a couple dozen data elements and a very focused business problem. Again, I'll give you a very specific example here. We were called in to do one on a data warehouse project. And they kept saying, we'll have these requirements done by, you know, it's like a four-month effort to get the requirements done. And so I put in the metadata repository a measure of how many requirements were in the process. I know that sounds like a dumb thing, but we numbered all the requirements and we gave them a graph that ended up looking like some of the other graphs that I showed you in this presentation, up and down, over, under, sideways and things, which told us one thing. They had no idea what they wanted. The requirements were so all over the board that spending any amount of time and money to implement those requirements in the shape they were in would have been just throwing money down the drain, burning it, whatever analogy you want to use. So there's an awful lot you can do. Another place to start out with this is in master data management. Again, master data management solutions are fine.
They work just like the metadata solutions do. But when I walk the halls of your organization and I hear people say things like, you know, couldn't figure out where to stick that new data that we just got, so I stuck it in the MDM because it seemed like the best spot, you know, already they've blown it. Right. And this organization of course had blown it, but you can build your own reference or master data stack very inexpensively using SQL Server. The point here is to use it as if it was the big $10 million solution that you had purchased. But just do it as a side project. And what you'll learn about how to handle master data is so important. I've got a health care company now that's been on it for three years. What they built actually lasted for three years. And now they're trying to graduate to the next phase, and they figure after three years of doing this, they're ready for a conversation with the vendors. They feel they'll be able to actually have a good one. But if you're starting to talk to vendors and your knowledge is at an entry level and their knowledge is at an expert level, you're not going to have a very good conversation there. So try it yourself before you go out and buy one. You'd be surprised at how powerful that technique actually is. All right, we've got just a few minutes left here. I think we've got time for one more quick question. You know, you've kind of touched on this in general. But how do you demonstrate the value specifically of data governance to everybody, especially when there are not that many problems with the data quality? Fantastic. So governance is oftentimes about compliance. And I've been fortunate to work in organizations that have big compliance problems and they use governance as their solution to solve them. So literally it was, how many lawyers did they lay off?
And that was such an easy thing for them to do because rather than defending against, you know, we keep doing bad behavior and it doesn't work. That's not good, right? Now we can actually come back and say, hey, we didn't have this many lawsuits that we had to defend against last year. Find your business problem. Just like I'm urging the data stewards to look at what's upstream and what's downstream of the data that they're responsible for, that they are stewards of, we need organizations to look similarly into this holistic process. You may not have it easy where you can simply say, hey, we don't need as many lawyers or we don't get sued as many times or whatever it is. But there are things that are around that. And if you're having trouble, by all means give us a call, because we're happy to chat on some of these things. Again, it's very easy to reach out to me, and you can schedule time with me just by clicking a button. So happy to chat. Thank you so much. It's been a great, great session. Thank you, Peter, as always, and thanks to all of our attendees who are so engaged in everything we do. We just love it. Thanks for all the great questions. So just a reminder, I will send a follow-up email by end of day Thursday with links to the slides and links to the recording, as well as the other requests, like the books, from everybody. So thanks, everybody. I hope you all have a great day and stay safe out there. Thanks, Peter. Thank you, Shannon. Cheers, everybody.