Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager for DATAVERSITY. I'd like to thank you for joining today's DATAVERSITY webinar, Data Management Best Practices. It is the latest installment in a monthly series called Data-Ed Online with Dr. Peter Aiken. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A panel in the bottom right-hand corner of your screen, or if you'd like to tweet, we encourage you to share highlights or questions on Twitter using the hashtag #DataEd. And if you'd like to chat with us or with each other, we certainly encourage you to do so; just click the chat icon in the bottom right-hand corner for that feature. And to answer the most commonly asked questions: as always, we will send a follow-up email to all registrants within two business days containing links to the slides. And yes, we are recording, and we will likewise send a link to the recording of the session, as well as any additional information requested throughout the webinar. Now let me introduce our speaker for today, Dr. Peter Aiken. Peter is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide. He has more than 30 years of experience and has received many awards for his outstanding contributions to the profession. He has written dozens of articles and 11 books, the most recent of which is Your Data Strategy. Peter has worked with more than 500 data management practices in 20 countries and is consistently named a top data management expert. Some of the most important and largest organizations in the world have sought out his expertise. Peter has spent multi-year immersions with groups as diverse as the U.S. Department of Defense, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia, and Walmart.
And with that, let me turn everything over to Peter to get today's webinar started. Hello and welcome. Thank you for inviting me back, as always, Shannon. It's a pleasure to join everybody on this sunny, 60-degree afternoon here in rural Virginia; what a crazy thing for the middle of February. Just a quick note that some of the materials I'm talking about come from these books, and we do have some special event pricing available to you. But what we're really going to be concentrating on today are what are typically called data management best practices. And the challenge with that is that it's very difficult, in many cases, to determine objectively what those are. So we're going to start off with things that we can, in fact, determine. And what I'd like you to think of, in spite of the fact that the title is, again, Data Management Best Practices, is that this is how you practice data management better, and this is from an organizational perspective. So hopefully that will resonate with everybody and we can dive in and get started. I'd like to start out by just reminding all of us that there are four things about data that are currently true. One, the volume is increasing faster than our ability to process it. Two, the costs of overhead and exchange are measurably sapping organizational and individual resources; in fact, we can state authoritatively that 20 to 40% of all IT costs are tied up in unnecessary data manipulation practices. Three, reliance on our existing approaches has not materially addressed this gap, and that means we have some challenges. And finally, we have to recognize that there is an industry whose sole purpose is to extract these data items from our citizens and use them to make money. That last part will become more apparent as we increasingly require society to become more literate around all of these topics.
So our program for today: we're going to start off with some motivation, and we should be vastly unsatisfied with the current state because we are simply not making progress. Then it is worthwhile to dive back into history and acknowledge our roots, and some specific individuals along the way: where did these practices come from? We'll talk about how the industry has been pushing and yearning for these, and in fact, in some cases now much of this guidance has almost been turned into law. We'll talk specifically about the two components of this: the Data Management Maturity model, which came out of Carnegie Mellon University originally, and the DMBOK, the body of knowledge for data management. And we'll talk about how to apply both of these together, and understand that we are still driven by a weak-link-in-the-chain architecture. That's important because it means you can't do data partially; you've got to be in all the way. We'll do just a tiny, tiny bit on strategy and talk about the necessary three-legged stool for you to successfully do data practices better in your organization. It involves people, process, and technology. And guess where most of the emphasis has been? Technology. Where do managers say they need to put the effort? On the people and process side. Just how much? Hang on, we'll get to that. And then we'll finish off with some time-tested wisdom from somebody who wanted to get to Carnegie Hall. When we finish up, we'll talk about what's perhaps up next for us as a profession. And then, of course, the part that I really look forward to is the Q&A, because you guys are such a challenging bunch to work with. So let's start off with: how literate are we, in fact? It turns out the federal government has been testing this for a while.
Unfortunately, they changed names in the middle of the testing process. So if you're looking back in history, you're looking for the National Assessment of Adult Literacy. Looking forward, the versions from this point on are going to be called the Program for the International Assessment of Adult Competencies. The key is that they measure, and have been measuring for a while, three areas of competency. Literacy in general: the written word, et cetera. Numeracy: are we any good at math, and can we understand that one and one doesn't make 11, as I have on the cover there, but that one and one actually makes two. And finally, digital problem solving. As surrogates for data literacy, these are absolutely good measures. And more importantly, there has been no significant improvement during a time in history when data has increased at an increasing rate. This is generally not good for people. Another bit of good work has been done by something called the Data Literacy Project, which is a joint venture of Qlik and Accenture. They've done some very good measurements around the world and found that overall only 14% of people claim to have a good understanding of data. And interestingly, of the young people running around with these supercomputers in their pockets, only one fifth classify themselves as being data literate. What this means is that our future employees are incredibly underprepared for what we're calling data-driven workplaces. Only 8% of companies have made changes in the way that data is used; so if you just start doing something, you're actually ahead of the curve already. But 90% of them, of course, feel that data is transforming the way their business operates. Here's a really interesting statistic about decision makers: only one in three business decision makers feel they can confidently work with data.
Again, same thing: one in three (we don't know if it's the same one in three) are able to create measurable value out of all this. 27% said their projects produced actionable insights, and 80% of them are willing to invest more time and energy. What this means is that decision makers in our organizations are data illiterate in a way that has been extraordinarily difficult to deal with and harmful to our efforts in these areas. Now I'd like to start off with just a little bit of a story, a sort of standard complaint that happens. A manager, a CEO, somebody in charge says, "I have data problems, I have to get something fixed," and somebody suggests to them, or maybe they find it in the back of an airline seat magazine, that a data warehouse will solve their problem. And we're not picking on data warehouses here; the more important point is that this applies to most new technologies that come along. So they go off and they buy a data warehouse, and it doesn't solve their problem. Well, let's take a look at what tends to happen with that sort of a process. If we start out with a bunch of data and we've decided that we need that data in the data warehouse, typically the decision is made not as a plan but as a result: the technology piece takes longer, costs more, and delivers less than it was supposed to, and so there's less time to work with the data at the end. And so instead of doing something good with the data, we forklift the data into the system. The problems with forklifting the data are that there's no basis for the decisions being made at all; there's no inclusion of architecture and engineering concepts; there's no awareness that these concepts are even missing from the process; and 80% of organizational data is ROT, ROT standing for data that is redundant, obsolete, or trivial. So now we've beaten up on the data warehouse example enough; let me pretend it wasn't a data warehouse.
Let's say I was going to put it in Salesforce, which we all know is fine software. But somebody on the other side of it, looking at data coming through Salesforce, is going to say to themselves, Salesforce isn't working the way it should. And through no fault of Salesforce, that is absolutely correct. So how should it occur? Well, hopefully as you take this guidance, understand that going into a warehouse, into a new technology, a data lake, whatever you're trying to do (lake is probably not a good word for it), you should actually think about it before you just throw data into your data lake. The idea is that there's a transformation point, and we can objectively define some things. So I have three characteristics of data in warehouses, in the cloud, in other places that we put it. The first one is that the data in the cloud should be less than the data outside the cloud. Secondly, it should be cleaner, because if it's not cleaner, what are we doing with it? And the possibility that we're going to introduce errors into it, of course, is rather high. And finally, it should be by definition more shareable than the data that's outside of it. That will help the warehouse get the utilization as well as avoid the costs. Well, I mentioned before that what we're trying to do here is take things to the cloud and add to them. There's a three-legged stool of people, process, and technology that we need to put in place. And I'm just going to give you a little bit more of a set of numbers. This is a wonderful survey by Randy Bean and Tom Davenport; the link is right here at the bottom of the page, and I encourage you to get a copy of it. Randy has been asking the same questions over time, and this is one of the interesting pieces that he has discovered. So, sorry, I'm going to go back; I was totally incorrect. First of all, this is self-reporting: are you driving innovation with data? Thousands of companies, big names, very good survey.
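The three characteristics of warehoused data described above can be phrased as simple acceptance checks run before a load. This is purely my own illustrative sketch, not anything prescribed in the talk; the `DataProfile` type, its field names, and the idea of profiling data this way are assumptions for demonstration only.

```python
# Hypothetical pre-load check reflecting the three characteristics above:
# data inside the warehouse should be (1) less in volume, (2) cleaner,
# and (3) more shareable than the data outside it.
from dataclasses import dataclass

@dataclass
class DataProfile:
    row_count: int        # volume of the data set
    error_rate: float     # fraction of records failing quality rules
    consumer_count: int   # distinct downstream users or systems

def ok_to_load(outside: DataProfile, warehouse: DataProfile) -> bool:
    """True only if curation removed ROT, cleaned the data, and widened sharing."""
    return (
        warehouse.row_count < outside.row_count          # less: ROT removed
        and warehouse.error_rate < outside.error_rate    # cleaner
        and warehouse.consumer_count > outside.consumer_count  # more shareable
    )
```

A forklift move, where the profile is unchanged on both sides, fails all three tests, which is exactly the failure mode the story describes.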
And he's been asking the same questions for a while. Less than half are; in fact, it's 49%. In 2019 it was 60%; in 2020 it was 44%. So the numbers are varying a bit, but notice a pattern here: they're falling as we get closer to the present. Is your company competing on data and analytics? Again, less than half: 41% say they are; last year it was 45%, and in 2019 it was 48%. And I'm pretty sure that, if you look, the amount of hardware and technology sold around these concepts is more rather than less. Again, are you managing data as a business asset? 39% yes; 50% in 2020; 47% in 2019. So again, the numbers are dramatically dropping. I think this has to represent an increased awareness, because I'm pretty sure things haven't gotten that bad that quickly. Are you creating a data-driven organization? 24% last year; the year before, 38%; before that, 31%. I don't know, 2020 was an optimistic year; let's not go there. Have you forged a data culture in your organization? Once again, 24% last year, 38% the year before, 31% before that. Again, most organizations aren't doing this, and there's an important concept around that. Finally, the last question on this survey that I want to highlight for you is the role culture plays. In 2018, the same groups of people were asked how much of the problem is technology versus how much is people and process. The people and process share is on the right-hand side in yellow there, and you can see it's 80% at that point. In 2019 it was 95%; in 2020 it was 90%; last year it was 92%. The resounding message here is that there have been no significant changes: people and process problems are the things that are hurting our organizational data efforts. Technology is there, it's great, it works. Let's see what we can do to start to address that. Well, as I said, it goes back a little ways.
When we were getting started in this business, many people would ask: we want to move our data management program to the next level. It sounds like a good thing, but if you don't know what level you're currently at, you can't measure it. So there's no way that you're going to be able to take anything resembling a concrete program and put it into effect if you don't have the ability to measure that very basic piece. There have been a number of people who worked on this, and I'm just going to describe my own involvement here. I hope I don't leave anybody significant out, and if I do, please get in touch with me, because it's a great story. For a while I held the title of U.S. Department of Defense Reverse Engineering Program Manager. One of the things I did at the Department of Defense was sponsor some research at Carnegie Mellon University, in a group they have up there called the Software Engineering Institute. It's a very fine group of researchers, and they do excellent work for all of us. While I was there, the question that we asked Carnegie Mellon to address was: how can we measure the performance of DoD and the partners we're working with in this area? And they came up with something called the Capability Maturity Model; we're going to talk about that in just a little bit. Interestingly, I was also told to go check out what the Navy was up to, because the Navy had some interesting work that they wanted me to find out about. I fortunately met Clive Finkelstein and John Zachman at that exact meeting and was able to develop lifelong friendships and keep them involved in the process here as well. The SEI responded with an integrated process and data improvement approach, and DoD told them to take the data part out because their name was the Software Engineering Institute. I know that sounds crazy, but yes, that actually did happen.
So I was up at Carnegie Mellon working with some of the folks from the SEI, and somebody said to me, hey, Peter, you like data a lot. We've got this research that we did for the Department of Defense that they don't seem interested in; would you like it? And I of course said yes. It eventually grew into the CMMI as well as the DMM in these areas. The person pictured here was a very dear friend, Bert Parker, who took this on as a MITRE internal research and development project. He worked collaboratively with myself and a number of others, whom you can see on the paper we published around it. The idea was: could we use the pieces they had started up there to develop something for data? And we came up with the same basic structure that you'll see here today. It was a normative model, required to understand the scope of data management activities and to help organize key data management practices. The result of this has been two, actually three, official studies done in just the data area, and lots and lots more work in the CMMI area. In general, data management is reported as not being done well, much the same as Randy Bean's survey showed; look at the blue and yellow on the chart there. The CMMI Institute, it turns out, was a group that Carnegie Mellon University spun off as a for-profit entity. Again, the SEI, as I mentioned before, is a federally funded research and development center and has done really good work over the years. The CMMI Institute was put together to commercialize this process, and it was sold to a very fine group called ISACA, and they're working on all sorts of activities around all this. But let's go back again and look at what happened at ISACA. I want to highlight my colleague and friend Melanie Mecca here, who was the former director of data management products there and is the principal author of the Data Management Maturity (DMM) model. She's been in this for a long time.
Many of you have also known her, seen her, and encountered her. She's got a new offering now called DataWise. And the book that she put together is a really fine work of art. She was given the absolute luxury of being able to work on this for about three and a half years, with sponsorship from Microsoft, Lockheed Martin, and Booz Allen. There were 50 contributing authors and many reviewers behind the collection of practice statements and work products here, and this gives you the ability to evaluate how organizations are doing in this area. It is structured around core process areas, and each process area comes under a category. The material in the book, which I believe you can buy on Amazon, is organized into a purpose with introductory notes; it talks about a goal; there are core questions around it; there are related areas, because we did need to pay attention that these are not operating in functional stovepipes; there are specifics around each of the practice levels; and there are example work products. This model is repeated for each of these, so you get a wonderful consistency. But the other thing that Melanie did that was so fabulous was that she emphasized behavioral aspects: how you would actually understand and see this. If somebody says, this is what's going on, then you should have some evidence that you're able to see all of this. Wonderfully, and this hasn't been mentioned before, there is a lot of research around this. For example, compare the CMM approach in general against ITIL, RUP, COBIT, PMI, and other types of models: it compares very favorably in terms of on-budget delivery as well as on-time performance. This is probably the most researched improvement method out there.
And so in the book on data strategy that Todd and I put out a couple of years ago, we wrote: while all improvement efforts begin with an obligatory assessment phase, this one, the CMMI and DMM program, is the only proven framework with the added benefit of literally decades of practice and benchmarking data. Whatever anybody tells you, if you do anything else, you won't necessarily be able to meaningfully compare results against other organizations, or you'll be working with unproven methods. I mentioned the DMM; the DMBOK is something else. Just a little bit of history on the DMBOK. The first version came out of DAMA International around 2008 or 2009, I forget exactly; it was a great effort. The second one is even better. And we've been able to put together descriptions where people ask, okay, what does it mean to do this? We're being looked to as an authoritative source. We called it a BOK because the Project Management Institute put out their own BOK, a body of knowledge, and there are many others; there's a SWEBOK for software engineering, so why shouldn't we have a DMBOK? Well, again, marketing is probably not our strong suit. And let me, without implying criticism, suggest two important improvements that we probably need to make in the next visual representation of this. The first is that there's no sense of optionality in here. People look at this and say, the DMBOK says I must do content and document management. Well, no; the DMBOK says it's part of data management, but we weren't as articulate as we really wanted to be. So this should be marked as a maybe, right? Another component that would be really super helpful is the idea of dependency: where should one start? If one doesn't know exactly where to start, it becomes difficult all the way around. I'm going to go back to that slide here because I hit the button again too fast.
The key here is that not all of these things need to be done by every organization, and that putting them in some order, just the same as cleaning the data before you put it into Salesforce, will make a very significant difference in your organization. Several of these topics also have ordering or dependency relationships. Now, one final component here, and we'll get to this a little bit more: it's generally not the case that you work in only one wedge, but it's also the case that you probably shouldn't try to work on all of these pie wedges at the same time. So the answer is somewhere between 1 and 11 for the number of things that you should try as an organization venturing forth into these areas. The reason these two are important is that the U.S. federal government passed a law in 2018, FEPA, the Foundations for Evidence-Based Policymaking Act; there's a whole other set of topics we could cover on that, and I've got a white paper that I'll link at the end of this seminar so you can all get a copy. But the interesting thing about it is that it's now illegal in the U.S. federal government to not use best practices, and when everybody looked around and wondered what those best practices are, it has just become the vernacular now, for one-third of the economy, that the DMBOK and DMM are those best practices. Now, that doesn't actually help us much in figuring out exactly what we should do, but let's take it forward and see how we can use these two in concert. The idea, of course, is that we want to understand them and apply them together. And I don't get to do this very often, but that is literally where I'm talking to you from. The only difference is, if you were standing where I took the photograph, there would be a barn between you and the house down there in Central Virginia, where I am today. And I took pictures of this barn for a very specific reason.
The key was that we borrowed money from the bank to put the barn in, so the bank gave us exactly this much money and said that before further construction is allowed to proceed, we want to make sure that you show us you have passed a foundation inspection. Why? Well, because banks are kind of smart, and they know that if I, as a horse husband, have a choice between paying a horse vet bill and paying a barn payment, I'm going to go with the horse vet bill. Now, if I build a poor-quality barn on top of a good foundation, that's not good; but if I build a good barn on top of a poor-quality foundation, that's not something I can easily fix. And unfortunately the same thing needs to occur in IT, and generally does not occur in IT. So this foundational concept is really critical, and I mentioned the weak link in the chain just a few minutes ago; I'm going to tie that back together now. Many of you remember Maslow's hierarchy of needs, from perhaps high school or somewhere, where we start out with the basic physiological needs: food, clothing, shelter. If we don't have those physiological needs met, we can never be safe. And if we're never safe, then we never have the possibility of developing love or feelings of intimacy. This sounds a little bit psychological here, but bear with me. If we don't ever belong to something that's bigger than us, then we'll never know how we are apart from and distinct from other individuals, and therefore we will never get to what everybody kind of wants to get to, which is self-actualization. This is also relabeled "flow" in today's environment: if you're in the flow, you're really doing your self-actualization stuff. For my little beagle friend here, her version of self-actualization is chasing deer around the house. Well, what does this have to do with data? Unfortunately, way too much.
The things we talk about in the golden data triangle, which I'm labeling advanced data practices, are very much technology based. When we talk about MDM, we're talking about a typical implementation of master data management; remember, we think we're important, but mobile device management is what that acronym means to the rest of the world out there. And all of the things that would go in here, even new things like blockchain, are wonderful, but they require us to do some level of foundational data management practices. These consist of the five areas that you'll see in the Data Management Maturity model: governance, quality, strategy, architecture, and operations, and these make the things in the golden triangle better. But these things at the bottom level of the pyramid are capabilities that your organization generally doesn't have in uniform amounts. So we get a call and somebody says, hey, can you come do it faster? And we say, well, if you speed things up, it will take longer. If you speed things up, it'll cost more. If you speed things up, it'll deliver less. And if you speed things up, it'll present greater risk. And yet that is exactly where most of our organizations are. So now I'm going to transfer these foundational data management practices over to the Data Management Maturity model I was talking about before, the one Melanie did such a good job of developing. There are five areas here. Manage the data coherently: if we don't have a strategy, what are the chances that everybody's going to happen to do the right things at the right point in time? It doesn't happen by chance. Governance: we've been working in data governance long enough now that we have a class of professionals we can start to call data governance professionals, and we have some history and research about what's working and what's not. We're looking at it with respect to the processes we're doing, what the life cycle looks like, et cetera.
We're doing this with, hopefully, fit-for-purpose data, and we're doing this with the right set of technology structures. These are the five foundational practices that you need to have in your organization before you attempt advanced data practices. If you already have the advanced data practices, you are doing some of this, but some of it is likely being done unconsciously and without understanding of the role it's playing. We also need supporting organizational processes to be in place. And now we need to get into the measurement part of this. The measurement part addresses how these are performed, and there are five performance levels for each process area. For the first one, everybody gets one point for an initial level: if you have a pulse, you get a point. Second level up: your process is managed. By managed we mean we know what it is; maybe your process for managing your data is "give it to Peter." That counts as two points, because you're going to follow that process repeatedly, and that's important. The third level is defined: does documentation exist describing the managed process? Generally, any form of documentation is what we look for when we're doing these evaluations, and most organizations do not have any documentation. That tells you where most organizations are going to fall on this particular scale. The fourth level is that you have the ability to measure these things, that you've defined specific measures. Peter, how long does it take you to get from this to this? How long does it take you to develop a database schema? How long does it take to correct a data quality problem? All of these are important questions, and the ones you should select to measure are the ones that will help you achieve your data objectives faster.
The last level of all of this is that you use those measurements from the defined processes you have created to try to come up with a better process, a better way of doing all of these pieces. Again, just very briefly: everybody gets one point for having a pulse. If you have a management system around it, as in "I know what to do," that gets you two points. If you have documentation in general, that will get you three points. If you take measurements about aspects of your process, that will get you four points. And if you are optimizing these things, by looking at your measurements and deciding what you can do better and what you perhaps ought to do in a different fashion, that gets you five points. Underneath this is the weak-link-in-the-chain architecture. Now I'm going to create a completely artificial scenario here, where I have a company that has a three in data governance, a three in data quality, a three in data operations, and a three in platform and architecture. However, they are lacking a strategy. That one there causes the rating for the entire group of these measurements, and therefore the reflection of this organization's perspective, to come up as a one. This is what we mean by weak link in the chain. It's kind of like getting a D in grade school and being labeled a D student from then on. I hope that's clear; if not, we'll do more in the Q&A section. What I've described to you now on the left-hand side are the components, and on the right-hand side are the stair steps for the five areas. We can put these together in ways that now give us samples. This is the sample diagram from the original piece, and it's just showing a hypothetical organization scoring at different levels; as you can see, the three has been circled, and there aren't too many places where it comes above the three.
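The weak-link scoring rule described above is simple enough to state in a few lines of code. This is my own illustrative sketch, not an official DMM tool; the area names and level labels follow the talk, and the rule is exactly the one in the artificial scenario: the overall rating is capped by the weakest process area.

```python
# Illustrative "weak link in the chain" scoring: an organization's overall
# maturity rating is the minimum of its five process-area scores.

DMM_AREAS = [
    "data management strategy",
    "data governance",
    "data quality",
    "data operations",
    "platform and architecture",
]

LEVELS = {
    1: "initial (you have a pulse)",
    2: "managed (a process is followed repeatedly)",
    3: "defined (documentation exists)",
    4: "measured (specific measures are taken)",
    5: "optimizing (measurements drive improvement)",
}

def overall_maturity(area_scores: dict) -> int:
    """The chain is only as strong as its weakest link."""
    return min(area_scores.values())

# The artificial scenario from the talk: four areas at level 3,
# but no strategy (level 1), so the organization rates a 1.
scores = {
    "data management strategy": 1,
    "data governance": 3,
    "data quality": 3,
    "data operations": 3,
    "platform and architecture": 3,
}
print(overall_maturity(scores))  # -> 1
```

This is also why a strategy focused on turning even one of those ones into a two moves the whole organization's rating.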
Of course, there's at least one "one" in here, and therefore the score for the entire organization would be a one. Again, it's a harsh score, but that's okay. The other part that's so important about this is that Melanie has kept statistics she can use for comparison. Again, this is a hypothetical example, so there's nothing actual here that you're looking at, but she is able to tell us, in general, when different types of industries have reached a sufficient volume of samples that we can draw some conclusions. This is one I did. Mine was nowhere near as comprehensive as hers, but it was just for the insurance industry, and at the time we did this, which was a couple of years back, the insurance industry was not seen as performing very highly in these areas. You can also put these things together in other ways that are kind of important as well. Here's an instance where I was in front of a group of airline executives, and they were looking at these results and saying, okay, one, two, I'm not sure I followed; maybe I didn't pay attention to what Peter was talking about. And then I say: and gentlemen, this is the competitive situation you're facing here. The organizations that I had reviewed in the airline industry were in fact compared against your scores. And man, this woke them up, because they may not know exactly what a one or a two means, but they know they're the one and the competition is the two. It becomes a little bit of a motivator there. We also compared the airline industry against everything else, and they're somewhat comparable. But most importantly, this tells us that if we're going to make a data strategy, our data strategy should be focused on eliminating at least one, and perhaps multiple, of these ones and turning them into twos.
The last example I'm going to give of how to apply this comes from the only company that I'm naming in this, which is the International Finance Corporation, part of the World Bank Group, a very fine group I've worked with on and off over the years. In fact, they were the very first recipient of some of this research. They asked that their treasury function be assessed, and their treasury, and pretty much their information systems group as well, were at a one. Again, this is a couple of years back, so it doesn't reflect reality today, but at the time those both were ones. The neat part was that the business side was actually world class, some of the highest scores that have ever been recorded on any of these surveys. And the nice part about a message like this is that when you think you've got a challenge, you only have to walk down the hall and ask your colleagues what they're doing in order to get free advice. Yes, there are loads of consultants that will be glad to help you out, but gosh, if you've got the practices internally, maybe you ought to look at them before you spend money on the external stuff. Again, we can still put in the industry benchmarks in this case, and the overall benchmarks, but I think you get the story that comes out of this. In a number of different ways, we've been measuring these data maturity areas since about 2007, and the key point is that they are unchanged, fundamentally unchanged, and that is not good for us as a society, because we need to do better. But now I've given you some focus that you can start to look at and say, hey, maybe I need a combination of some parts of the DMBOK and some focus from the DMM, and we can maybe work on some things. And how do you work on things? Well, I've mentioned several times that a data strategy is a component of all of this. My definition of a data strategy is a pattern in a stream of decisions.
That's an important definition, because the way in which we approach data strategy is really a cyclical process. For example, one of the things that I always tell all of my friends and colleagues that I'm working with is to number your data strategy. Why? Because when you label the first one, they will look at it and say, this is number one, and they will immediately expect a two. Maybe not tomorrow, but maybe next year when it shows up. Why did we go from one to two? Well, we accomplished one. And the way of accomplishing this is to identify a constraint. This is straight out of the Theory of Constraints by Eliyahu Goldratt, who wrote a fantastic book called The Goal. Some of you may remember it being recommended to you by various people who care about you. Anyway, the key is you start off and identify the current constraint: what in our data management capacity are we not able to do, such that we cannot use data to better support the organizational strategy? If we can, we make quick improvements using existing resources. However, in many cases it will not be possible to make that work. So now you need to review this in the context of all other activities to figure out the proper strategic alignment, and make sure that you have enough support to actually do this. If the constraint persists, then you're going to have to alleviate it with some structural remedies that may include re-architecting both processes and data. And if that doesn't fix it, you repeat the process until it is complete. It's a very non-traditional way, but let's look first of all at a couple of strategies. We're going to pretend this is a military context; the good guys are on the left and the bad guys are on the right. You're going to use one type of strategy in this type of engagement. However, if the field of play changes slightly, you're going to use a different strategy if you're here and the bad guys are here.
And you're going to change your strategy still further if the bad guys are there and you are down here. So this takes us to a wonderful conclusion that is best stated by General Dwight Eisenhower: in preparing for battle, I have always found that plans are useless, but planning is indispensable. And I would argue the same thing here, that our data strategy is an important component of practicing data management better. Because what you'll see is that we're all trying to make a better data sandwich. I haven't figured out a better analogy for this, so maybe one of you can help me; a lot of these are user contributions. If we're trying to digest data better, there's a component of data illiteracy that pervades our organizations; I've given you numbers on that already. There's a varied supply of data, and that's not really very digestible either. And there's differing use of data standards, and maybe indifferent use would be another way to describe some organizations. What we of course want to do is make these easier to use, or make a better data sandwich. Now, the more automation your organization relies on, the more you will understand the following Deming quote. This cannot happen without engineering and architecture support. And interestingly, the quote is actually that quality engineering and architecture work products do not happen accidentally, and more to the point, they can't. Now, if we replace those words with data, it's no less true. So I mentioned before that the idea, when you're looking at how to work within an area, is to improve an aspect of your data that will most support the application of data to your organization's strategic objectives. It's probably going to be a combination. I'm going to use the number three here just because it seems to be a good number that I see a lot in the world, but your number may be completely different. But it does apply to that three-legged stool in this case.
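The constraint-driven cycle described a moment ago, identify the current constraint, try quick improvements with existing resources, review strategic alignment, apply structural remedies, and repeat, can be sketched as a simple loop. Every name in this sketch is a hypothetical placeholder of mine, not part of any published framework; it only illustrates the shape of the cycle.

```python
# A sketch of the Theory of Constraints cycle (after Goldratt's
# "The Goal") applied to a data strategy. All names here are
# illustrative placeholders.

def run_strategy_cycle(constraints, quick_fixable, max_iterations=10):
    """Work through constraints one at a time: quick-fix what we can
    with existing resources, otherwise apply a structural remedy
    (re-architecting processes and data), and repeat."""
    log = []
    for _ in range(max_iterations):
        if not constraints:
            break                        # nothing left blocking the strategy
        constraint = constraints.pop(0)  # identify the current constraint
        if constraint in quick_fixable:
            log.append((constraint, "quick improvement"))
        else:
            log.append((constraint, "structural remedy"))
    return log

actions = run_strategy_cycle(
    constraints=["no data strategy", "poor customer data quality"],
    quick_fixable={"poor customer data quality"},
)
print(actions)
```

The point of the loop, like numbering your data strategies, is that it never claims to be finished: each pass surfaces the next constraint, which becomes the focus of strategy number two, three, and so on.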
And the old approach, to go back to the original data warehouse era, was to buy a data warehouse. But now the organization instead says, hmm, maybe what I really should do is apply some data governance, some data quality, and some discipline around populating and utilizing something called a data warehouse. Notice also that this permits the organization to essentially rack up a point in each of those areas. Some part of your organizational staff has hopefully benefited from the experience of doing some governance activities, some data warehousing activities, and some data quality activities around a very specific aspect: we're trying to improve this for a very specific, focused reason. Normally I would pause here and ask if there are any questions, but we know we're going to have to wait and do that at the end. But let me just show you a couple of different ways of doing this. Maybe that first initiative that you put some time and effort into resulted in a new understanding, and instead of doing quality, maybe this organization should focus the next set of efforts around metadata management. Now notice what's happened here again. We've done the data warehouse thing now twice. Not that we've built two data warehouses, but we've taken a structured approach to improving the existing data warehouse or the construction of a new one. We've essentially run through data governance two times as well. So we're hopefully more mature in those areas, but we've only done one round on the metadata management piece here. So again, we can't wait and hope it's going to be perfect, but we do have some ideas in here, and we should be getting better at this. Of course, our third example is very straightforward as well. Again, maybe this time we've discovered that our problem wasn't really so much around those quality and metadata areas, but that we really needed to have the strategy associated with a master data management implementation done.
And again, we've had one X in each of those, but now we've gotten three X better at governance and three X better at warehousing. And through a conscious program such as this, it is possible to take what you have and work with a very focused program that will specifically raise your capabilities in the areas at the base of the pyramid that I was describing a couple of slides back. The idea, of course, is that as you get better in these areas, you can do more with them, because your organization now has more capabilities. Many organizations ask the question, where should I get started? And I like to use what I call a lighthouse project metaphor. There are going to be all kinds of things that you can do with your data to help the organization achieve its strategic objectives. That is the only reason data exists as an asset. The only reason a data strategy exists is to help further things that the organization needs to have done, and hopefully further them in a way that only the data group can realistically contribute. There are lots of things out there you could do. Let's see if we can find the area that intersects that and an opportunity to address a complaint that the business has about the data. The business may say something like, that customer data is not of good quality. Well, maybe if we can find things in the customer data that will improve quality, and that specifically allows the organization to achieve better strategic results through the use of improved data and data practices, that would seem to be a win-win in my book. If we want to get to the win-win-win, which those of you who know me may know is the way I like to do it, we might add one more constraint onto that: this is an area where our organization has developed certain capabilities, and while we don't want to experiment with production, it might be nice to use this occasion to also improve some data skill capabilities.
This would be a wedge on the pie, and maybe the attempt is to take that wedge of the pie from practice at a level one to a level two within the scope of this particular project. This can provide guidance, though unfortunately it puts emphasis on the score in a way that I don't think is necessarily reflective of the value of the exercise. But generally, the more uniform your organization is with respect to its data practices, the more valuable that score is going to be. Again, applying that score to something such as a university would be much more difficult than applying it to a manufacturing company. Now, again, I said win-win-win is the idea: could we possibly find something that intersects all three of these? And I can tell you dozens of organizations have done, and continue to do, exactly that. They're able to take what they have, find an area to improve data skills for a cohort of their organization, and do that by specifically and measurably improving data that is used by the business, data the business has been complaining about. So we're seen as fixing a problem. Yay, right? And then let's demonstrate how those efforts collectively also improved the organization's ability to achieve its strategic objectives, particularly from a data perspective. So this is the process that most groups have found helps them the most to get started on these things when they haven't had the ability to do so before, but now some enlightened management comes along and says, let's do some data better. What should we do? And again, you have to start off the process by saying, let's not go out and buy blockchain. Let's instead start with the basics and try to get good at that before we actually start to try to get good at some other aspects of what goes on here. I said we're going to finish this section up with a little bit of how one gets to Carnegie Hall.
And I have a short little story, and hopefully a little bit of entertainment, for you as well. If we were in a room, I would say, please raise your hand when you recognize this particular song. And normally what I would be doing here is playing the song that I hated the most in the year 1977, when I graduated from McLean High School. And that was the song Stayin' Alive by the Bee Gees. It was not at the time my favorite song, and it's definitely not my favorite song at this point in time. But I'm going to tell you a little story about that song. First of all, that wasn't the first song that the Bee Gees wrote. Somebody told me it was around number 150 in their catalog. So they had had 150 other prior songs that they had put in place and tried out and achieved varying levels of success with. And let's not be silly on this: the Bee Gees were international superstars at the time in 1977. They just happened to have adopted the disco style that I really didn't like. Now, the story here is that you're a member of the E Street Band that Bruce Springsteen is taking to Australia to play concerts for the Australians a couple of years back. And on the way in, he says to them, okay, we're going to go play Australia. I think it would be kind of nice to start off our concert tonight with a local Australian song. Now don't worry, they also did AC/DC; that was on the other coast, where they did the AC/DC songs. But they decided to start the next day's concert, I think it was a Brisbane concert, with the Bee Gees' Stayin' Alive. I've got the YouTube link to it right here. If you click on that link, it will take you out there and you can see it. It's a rockin' song, which tells Peter's little brain that Peter didn't really understand music back when he was younger, although I really still have my disco sucks t-shirt around.
But this song, with Bruce Springsteen and the E Street Band playing it, is amazing to watch. More importantly, they'd only played it two or three times, which means they hadn't really, really practiced playing it live in front of an audience, and they still knew what they were doing. Why can this band do that? Because they practice. And they practice. And they practice. And of course that's the answer to how one gets to Carnegie Hall: practice, practice, practice. So why am I talking about Bruce Springsteen and disco music to you guys when you're here to learn about how to do your data better? Well, it involves practice. We've got the sheet music. The DMBOK and the DMM have now been anointed, if you will, by the US federal government as standards. They aren't really standards, but they're good places to start. So it's an excellent starting place for your organization. And the understanding that those are really good places to work from, with lots of people working in that direction at the same time, means that collectively in society we can get better, but also within our organizations we can get better. Because of course you can't get good music out of any of this unless you are all coming off the same sheet of paper. And if you're not singing off the same sheet of paper, the music is not going to go well. So the idea is, if you're going to try to improve your data management practices, it is much more about deliberately practicing what you're attempting to do than it is about sitting down and thinking about it. Most of the time when I'm asked to evaluate a group's organizational strategy for how they're going to practice their data better, they've got this very large document; it's kind of like trying to write the world's best pop song on the very first try. The product is not what you should be concentrating on. The process is what you should be concentrating on.
And that process of understanding and examining your practice deliberately is what's going to produce results for you much faster than any written guidance that you're going to be able to pull out of all of this. I've mentioned a couple of times here that the people and process aspects of this are problematic. What do we have to do next from an organizational perspective? Well, first of all, let's acknowledge that there already exists a wonderful group of professionals with proven results who are able to actually do things to help organizations change in fundamental ways. They are called organizational practice and change management leadership coaches. If your organization is big enough, it probably has its own internal resources. If it's not, you can rent them, and there are some terrific books. When you look at this group of professionals, they're there to make change. And making change, by the way, is kind of about making it more difficult to do it the old way than to do it the new way. And you'll see that in this illustration that we've been using for years to diagnose organizational readiness. Again, like the DMBOK, this has become a standard part of our diagnostic toolkit. When I walk into an organization and I see that it has vision, it has skills, it has incentives, it has an action plan, and I see frustration, I know that that organization is lacking resources. Similarly, when I see vision, incentives, resources, and an action plan ready, I know that they're lacking the skills. And only when all of these things are present do you actually get change. And resistance to change is the most important impediment to shifting organizational thinking about data. We're not going to talk about this one a lot today, but I've put together a case study that is now accessible for downloading, and it doesn't cost you a thing. You can go and read about it.
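The readiness diagnostic just described, where change needs vision, skills, incentives, resources, and an action plan, and the missing ingredient is what you fix, can be sketched as a small lookup. This is only an illustration of the diagnostic's logic; the talk names the resources-missing-means-frustration case explicitly, and the rest of the mechanics here are my own minimal sketch.

```python
# A sketch of the organizational change-readiness diagnostic:
# change requires all five ingredients; spot which one is missing
# and you know what the organization lacks.

INGREDIENTS = {"vision", "skills", "incentives", "resources", "action plan"}

def diagnose(present):
    """Return the missing ingredients, or 'change' if all are present."""
    missing = INGREDIENTS - set(present)
    return sorted(missing) if missing else "change"

# The case from the talk: everything present except resources,
# which shows up in the organization as frustration.
print(diagnose({"vision", "skills", "incentives", "action plan"}))  # ['resources']

# Only when every ingredient is present do you actually get change.
print(diagnose(INGREDIENTS))  # change
```

The design point is the same one the coaches make: the symptom you observe (frustration, anxiety, and so on) is a signal, and the table of present ingredients tells you which structural gap to close first.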
I know that case study has been successful because I've had 12 organizations come up to me and say, you captured our organization to a T with this. Now let's think for just a minute about where we are. Data volume is still increasing faster than we're able to process it. Data interchange overhead and other costs are measurably sapping our organizational resources. The reliance on a technology focus has not materially addressed this gap, and we're fighting against an industry whose sole purpose is to grab data and make money. So process around all of this is going to be more important to results, because we have to develop the organizational capabilities to do this. If we don't, and we rely instead on flitting between ephemeral types of activities, it's not going to work. Failure in itself is a lesson. I'll tell you a quick story: just this morning, an organization that I'm very familiar with had a little oopsie. We handled it really well. It wasn't terrible; we had the oopsie, and we did a really good job of dealing with it. But the interesting takeaway, at least for me, was that first-time technology probably isn't going to work well, and your first data strategy, your first attempt at trying to improve your data practices, probably is going to have some oopsies along the way. And the people and the process parts are simply not receiving enough attention. Best practices do exist, and we can use them to help our organizations achieve their strategic objectives in a much more productive fashion. So we've spent a bit of time on this. Again, I hope most of you are frustrated as well; we shouldn't be satisfied with our current state, because we can and should do better. In fact, we should have been doing better all along. But if we take just a quick side step, one way of characterizing what we have taught people about data, as they've gone through IT and computer science and computer engineering programs, is that the data part is used to build new databases.
So why would we be surprised that the answer for most people, when they come out of school and run into a data problem, is to build a new database? By the way, if there's one thing we don't need more of on planet Earth, it's more databases. But that's the only thing we've generally taught them, and we shouldn't be surprised that we have a current state that's not allowing us to make progress toward where we'd like to go. So through a series of interesting accidents, happy bumpings into people, and wonderful collaboration, we can now talk about a set of things that are emerging best practices in the industry, which simply say: we're going to take an aspect of the data maturity model and the DMBOK, and we're going to try to understand and apply them together. And yes, I said one and one equals eleven over here. It's obviously not true from a mathematical perspective, but it is possible to design a number of different ways of approaching this. And the next piece that we don't really have a lot of work around is the monetization angle on this. And understanding that people can invest lots and lots of money into things and still come up not understanding the weak link in the chain of the architecture. One weak practice doesn't just pull your score down; it can actually cripple your organization. But it is important to have a focus around this, because all of your people out there learned how to do this on their own. By the way, who am I talking about? All of your knowledge workers; that's the definition of somebody who deals with data. All of your knowledge workers learned how to do data on their own, because we didn't teach them with a single methodology, and they didn't come up through the same kind of program. So the idea that we need a strategy is even more critical if you think beyond just the typical applications. Again, there's just entirely too much focus on technology. Technology is great.
We can do wonderful things with it, but it is not going to solve these problems by itself, and you saw that 80 to 90% of organizational problems are rooted in the people and process areas, not the technology area. We've got to get this back into the colleges and universities. We've got to change the focus of the accreditation bodies. We have to change our understanding of the role that technology can play. And we have to put more emphasis on building up existing capabilities. What we teach in these areas is not rocket science, folks. People learn it and go, oh. In fact, one of the more interesting comments I get is, well, of course this is common sense; how else would you do it? Well, I guarantee you there are probably a thousand people on this webinar who are interested precisely because it wasn't done for them. We have to be able to look at the idea that we're going to start out with our data practices just the same way we'd start out playing piano: Chopsticks. I think I got that right. You guys probably don't want to hear me sing anymore. But you start out with a simple piece, and you do that, and you get good at it, and then you graduate to the next level, and you get good at that, and you move your way on up the chain. As you're doing that, it's important to pay attention to the value that you're providing. That's probably a topic that's going to be of very great interest after the data literacy craze dies down. There is an aspect of being able to say, is it worthwhile? Again, a critique that was put out against the original DMM was that if your organization was very large and you knew already that the scores were going to be low, why pay somebody to tell you that you're bad? It just doesn't sound like a productive use. Now, that's a complete oversimplification and not the actual value of the exercise.
But it was definitely perceived that way when people would look at this and say, oh my goodness, you want how much money to tell me I suck? It's just not a value proposition that we look at from that point of view. So what we're going to see next is, obviously, some more maturity in both of these areas, and a crossover between the two of them is perhaps in the works; it depends on how things go, how long we're locked down for COVID, and all the rest of the things that go on. But at this point, it's time at the top of the hour to turn it back over to Shannon and see if we've got some questions. Here's our schedule of upcoming topics, and again, we do have some event pricing pieces that we can get to as well. Hi Shannon, are you there? I am, Peter, thank you so much as always for this great presentation. Just to answer those commonly asked questions, just a reminder: I will send a follow-up email for this webinar by end of Thursday with links to the slides and links to the recording, along with anything else requested. So diving in here, Peter: data strategy. It includes data governance, metadata management, and data quality management. Oh goodness. So I think that's referring to a slide that I had. Let me jump to the slide. What I would say is that a data strategy should be looking at perhaps multiple pie wedges as opposed to an individual pie wedge. So a version of data strategy might, exactly as I think the questioner asked, include aspects of three different pie wedges. It may be just two. The point is you shouldn't let the artificial divisions that we've put in the DMBOK pie constrain your activities. Your goal in developing a data strategy is to improve the utility of data in helping the organization achieve its strategic objectives. I'm saying here that it's probably a combination of pie wedges, again using the DMM maturity framework in that type of context.
It can be extraordinarily useful, and it changes the granularity with which it's applied, so that you're worried less about the score and more about the capabilities and the outcomes that you can derive. I think I answered the question; if I didn't, please ask again. Thanks, Jenna. Indeed. So it was about slide 71. Can you just show the title of the referenced book again? Oh, you mean the specials? Sure. There it is. Thanks. Go ahead, Jenna. Where is MDM in the DMM practice areas? And related, for both the DMBOK and the DMM, where would data science and knowledge graphs and ontologies reside? Oh my goodness. So there are several questions. Let me start with what I think was the first one, which was, where does MDM reside in the DMM? Correct. Yeah. Where is MDM in the DMM practice areas? So it is not there as a specific practice. Sorry, I've changed slides here on you guys. The idea is the application of a strategy of managing your reference data and your master data better than you currently are. And just a statistic on this: the average organization has its customer data distributed among 13 different data collections. That may or may not be good depending on what you're attempting to do, but if you don't know where each piece is, it's very difficult to find that single source of the truth. So these pieces being in place would be necessary but insufficient to allow an organization to successfully implement MDM. And I would ask this question if we were in a room: how many of your organizations have implemented MDM multiple times, kind of like governance? You know, yeah, we tried it and retried it and all the rest of that. So here's MDM up here in these advanced data practices that benefit materially from these foundational practices. I think I answered that question, Shannon. Oh, no, that's right, there's a second part. Where do the ontologies go? Yes.
So related, for both the DMBOK and the DMM, where would data science and knowledge graphs and ontologies reside? Okay, well, again, those would be advanced data practices. We'll put them back up here at the top. Again, I could have a number of different topics that go in there. Data science, interestingly enough, was debated. I was not part of those discussions around the DMBOK, but the question of where something like that fits is very clearly answered by the fact that we're going to be building on top of these components. And it's possible that the next version of the DMBOK would, in fact, contain a slice that is devoted to data science. Interestingly, if you go out on the web, and they've been measuring this over time, the number of data engineers wanted is much greater than the number of data scientists. And that's probably down to the general understanding that data science takes about three years in an organization to even determine whether it will work or not. I heard just the other day from an organization that was describing to me the interactions that they had. They were given some time, and they didn't want to do a lot of things; they just wanted to prove their value right away. And it took them three years to come up with some very good use cases. But unfortunately, they were not able to demonstrate value, because data science in general is seen as so abstract from all of that that it is less productive than it could be. And finally, organizations need to have folks working in the organization who actually have an ability to understand what's going on in the business, to determine the business value of their efforts. I've heard all sorts of disconnects, and I've seen improvements as well. Excuse me, it's a bit of a tricky mess. Now, when you get into some of the more fun things, again, the ontological work and the taxonomies, these are tools that can play a role at any point in this.
It may be that you use a taxonomy to organize your tagging structure, or you may have implemented the Controlled Unclassified Information standard from the U.S. government to do the tagging of some of your data, so you can keep things separate, or you may be implementing a taxonomy so that you can implement role-based security and data security. So these components can be put in here, but as you might imagine, an organization that is very low in its specific maturity or capabilities will probably have less ability to implement anything like a taxonomic concept unless they have developed specific capabilities around that. And the chances that they happen to be good at that and not good in other areas? Very unlikely. Cool question. I like that. Thanks. And are the survey data results that you shared publicly accessible? Yeah, there are a couple of things. I'll add the papers that I've published at the end of this, because you're just going to attach the PDF, right, Jenna? I do, uh-huh, yeah. All right, so I will send you guys the FIPA piece that describes the FIPA paper, and I will attach the rest of the papers that have come out of this as well, so you guys can take a look at them. I'm absolutely happy to share that stuff. I love it. So a big question for you here. Which of the DMBOK pie chart wedges would you suggest working on first, maybe the top three? You know, I hate to say it, but I need to know more information. It's hard to imagine that it's not going to involve some aspect of quality and governance, but where to start is going to depend on where you are in your journey. You all have heard enough about data governance in general as a practice, on the various programs that Shannon produces, to understand that data governance is very personally crafted for your organization. It means things specific to your organization that it won't mean to another organization. I'll just tell a brief aside on that, just to illustrate this.
I was involved in helping the Army get started with its data governance activities a while back, and the wonderful part about it was that in the Army, virtually everything is governed, so when they found out something wasn't governed, it made our task very easy. They were able to leverage the existing organizational culture to help them quickly and easily understand the need for data governance around these activities. So again, as I said, it's probably quality and governance, but I can't tell you much more than that, because I'd need to know more about your organization. And don't listen to anybody who says they can tell you otherwise, either. Thanks. Shannon: So could you elaborate on the law you mentioned about using best practices? Yes, that was the Foundations for Evidence-Based Policymaking Act, and it was signed on January 14th of 2019, if I recall correctly, which was during the time that the government was shut down because the Democrats were fighting with the Republicans or something, and nobody was expecting anything to happen. The government was shut down across Christmas, and it was in the middle of a snowstorm, and on January 14th this thing popped up. It was passed with 300 and some in favor in the House, unanimously in the Senate, and a lot of people were opposed to it because it kind of allows the government to do a little bit more data mining of shared situations. But what it says, in three parts, is first that all data owned by the federal government is now by definition open, and I should say that's for non-sensitive classifications. So anything that's not sensitive is by definition open; it's not a matter of having to declare it so. Second, all federal government agencies must establish a CDO organization that is not led by a CIO, is not a political appointee, and has met objective qualifications. That's a really wonderful piece of legislation.
The third piece is that it is now against the law in the federal government to make a policy decision without first specifying, a priori, the data sets on which you're going to base your decision and the model that you're going to use, and then demonstrating that those results are reproducible by outside parties. If you don't do that, the penalties for violating this law are higher than HIPAA penalties. Oh my goodness. Yeah, so it's going to be kind of an interesting world out there. Shannon, there's probably enough interest to do a whole topic on just that act; I have definitely been asked to talk on it, because it's a fascinating subject that's not much known about right now. Indeed. Yes, so many times we've run into new topics; I've got to keep a running list of all the new topics we need to produce. Of all the issues, is the level of exceptional data growth the major reason why industries haven't moved the needle? No. It's a great question, and I think maybe they're asking me that on purpose. The data volume is absolutely to be expected. Look at simply the internet of things, or 5G, and again, 5G won't cause cancer or anything like that. We've been teaching people only how to build relational databases for the better part of three generations, and much of that teaching has not been perfect. So in addition to some people simply being taught incorrectly, take a step back. The only parts of the DMBOK pie that students get any exposure to are a combination of the data modeling and data storage wedges. Most of the rest of these concepts are not addressed in college and university, which means the people who hire those graduates think the concepts are unimportant, and that the only thing that matters is the emphasis on these two pie wedges over here. I just happen to have that slide up, but it's very appropriate for this question.
So until we go back and reformulate the curriculum, institute widespread calls for data literacy, and undertake a serious curriculum overhaul, so that people come out at least knowing these things exist and are available, we are going to continue to have this myopic perspective of "just let me build some databases and I think we'll be able to get a handle on this stuff." It's not working now, and the thinking isn't out of the box enough to fix it. Whoever asked that question, I know you probably wanted to provoke me, or maybe you disagree, in which case by all means jump into the chat and let's have a conversation. So, Peter, in my experience, quantifying the value of providing additional funding for people, resources, and data strategy initiatives is difficult. Executives want to see a cost savings over the next quarter or year, et cetera, versus the value being better quality data for better decisions and reduced cycle time. Any tips on how to get over this hurdle? You've got to make it explicit, and there's one piece I'm going to grab from another webinar we did last year so that I can illustrate what I think is the problem here. When you start out with much of the data leadership and governance activities that people do, there's this idea that yes, data governance can do things, but that's a policy move, so the data will improve over time, and to many people that is slow. It may take time to actually get there. So they ask for more proactive measures, where the data improves as a result of a specific focus.
And of course this improves the feedback loops, and eventually you can install a structure here that may be useful to you. But what you end up with is an organization where data things happen, and that's wonderful, and data people celebrate them, but we haven't gotten good enough at saying: something happened in data, and that had a result in the business. If we don't start practicing and getting better at articulating these results, we will never be able to justify the kinds of investment that organizations need to make. Now, of course, if we get really good at this, we might start to see synergies between multiple disparate efforts that produce even better results, or perhaps an unexpected dividend that pays off in a way that lets the organization say, wow, we've got some very significant savings here. But all the things I have in purple on this diagram are not things we have practiced. So I'll recommend two books. One is one I've written, Monetizing Data Management, which happens to be one of the books on sale today. The other is a book called How to Measure Anything by Douglas Hubbard, and even though I'm up to 11 books, Douglas Hubbard has sold many, many more than I have, because it's a really good book on how to count things. His guidance is: if something is observable, it can be counted; if it can be counted, it can be measured; and if it can be measured, we can come up with a cost for it. So what you see in the monetization book is not so much how it's going to work for your organization; rather, I put in 17 different cases where you can look and ask, would this be useful to apply to me? I don't know your organization, so I don't know what would work, but you're more likely to recognize something in there that you can see being successful. That's another presentation in itself. Where were we? Yes, back to the questions again. Thank you, that was a great one.
I'm keeping a list. There you are. I'm a student and completely new to data analytics. What are some key things to keep in mind as I go through my program when it comes to understanding, applying, and practicing data management better, versus what my instructors are teaching me? Learn to speak the language of the business. We call them business rules, which means going out of IT and over to the business people to ask questions. Why don't we know how this works already? This is part of our job; we understand the organization and we support it from a systems perspective. We should know more about this, so learn to speak business. How does a data and analytics COE, inclusive of GM practices, define its core pillars? What I have implemented is data quality, data platform, et cetera. I missed one word, Shannon, could you read it again? How does a data and analytics COE, inclusive of data management practices, define its core pillars? So, COE is a center of excellence, which is one approach organizations are taking to centralizing and rapidly developing the internal expertise, the capabilities I've been calling for. The idea around the COE is that most organizations look at this from a cost perspective. But nobody looks at your HR department from a cost perspective. Nobody says, you know, I think we've hired our last person; I think everybody is going to behave themselves from this point onwards; we'll be just fine; see you later, HR, we're going to redeploy you into the data and analytics world because clearly we don't have enough people there. The challenge for organizations is to understand that your data program needs that same type of longevity. You will no longer need your data program when you no longer need your HR program.
Now, what size it will be, how big, what expense, the mixture of consultants and staff, that's going to vary widely across a number of different factors, and across the different levels of maturity your organization goes through. You may employ some additional labor and resources to clean up and eliminate your data deficit, but once you've eliminated it, your existing staff may be able to handle things. Rest assured, though, there will always be data things that need to be done with our organizational data asset. It's the only asset we have that doesn't deplete, doesn't degrade over time, and is durable in nature. Doug Laney also likes to say that it's very difficult to spill, and very difficult to clean up once you spill it, which is another great line. How do you control the disruption caused by modern data techniques or technologies that usually focus on consumption and use-case delivery? I'm not sure I have a good answer for that, but if technology is disrupting your organization in a bad way, one should question what the goals of all of this are. Disruption by itself is not necessarily bad; there's an entire school of management study around disruption. Joseph Schumpeter, I think, was the one who focused on that area. But if technology is disrupting things in a bad way, that's probably something you need to back off from and see exactly what's going on. Many organizations have these kinds of challenges when technology changes and you bring in new and exciting capabilities; at the same time, 85% of data lakes are complete or partial failures, according to Gartner. It's not that data lakes are bad; it's just that we haven't quite figured out how to use them in a very productive way. I'm not sure how much more guidance to give you without some more questions, and we'll probably run out of time for that, so feel free to pick up the phone and call.
Going back to the earlier question, Peter, that asked: of all the issues, is the level of exponential data growth the major reason why industries haven't moved the needle? The questioner added that they did not mean to provoke you, and that they enjoyed working with you. And: are there specific resources available that show monetary metrics per industry for baselining or benchmarking data management, like a universal index? No, nothing that I've seen; these are all very nascent efforts. And I can assure you that all of the efforts I've done around monetizing come up with not a total cost of ownership, which is really the approach we should be taking, but instead a minimum that we can absolutely test for. I'll give you a very quick example. There was a data conversion project going on between one organization and another, and they were haggling about the process, and they wanted to put a measure on how much time it was going to take to correct a data error, because there were, quote, thousands of errors. Okay, well, if each error takes five minutes to fix and you're paying people $15 an hour, there's a dollar cost. Does it represent the true cost? No, absolutely not, but it's probably enough that at least somebody's going to pay attention and say whoa, especially when you tell them that nobody ever fixes a data quality problem in five minutes. So if you double that number, and somebody already went whoa, you've got some real numbers you can start to put together. You spoke of the theory of constraints as an improvement method; how do you feel about Lean and Six Sigma? Very good. I use the theory of constraints because the book that narrates and describes it provides such a gentle introduction to all of these topics that it's actually quite useful. And to be clear, doing Lean Six Sigma or any other type of agile-oriented process is absolutely as good as this.
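That quick error-cost arithmetic can be sketched out directly. This is a minimal illustration using only the figures from the anecdote (five minutes per fix, $15 per hour); the error count of 2,000 is an assumed stand-in for the unspecified "thousands":

```python
# Back-of-the-envelope floor on data-error correction cost.
# Figures from the anecdote: 5 minutes per fix at $15/hour.
# ERROR_COUNT is an assumption standing in for "thousands of errors".

MINUTES_PER_FIX = 5
HOURLY_RATE = 15.00   # dollars per hour
ERROR_COUNT = 2_000   # illustrative assumption

cost_per_fix = HOURLY_RATE * MINUTES_PER_FIX / 60   # $1.25 per error
baseline_cost = ERROR_COUNT * cost_per_fix          # $2,500 minimum

# Nobody actually fixes a data quality problem in five minutes,
# so doubling still yields a conservative floor, not a true cost.
conservative_cost = 2 * baseline_cost               # $5,000

print(f"Baseline floor: ${baseline_cost:,.2f}")
print(f"Doubled, still conservative: ${conservative_cost:,.2f}")
```

The point, per the anecdote, is not the exact figure but a defensible minimum: any quibble with the assumptions only pushes the true cost higher.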
I generally don't advise people to use those, though, because those programs were introduced as silver bullets. So if you go into most corporations and say, hey, I think we should try Lean again, you'll hear, oh man, we tried that two years ago and it didn't work. Again, not through the method's fault, but through the way it was applied, which makes it really problematic. So I just found the theory of constraints an easy way to describe this; lots of people remember the story, and at least it made sense to them. There's a part in the book where there's a Boy Scout hike, and one kid, Ralphie, can't keep up, right? That's an indelible scene in most people's minds: whatever the slowest thing is, you can only make progress at that speed. Ralphie is the constraint. And I think we have time for a couple more questions here. The business always wants to see the impact of data management efforts as soon as possible, so how do you balance the quick way and the right way in this process? You are using exactly the right word: it's a balance. If you lean too far toward building capabilities, where in five years everything's going to be fine, you look like a science project. But if you deliver nothing but results without improving your organization's ability to deliver those results repeatedly, you are also cutting yourself short. So it is a balance, and you are the best guide to understanding what that balance should and needs to be for your organization at whatever point in time it is at. It is absolutely a balancing act, and it's got to be done very carefully. And could you elaborate or give an example of learning to speak the language of the business? Yes. When organizations talk about data scientists, they critique them by saying that they spent all their time doing something that wouldn't earn very much money, which is not a very nice thing to say about your data scientists.
But at the same time, if the data scientists had known a little bit about the business, or shown an interest in it, there might have been a more careful focus, because you could say things like: well, it didn't work over here, but if you had tweaked this just a little and known that we sell bicycles, what you were trying to do would have made a lot of sense and would have worked. But because they didn't even know the company sold bicycles, there was no opportunity to take advantage of anything along those lines. I'll give you one more example real quick, sorry Shannon. A data scientist told this story recently, and I thought it was a great one. She went running in to her CEO to say, hey, we got an 86% match on something, and the CEO turned around, red-faced, and said, we only do things at 100% at this organization. Well, that miscommunication was a wonderful learning opportunity, but it was a little scary being under the glare of the headlights. There was a person who clearly hadn't quite figured out what that language was. I think a communication class requirement would help a lot of degree programs. I think that's all the questions we have for right now, looking through. Likewise, thank you, Peter, for this great presentation. Just to remind everybody, you will get a follow-up email by end of day Thursday for this webinar, with links to the slides and to the recordings. And a couple of extra papers. Yes indeed, keep an eye out for those. Thanks, everyone, for all the engagement, and again, Peter, thank you as always. I hope everybody has a great day and stays safe out there. Take care. Thanks, Shannon. Thanks.