Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Officer for DATAVERSITY. We'd like to thank you for joining today's DATAVERSITY webinar, Where Data Architecture and Data Governance Collide, sponsored today by Reltio. It is the latest installment in a monthly series called Data-Ed Online with Dr. Peter Aiken. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them by the Q&A section. Or if you'd like to tweet, we encourage you to share your questions via Twitter using hashtag #DataEd. And if you'd like to chat with us and with each other, we certainly encourage you to do so. To open and access either the Q&A or the chat panels, you may find those icons in the bottom middle of your screen. And just to note, the Zoom chat defaults to send to just the panelists, but you may absolutely change that to network with everyone. To answer the most commonly asked questions, as always, we will send a follow-up email to all registrants within two business days containing links to the slides. And yes, we are recording and will likewise send a link to the recording of this session, as well as any additional information requested throughout the webinar. Now, let me turn it over to Chris Furbrie for a brief word from our sponsor, Reltio. Chris, hello and welcome.
Well, hello. Thank you. And thank you for having us today. We're Reltio. We're proudly sponsoring today's event. My name is Chris Islaw. Let me share a couple slides here with you today just to get us kicked off here. Hopefully, everybody can see the screen loading here. At Reltio, we believe that MDM is a huge part of data governance, a big part of this topic. Sorry for the flipping around. My magic mouse does magic that I didn't want it to do. We all know that there's a ton of data out there already, right?
And with the evolution of the cloud and everybody moving to the cloud, digital transformation is everywhere. We see an average of about 450 apps per organization out there in the world, and that's only growing, especially as cloud solutions and cloud applications offer an opportunity to invest in best-of-breed, more niche types of business-focused applications. And we see that creating a lot more data complexity, a lot more siloed and fragmented data, with the advent of cloud and everybody moving to that. So what happens when we don't manage, when we don't govern that data? I know Peter's really going to get into a lot of this topic with some great information a little more thoroughly today. But the things we see here at Reltio are that data issue identification is really inconsistent and lacks visibility, which makes it really hard to actually execute on your data governance strategy. We see when you don't do that, obviously, low quality data impacts the business in multiple aspects, both operationally and analytically, right? When you're not able to create the right insights, you make bad decisions, right? Data issues, the resolution of that tends to be very irregular and highly manual as well. We see folks, if you are at a more mature phase in your data governance, you've got application teams that are trying to govern data from different perspectives than you would from kind of the analytics and the old warehousing and data management type teams. People tend to be in there with more broad perspectives rather than application-driven perspectives where we can trace and we can see and we can align to very specific roles and responsibilities around data governance and stewardship.
We believe at Reltio in modern MDM, so cloud-based MDM, which allows us to actually scale and provide a lot more flexibility into the solution set than in past legacy kind of on-prem and enterprise type software. That means automation of data unification and continuous curation, meaning in real time as data is flowing through the supply chain: being able to identify issues, being able to report on data quality, being able to have that visibility across functions within your organization, and then being able to work it through a workflow that can be tracked and monitored, as well as having access to full audit logs so you can see what's happening with your data. What are we doing with it? Are we affecting it the way that we intended to? And it also means having the right permission sets, making sure that you can invoke and protect with the least access possible for all folks across the organization, so they have the access to the data but you're not unnecessarily exposing it in bad ways. So with MDM, again at Reltio we believe that cleansing, de-duping, enriching data, all the things that MDM can do for you, all of our core capabilities around data quality, support the structure and the architecture of your data governance strategy from a people, process, and technology perspective, and help bring it into action and make it real. So with that, my time is just about up. Again, we're Reltio, and we're really, really proud to be a sponsor here today. Hope to see you out there. If you have any questions please just go check us out at Reltio.com. I'll be here at the end for any other questions that you may have about how we feel about MDM's participation in data governance. But I'm going to send it back to Shannon here so we can get started and hear a lot of Peter's great topics and information and we'll see you guys at the end. Thank you for showing up today. Shannon?
Yes, thank you so much and thanks to Reltio for sponsoring today's webinar and helping to make these webinars happen. Again, as Chris mentioned, if you have any questions for him, he'll be joining us for the Q&A portion at the end, so feel free to submit your questions in the Q&A. Now let me introduce to you our speaker for the webinar series, Dr. Peter Aiken. Peter is an acknowledged data management authority and associate professor at Virginia Commonwealth University, president of DAMA International and associate director of the MIT International Society for Chief Data Officers. For more than 40 years, Peter has learned from working with hundreds of data management practices in more than 30 countries, including some of the world's most important. Among his 12 books are the first making the case for data leadership (CDOs), the first focusing on data monetization and on modern strategic data thinking, and the first to objectively specify what it means to be data literate. International recognition has resulted from these and an intensive worldwide event schedule. Peter also hosts the longest running data management webinar series, from DATAVERSITY. Starting before Google, before data was big and before data science, Peter has founded several organizations that have helped more than 200 organizations leverage data; specific savings have been measured at more than 1.5 billion US dollars. His latest is Anything Awesome. And with that, let me turn it over to Peter to get his presentation started. Hello and welcome.
Hi, Shannon. I should pay you by the dollar for all those wonderful things you say about me. It's fabulous to hear that and maybe we can shorten it just a touch to get us faster into the material. But yeah, good afternoon everybody. Lovely day where I am in central Virginia. Hopefully it's nice where you are as well.
And you can see I've of course altered the title just slightly to say that strategy is where data architecture and data governance collide. Let's sort of see how we get there, how we're going to spend our time. First of all, as usual, it's important to have good stories, good anecdotes, arguments that you can use to help people understand some of the aspects of data that are just generally not well understood. Then I'll define data governance briefly and then also define data architecture briefly. More importantly, we'll look at working the two of them together; if you look for opportunities to work with them together, they will present themselves. And that gives you the ultimate, if you will, in metadata reuse. We'll get to our takeaways and Q&A a little bit from now and invite Chris back on. Look forward to having him join us for the Q&A section for all of this. But let's jump in and get started on talking about what's actually going on. So if we start with data as today's most poorly managed, but powerful and underutilized, organizational asset, we can pull some things together fairly quickly. Data is relatively, at least as far as most people are concerned, unlimited. It's not really terribly visible, absent some competent visualization expertise. It's cheap to transport and move. It's free to make copies; as Doug Laney says, it's non-rivalrous. And of course, he also says it's impossible to clean up if you spill it. These make it a very unique proposition, especially in terms of assets. And yet when I tell organizations that 80% of their data is redundant, obsolete or trivial, the only argument I've gotten in 35 years is, well, for us, it's probably 85% or maybe even 90%. That's telling us something. And of course, when you add in the fact that it's of unknown quality, it becomes even more problematic around this. And data-specific education just hasn't been there.
And what has been there has, quite frankly, been infrequently and inconsistently taught. And most of all, it's missing a business context. So there's no sense of the value of the activities as part of a larger system, which means that data, fortunately or not, has had to be learned by every work group. In fact, the ability to exchange data is part of the definition of a work group in this. And I'm going to show one of my favorite little bits, which is just imagine if you had told people to go off and learn to play the piano and one had come back doing this. Now, of course, the problem is that this individual has in fact done what was asked, which is to learn to play the piano. But if our goal was speed or our goal was quality, perhaps we would have had additional guidance that could have helped this individual find where he was going to go. As it is, of course, he is wonderfully talented. And this is no surprise to many of you because we've seen this over time. IT has thought that data is a business problem: if they can connect to the server, my job is done. The business thinks IT is managing it; after all, what else would the CIO of our organization be doing? And of course, many, many things is the answer, which means the data has fallen into an enormous chasm between the business and IT. And collectively, we all have to work together to repair the damage. But data debt, unfortunately, isn't terribly easy to visualize there. It slows things down. It decreases quality. It increases costs. It presents generally greater risks around all of this. And if we don't proactively address it, it will continue to mount. My favorite current example is an article from Forbes from a couple of years back, just at the height of the pandemic, when the airlines weren't thought to be worth very much money.
The article made the statement that American Airlines' market value at $6 billion was somehow undercut by the $20 to $30 billion that their AAdvantage data (that's the frequent flyer program that they had) was valued at. And I attribute both of these. You can see a tremendous difference for the United Airlines MileagePlus data as well. This is because I believe many organizations have accumulated data debt that prevents them from monetizing these. You can be sure that management would love to unlock this data value as quickly as they can. And yet when we look overall at how we are doing around the world, a great set of surveys comes from our colleague Randy Bean, who every year produces a new version of the same thing so that we can see over time how things are. Now, just from the 2021 survey, most organizations are not driving innovation with data. They're not competing on data. This is self-reporting, so hopefully they're accurate around all of this. And then in the right hand corner with the vertical bar charts there, it starts in 2018. The question was, are your data problems primarily technology or are they people and process problems? And in 2018, you can see it was 80/20. Over the years it has varied, but nevertheless it has proportionately remained the same. Largely, what we're dealing with in the data world are people and process problems. And data governance is in fact what we need to do to address that. We'll get into that in just a bit. But here, and I hope I'm not showing it to you for the first time, is the Data Management Body of Knowledge, or as it's labeled, the DMBOK. And governance you can see is key to it, foundational right at the center, providing the connective tissue, if you will. And then we can have architecture here. These are the two components. What we were attempting to show in this was the areas that encompass data management. Unfortunately, many people took away the idea that you must do all of these things or you aren't doing data management.
So you can see a little question of cardinality would have been helpful there. Maybe we can get some help around that area to articulate it the next time through. Again, governance and architecture are what we're talking about today, and that's a good combination that I will relate to you on this. But they're even better, of course, in conjunction with each other. The key is, in fact, in most contexts, we need a three-legged stool rather than a two-legged stool. And so we add in one other piece, again, BI and warehousing. Or if I shifted it slightly, for Chris's benefit, we could do reference and master data down there. But these would be the elements, if you will, of our data strategy in here. And I wanted you to see this for grounding purposes, to know that many people are using this now as a way of communicating about these data topics. And consequently, it's sort of serving as a de facto standard around this. So let's press onward a little bit and define data governance. And the first thing to think of is that while we may enjoy talking about data, outside of our circle most people generally do not. And so you need to have some version of a pitch that you can work with them. You need to make them aware, at least with a rough idea, of where the organizational debt is or what areas of risk are associated with it. And then look at the process as getting good with some skill sets as opposed to with a tool set on this. So that's the next section we dive into. Data is not broadly or widely understood outside of our circles or perhaps even within it. It's kind of the blind persons and the elephant, where different parts of the elephant appear differently to different people at different points in time. And data, of course, comes across the same way. There are many different perspectives that people have on it. None of them are necessarily wrong. But having an incomplete picture does limit our abilities around this.
One of the most egregious is what Tom Redman detailed in a wonderful Harvard Business Review article that I have at the bottom of the screen there that talks about hidden data factories. And the idea is that department B, rather than fixing department A, actually just says, well, we'll just suck it up and do the work. And that's what counts as a hidden data factory. Similarly, department C has one of these in its work process. It's just like, yeah, we always have to fix the widgets from B because they never get them right. Similarly, if customer information comes back, that's another. There are three examples of what Tom called hidden data factories. And his cost on this was $3 trillion a year on the US economy. So as long as we're tossing trillions about, the idea, of course, is that these hidden data factories are pernicious. And we need to find and get them where we can. The hope for us is that when we look at data problems, they tend to manifest as multi-faceted challenges. In other words, almost no matter what we're doing, the people who are complaining about the thing that data is causing to be a problem are saying, something's not right, and they're filtering whatever they're doing through some IT system or a business process. There's nothing wrong with that. But only when you connect the dots are you able to see that these poor results have a singular cause. And this root cause analysis is one of the things that data governance needs to be able to do on a repetitive basis. In fact, the best approach is to develop a set of skills so that you can dive in with repeatable technology skill sets, as well as organizational skill sets. Now, I say that to get you on the idea that data as a common root cause is a part of every business challenge. And around all of this, we have had different forms of governance. At the corporate level, there's a corporate governance that needs to occur. On the IT level, it's an IT governance. Here are seven definitions of data governance.
And they're all done by friends and colleagues. And I understand the intent of what they're doing. But what I would ask you to think about is the elevator pitch. And I'll walk you through that in just a second, and I'll pick on our own DMBOK, which is the last definition down there. Somebody looks over at me and says, so Peter, I hear you're doing data governance. Can you tell me what it is? And if I say the exercise of authority and control over the management of data assets, which is a valid and correct definition, I probably won't get much engagement from the individual. See, the key with all of this is that management's attention is relatively short. And so when you get the opportunity, or they ask for the opportunity, to speak with you briefly on a topic, you have to have the ability to express it concisely, or you won't get asked back in for the next pitch. So from an elevator pitch perspective, data governance should be defined as data that is managed with guidance, or managing data with guidance, at which point somebody might immediately ask the question, oh, so we weren't managing our data with guidance before? I see that could be a problem. Now, as you work your way up the chain in organizations, I'll add a word to it, which is data decisions with guidance, because management oftentimes does not realize the impact of their decisions on data as an asset for the organization. This data governance process, or lack of it, can cost organizations millions in productivity, redundant siloed efforts, and hardware and software that should have been more aligned with the organization. If you discover after you've purchased something that it doesn't complement your existing data structures, you will incur a larger cost of goods sold, delayed decision making, inadequate decision making, and reactive instead of proactive type initiatives.
And a good statistic in here: 20 to 40% of all IT spending is something that can be impacted by better data governance around all of this. The poorly thought out and bad decision making is evident in this next example here. It's called the bad data decision spiral. And it's probably no surprise to any of you that organizations spend a lot of time and effort correcting bad data decisions that they make. And it leads them on this sort of a death spiral, if you will, where business decision makers and technical decision makers, not being data knowledgeable, make bad data decisions. Those bad data decisions result in poor organizational asset treatment, poor quality of data and poor organizational outcomes. And of course, the question is, thank you Morgan Freeman. I can quote him three times here in the process: it is wrong. It shouldn't be the thing that we do. And the most common example of this that I've seen over the last 24 months is organizations implementing a very fine piece of software, Salesforce.com, but making the technology availability date dependent on an IT-driven schedule as opposed to a data-driven schedule. And unfortunately, users cannot tell the difference between bad data in Salesforce and Salesforce. So all they come away with is Salesforce isn't working the way it should. This is a result of a bad data decision where somebody decided to optimize schedule over functionality around this, or perhaps over quality, depending on what goes on. There are a number of slides up here that are now reference slides going forward. And the first one is a set of goals and principles that you can adopt. These are straight out of the DMBOK in here. I'm not going to read them to you. In fact, I'm going to briefly run through each of them here: roles and responsibilities, deliverables. There we go. A scorecard of some sort, a checklist that you can put together, various components. And it does seem like a lot and it kind of seems overwhelming at first.
And what I want you to take away from this is more a sense that, rather than trying to accomplish all of this well and perfectly the first time that you do it, you should instead evolve your solution over time. Learn to adapt the things that are going on with it. Think about it for just a minute. As you start with data governance, these might be some of the things that you need to put together around this. I had one company call me up and say, I need somebody that can write 27 policies. Oh my goodness. These are terrific ideas, but they're going to occur once, the very first time that you do it. And instead, what you're going to quickly move to is a planning exercise. And if you have a choice of investing your time and effort, rather than getting better at something that only occurs one time, get better at the repeatable process. If you haven't seen it already, it pops up very quickly to say plan, do, check, act. It's a wonderful paradigm for thinking about how we're going to use data governance to help focus on the thing that really needs to happen, which is better use of our organizational data in support of the organizational strategy. Now, as we dive in a little further, one of the things that we have to be careful of is something that is properly labeled by Hans Christian Andersen, The Princess and the Pea. You'll notice the pea is very much down here at the bottom and the princess is discomforted up here at the top. And the reason that's important is because if you do a poor job with data governance, you will have a failure to understand both proposed and existing services. Any imperfections in the data structures are locked in for the life of the app and must be corrected either with application code or stored procedures, other things along those lines. It really limits the kind of benefits. Again, it's as simple a question as, might we do business internationally? If so, we'd better build that into the system architecturally instead of trying to retrofit it afterwards.
It decreases the amount of leverage that you can get out of your data. 40% of IT budgets go to migrating, converting and improving data around all of this. And the problem again with poor data governance is similar: just like data debt, it takes longer, costs more, delivers less and presents greater risk in moving forward with all of these activities. So let's define very briefly now data architecture. I'm sorry, I went very much too fast back up the area. It is ubiquitous, but at the same time not well understood. It helps to keep things focused primarily on strategy, which is the main takeaway from all of this. And once you realize you can't use what you do not understand or can't even find, it really does tell you what the particular value proposition is that you can use. So architecture as we're introducing this is about things and the function of those things, whatever they happen to do individually, and how these things interact, either as a system or maybe even towards a goal. So you can notice the different types of architectures that can be managed by businesses. I won't walk you through each of them, but I will say that the main thing that happens from management's perspective is that you have a bunch of people in a bunch of meetings doing committees. If only one in 10 organizations manages one or more of these formally, you've got to make sure there is a value proposition around this and that people understand what you're getting out of it. I've worked with organizations that have had architecture taxes. I've had different types of constraints that play in with this. Again, let's just be very clear. Your organization, to be functioning, has a data architecture. All organizations have data architectures, but if you don't understand them, if they are not documented, they can't be useful to the organization, and that applies to all the other types of architectures that are listed down below there as well.
Again, if we don't want to do business anywhere in the rest of the world, then by all means leave off your ability to add in new concepts around multi-currency, multi-lingual types of things. But you'll see the main lesson from this, which is by the way a BMW commercial I found out there: you can't architect after your implementation. This is one of the things that architecture does: it constrains and says, gosh, it'd be great if we could build from the top down, but we can't, and therefore it won't happen. If I was the architect of the pyramid and my boss said, look, I told you to build a swimming pool in the basement, no matter what I tried, the pyramids are built of large rocks on top of shifting sand, and so the ability to retroactively go in and put a swimming pool in the basement of the pyramid is definitely not a thing that's going to happen any time soon. That would have been an initial requirement, but not one we can retrofit afterwards. So how do we organize various components into architectures? And the first piece of it is to look at the details that are organized into larger components. And this gets us into some things that are generally pretty intricate. And so from that perspective, it's worth keeping that adjective in mind around this. The larger components are then organized into models, and these models introduce dependencies. One of these must occur for each many of these, or one for one. Again, there are a few simple rules that we use in data, but we nevertheless lock in a great deal of information around these data structures that we put together. And finally, we pull them all the way together into architectures to define them around a certain component, a certain activity, a certain purposefulness that we're attempting to use moving forward around this. And just the same way, if I go around and say, how do the data structures work?
Well, we have attributes that handle the intricacies that we're talking about. We have the intricacies, excuse me, the entities that handle the dependencies around this. And we have examples of both of the first two. One of them is sitting right there: thing, and then the thing ID, thing description, status, the text to be assigned, and the reservation reason. All of those are characteristics that relate to the intricacy of the system. We can add even more detailed amounts onto it. Sometimes it's useful, sometimes it's not. The question, of course, is what are you attempting to express in terms of the requirements that you're putting together? The next one down, the model there, just shows two entities connected to each other. They're using what's called standard crow's foot notation, which you can look at and see zero to one. And we've used a little bit of information engineering. And there's zero or many, which is kind of an ultimate sort of thing, a time-based constraint on the relationships. Let's save that for a more advanced class, but you get the picture of it. Finally, of course, we can put all of these into models. And the models, again, should be expressed as an architecture with a purposefulness. That purposefulness is something that says, we're doing this because, and again, it's either in support of the mission or the money, depending on whether you are private sector or government. Why don't we have any examples of this? Well, it's really hard to kind of get into one of these things. Here's a typical architecture that could be something that we use. But again, it would take us a while to figure it out and get through it. So it's a lot of complexity. You have to devote a lot of time. And you have to see enough value that you'd get on the other side of it.
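To make the attribute, entity, and model layers described above a little more concrete, here is a minimal Python sketch. It is an illustration only, not something shown on the slides: the `Thing` attributes follow the ones named in the talk, while the second entity (`Reservation`), the `Cardinality` helper, and the `violations` function are hypothetical names assumed for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cardinality:
    """How many related records one parent may have (crow's foot style)."""
    minimum: int
    maximum: Optional[int]  # None stands for "many"

    def allows(self, count: int) -> bool:
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum


ZERO_OR_ONE = Cardinality(0, 1)
ZERO_OR_MANY = Cardinality(0, None)


@dataclass
class Thing:
    # Attributes carry the "intricacy" of the entity.
    thing_id: int
    description: str
    status: str
    reservation_reason: Optional[str] = None


@dataclass
class Reservation:  # hypothetical second entity, assumed for illustration
    reservation_id: int
    thing_id: int  # dependency: points at Thing.thing_id


def violations(things: List[Thing],
               reservations: List[Reservation],
               per_thing: Cardinality) -> List[int]:
    """Return the thing_ids whose reservation count breaks the rule."""
    counts = {t.thing_id: 0 for t in things}
    for r in reservations:
        counts[r.thing_id] = counts.get(r.thing_id, 0) + 1
    return [tid for tid, n in counts.items() if not per_thing.allows(n)]


things = [Thing(1, "projector", "active"), Thing(2, "laptop", "active")]
reservations = [Reservation(10, 1), Reservation(11, 1)]

print(violations(things, reservations, ZERO_OR_ONE))
print(violations(things, reservations, ZERO_OR_MANY))
```

The point of the sketch is that the relationship rule is part of the model itself: changing "zero or one" to "zero or many" changes which data is considered valid without touching any attribute.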
So I'm not pooh-poohing enterprise architecture at the highest level with these enterprise models, but oftentimes they're not providing the value that organizations want to have from them. And the organizations don't kind of know how to make use of them in order to do that. One of the more fundamental reasons for all of that is because the basic understanding of what data is is really not well understood. So I'll start out with the number 42 as just sort of a random fact. And we'll say that number 42 can stand for a couple of different meanings in here. We might say in the first instance that it's Jackie Robinson's jersey number. It might be the answer to life, the universe, and everything, or it might be my age 21 years ago. Those are three different meanings. They could have different perspectives on them. Our first challenge looking at data is to filter out useful data and try not to look at all the data, or we will become overwhelmed in the process. Then, of course, we can differentiate between data and information by the fact that there has been a request for that information. And so that will help to definitively tell what's going on there. It also lends credence to the statement that you could have data without information, but you can't have information without data. So we need to be in there at a fundamental definition level. Finally, how that information is used is where we start to get business intelligence and the things that go into warehousing and other things like that. That in and of itself, as a model, is an important data architecture. And just to understand the meta metadata on that, it really is important to get that part down. So feel free to take that; it's based on Dan Appleton's definition from a number of years back, but we've used it for a long time, very successfully, and it answers lots and lots of questions. So the next question that you might have is how are data structures organized to support strategy?
And the first thing to consider is, what about the opposite question? Is it absolutely certain that your systems were explicitly designed to be integrated or otherwise work together? The answer is probably no. And if not, then what's the likelihood that they will just happen to work together? And again, the answer is no. So there's a lot of optimizing. And most importantly, none of them can be helpful as long as their structures are unknown. So let's take two different approaches in here. One of the approaches might be that we're in a restaurant now. And the restaurant is showing the dishes in the upper right hand corner of your screen there. There's just a different dish type for each type of dish. There's an apple pie dish separate from a peach pie dish. You get the picture, just a lot of them. If I'm trying to achieve efficiency and effectiveness, that's going to require a lot more logistics, a lot more back supply of plates and things like that, than if I instead go for rapid implementation, in which case we just grab the next plate off the top of the stack in order to implement the dessert and make the customer happy in terms of the customer delight experience. So you get the picture on here. You can architect these systems to be put up in the right structures. And this is an important constraint on how the rest of the system will be able to be used going forward. And the last part of this is we don't want to get sand in the gears. And this is another unfortunate role that data can play, where you drop sand into these gears and it gums up the works. And it really makes for a terrible tasting data sandwich. Yes, it's around lunchtime here. So we'll talk about sandwiches for a quick second. And I bring this up because it's important to be able to articulate and understand the role that high performance automation plays in this. Apple has 16% of their business that they do 100% automated. Now that's a pretty large number around that.
And organizations that are trying to do that are hindered by the sand, if you will, in there around data literacy and uneven data supply and uneven use of data standards. And the more that you can focus in and try to make these things work together in a real engineering-focused environment, pulling them together and making them function the way you'd like them to, the better. You can't do that without an investment in architecture. And I had to go all the way to a tea farm in India to see a wonderful Deming quote on the back of the cash register: quality engineering and architecture work products do not happen accidentally. Of course, if I add the word data in there, as I'm always wanting to do, you get a very, very big piece that comes out along those lines that says, hey, you can't do quality engineering or quality architecture without actually having the things in place that you need to have. So quality data architecture work is an important part of high-speed automation. And we've got to be able to have that focus in there as a goal, because if we're not trying to focus on it, it simply won't happen by itself. As for one of the ways we protect against those bad practices: this is actually the location I'm speaking to you from today, my home in Montpelier here. And this is a picture from a couple of years ago of where our barn was getting ready to be. And you might ask the question, why am I showing you guys a bunch of pictures of my barn? Well, it's to illustrate a point. And part of it was that I borrowed money from the bank, and the bank gave me exactly as much money as it took to build the barn foundation, no more, no less. And they said, go and get a barn inspection before further construction can proceed. Now, being what's called a horse husband, this means that if there was ever a problem with the horses, I would always take care of the horses to the exclusion of paying off the bank loan. Please don't list that bank in this.
And it's a little bit more difficult, but the bank wants to make sure you've got a good foundation. And so before further construction can proceed, you get a foundation inspection from the county. And this is to document the passing of that foundation inspection. By the way, it makes good business sense, does it not? The bank would prefer to get paid back. So they want a good barn on top of a good foundation. And therefore they won't worry about getting paid back. It'll be exactly what they want. But there is no IT equivalent to this. And this is a real challenge for us all the way around. So a couple of quick takeaways on this architecture thing. What is the data architecture? Well, the first thing is that it's a structure of data-based assets (not database, but data-based) that supports the implementation of strategy. And most organizations have these assets, but they are not supportive of the strategies. That means their information architecture, the data architecture, cannot be helpful. So the question becomes, how do we make better use of the things that we have? And this use question is the application of these data assets towards strategic objectives. That's the purpose of the data architecture. These are going to be relatively assessable by what are now a number of standard ways of assessing the maturity of the practices. But increasing these capabilities as an organization in data architecture allows increased capability, dexterity, and self-awareness. And that's an important component: to know what you're able to do as opposed to just what somebody says you ought to be able to do. And this is done through the use of data-centric development practices. Again, the topic of another lecture. But the koan about all this is that with continuous redevelopment, the starting point isn't really the beginning. And that's what you're going to encounter.
The only time you're going to encounter an organization without a data architecture is an organization that's in startup mode. Everybody else has one. It's a question of just uncovering it, understanding it, iteration by iteration, focusing on one component at a time. And there's a formal transformation process that we won't get into in this type of a context. So we've gone through governance and architecture. How do we get them to work together? And the key is, of course, to focus on strategy, to upend the traditional approaches to these kinds of things and think of things along the lines of something that I fondly remember from my youth called defensive driving. And we'll finish off with a little bit of storytelling going into this. So first of all, let's define strategy. And the idea is that we didn't use this as a term much at all before the year 1950, when a bunch of management consultants discovered it and said, ah, it's a great idea for a big plan. And we can have this master sort of thing and put together 100-page documents and 1000-point PowerPoint presentations. And yes, I have seen 1000-point, excuse me, 1000-page PowerPoint briefings on strategy. And this is, of course, when strategy becomes a thing. Instead, we should go back to the original military definition, which was a pattern in a stream of decisions, clearly indicating much more of a process focus, a capability focus, as opposed to a thing that gets put on the shelf. A little bit of guidance on data strategy: what is a data strategy? It is the highest level of guidance available that focuses on those data activities that will impact business goal achievement. And that can only be done through a data governance organization that is tightly integrated with the strategic planning process.
It provides guidance when faced with uncertainties and allows sort of a true north, if that's what we're all headed towards in order to do this, typically involving a balance of remediation and proactive types of measures. The wrong way to think about this data strategy is the way most organizations have traditionally done it, through no fault of their own. We had organizations and we had IT: the IT strategy, of course, supports the organizational strategy, and the data, being a part of IT, should support the IT strategy. The reason that doesn't work, and I think we're confirming it, is that IT correctly chops things up into projects. And projects are not going to be helpful to us if we're trying to implement a program around this. So we need to move these around correctly, the IT strategy and the data strategy, and I do say that the data strategy outweighs what IT is doing. After all, IT is just delivering our data to the various business processes and practices that are around in order to do this. There is, of course, pushback, and there should be cooperation all the way around. When we look again, the idea is to ask what data support we can provide that will help the organization better achieve its strategy. Then we take that strategy, that data strategy, and push it into data governance and say, what can data assets do to help better support the strategy? That's important for a couple of reasons. One, with data governance, you're generally completely ungoverned when you start. So the question is, where do you start? Not with the A's. So use this type of an approach here to say, how well is this stuff working? Are we able to focus in on the things that we're trying to do? And how, of course, do we implement these? We go through a series of stewards around this. And the stewards have to be taught these business goals and the metadata as the language, the controlled vocabulary that you use in order to do this.
And of course, this only works if you have a trusted catalog in the process. If you haven't got the ability to put this down someplace where everybody else can get to it, it's a real problem. Now, hang on for these next couple of images here, because if you're not familiar with conceptual, logical, and physical models, as-is and to-be models, and validated and unvalidated models, it might be a bit confusing, but I've tried to present it clearly. And remember, we are recording this. So let's just start off with a three-dimensional model. That's the warning. And what you're seeing there is everything that is as-is. The icon that's to the left in red is a toolbox. Here is to-be. The difference, of course, is that as-is is what I currently have. And to-be is what I'd like to have, how I would get to that in the future in order to move forward. Okay, so that's one way of defining this model evolution framework for data architectures. This is how we're going to come about and figure out what parts of governance and what parts of architecture we can add to this nice recipe here that will give us the results that we're attempting to get. Now, the next part of this is that the same model can also be chopped up and looked at this way. All models, all model components, are either validated or unvalidated. Some people prefer the word draft. I like the crossed fingers and, you know, kind of hoping that it's going to be the right kind of thing there. But that's another way, where we start off with as-is and to-be. Now we've got validated and unvalidated. And yes, I'm going to add that third dimension, which is conceptual, logical, or physical models. And the reason I'm showing you this is because everything that we do inside of data architecture follows a path of going from something conceptual in nature that's typically in a binder of some sort; it's a requirement of some sort. And then we go to a design. This is the logical piece that we put in place. And this logical design is a model of some sort.
Finally, we go to a physical design, which is the implementation of the system. These are rows on the Zachman framework, if you want to think about it that way. Every change in your data architecture can be mapped to some subsection of this larger model, this three-pronged model that I have shown here. It's very important to keep that in mind, because if people can't express it in these terms, you haven't done a good enough job with your understanding of the problem to be of use to them as a data person around this. Again, we can hit that at the Q&A if that's not entirely clear. Let's just look at another challenge that we're facing as a community in general. It's worth putting out there just to understand this, which is that the only thing that we teach them in the university is to work only with validated requirements. That's really nice. And we only teach them about forward engineering, which is building new stuff. And unfortunately, that is only 20 percent of the effort and funding that we spend, and we spend 80 percent of our time reverse engineering. This is a different process. This is going backwards from our existing environment to understand it. You can see the definition down below. My friend Elliot Chikofsky contributed to that. It's the idea that there are structured techniques that give us rigorous knowledge of the existing system in order to leverage enhancement efforts going forward. This is the idea of going from the as-is to the logical, and from the logical to the conceptual if it's necessary in order to do this. We do that, of course, to understand its strengths and weaknesses. After all, if it has good things that it does, we should incorporate that in the re-implementation down below. And if it has bad things that it does, we should make sure the new system doesn't do that. But it doesn't help to reverse engineer it if we don't also use that information going forward.
And only when we have informed the design of the new system can you call it re-engineering. And most digitization is failing because these initiatives are not aligned correctly. Again, let me take you into the concept of this. This is the idea that an organization may start out by doing data governance and say, well, I don't have much feedback. So I don't know much about what's going on. But I know that I'm supposed to do data governance because our regulator told us, or our consultant, or whatever it is. And what is it that's happening? Well, people get the picture that this improves data over time. Nobody expects you to start on Monday and have tangible results by Friday. Nevertheless, people feel that this is often slow, and particularly slow given the kinds of investment that they're estimating. So other organizations are also including in their data governance roles these data improvement projects, where data improves as a result of a focus. You can see that this is perceived as a faster pathway in order to do this. This allows us to collect better data about what's going on, put an infrastructure in place, and finally get to the point where something good happens with data. And that's wonderful. Unfortunately, too many efforts stop here. The sign I'm using here is approximately equals, because we aren't and haven't been good at doing this, but we should get better. And when something good happens with the data, let's translate that immediately, and practice translating it, into monetary terms as quickly as we can, because only when we understand this component will people understand that the data architecture component makes the sizing of these projects much, much easier. Data architecture is what tells you how big a piece you can change without starting to impact other types of things in the organization. I was talking to a group this morning that had a system with 100 connections to it.
And they said, oh, we're arguing about whether it's going to take one year or three years to move all these connections. And of course they knew that much. They knew the names of the interfaces. And I said, nobody can tell you anything at all. It's a very difficult challenge. These architecture components help you make things concrete. Another way to think about it is using a lighthouse metaphor, which is that we've already talked and said that data strategy is the idea of using data to support the organizational strategy. So there's a limited number of things that will most help us there. Let's intersect that loop with the use of data by the business. In other words, what is something that the business would benefit from understanding more clearly? Even if it's just defining existing terms. I've got lots and lots of these charts around that show how this insurance company defines these four key entities in the system. And everybody gets on the same sheet of paper very, very quickly. Finally, ideally, if we've got the ability to pull three things together, let's find out what data skills the team needs in order to do this. And this is what helps identify the sweet spot. Again, constrained by an architectural component that will help us understand this. See, there are basically four types of participants in this. Notice it's all built on an IT systems development process. We have to have that as a baseline or nothing else works around this: necessary but insufficient conditions, of course. And we typically circle the left-hand side of this, where many organizations say, this is what I'm going to call my data governance community here. But what is it supposed to do? Well, leadership, of course, are supposed to get resources around this. They're going to listen and get some feedback. They're going to make some decisions. Those decisions will be passed to the stewards.
And stewards will try to translate that policy into action so that people who are the subject matter experts, the SMEs, can help make changes in the organization and play a very, very positive role in order to do that. Again, there's going to be probably some feedback that comes back and new ideas that pop in. But you get the basic picture here. Again, data architecture grounds, slash constrains, all of these conversations, if you have these few diagrams that I've shown you in here of how these things work. And again, it's a combination of that very detailed enterprise architecture diagram and other types of close-up data model components and things like that. How should you think about this? Well, think about it as a fire station. I live in rural Virginia and we have a volunteer fire station around the corner. That means that no matter what happens, something's going to be different in order to do this. And there's a TV show called MacGyver that we like to use as an example of how data governance should be used. If you haven't seen this show, take a look at it some time. Again, he puts duct tape to the most productive uses that you can have. Similarly, though, our firefighters don't just stop at the fire station. They do the heroics of MacGyver, absolutely. But when they have downtime, they are replacing things like batteries in smoke detectors. They're doing fire education. Again, many of you may not realize this, but the incandescent light bulb has now become the third largest cause of fires in indoor settings. It's a good reason to upgrade your light bulbs even if you don't like the fact that we can't use incandescent ones anymore. It's just eliminating fire sources around all of this. So it's a very big challenge with all of this. What we're trying to do is help organizations by understanding that the typical question that is asked is, where am I going to start managing all my data? And that's not a good way to think about it.
Instead, say, well, I wonder what value, what scope, what qualifies something to be worthy of being governed by our practices. Now, I don't want to sound haughty on the process, but you see how it transforms it. Let's keep it. And most importantly, regardless of the decision, document why in some place where people can easily search and find this information. I liken this to, and again, you guys know my age, I graduated from high school in 1977, from McLean High School in McLean, Virginia, and they had this thing there that they called the seatbelt convincer. And it was a wonderful thing. It still is. You can find them at almost any state fair that you go to. Why am I harping on about this particular weird activity? Well, if you've ever sat through one of those, you realize that it simulates a five mile per hour crash. Anybody who has ever done that will instinctively reach for their seatbelt, because this kind of thing you do not forget. It is literally the convincer. And it's difficult to insert this kind of diagram, just the same way it's difficult to insert things into your data governance and data architecture components after the fact. So you want to have these planning sessions together. Defensive driving was a wonderful way of doing this, but my understanding is that it's not taught this way in schools anymore, which is a very sad sort of prognosis and probably more dangerous. We also are seeing tremendous increases in the data around automobile crashes, and they're going up at a crazy rate. I'm showing you a "not to" slide here. Don't try and read this. This is just the official definition from the PMBOK, the Project Management Institute's Project Management Body of Knowledge, talking about the difference between projects and programs. And your task as data professionals is to make sure that everybody in your organization understands that your data program is going to last as long as your HR program. There's simply no point in trying to do anything less in terms of this task.
Do you think that data analysis is going to be required less in the future than now? Do you think that the volume of data that is relevant to your organization is going to decrease? No. The answer to both of these questions is no. It will continue to increase. It is something that you're going to have to pay attention to, and the sooner the organization makes this commitment and gives those marching orders to the data program, the better. You'll find in just a minute that the data program is what you should be using to help explain this to everybody else. I'm going to give you a quick snippet here from my data literacy book around this. These are knowledge areas that we have at what we're calling a level three. And I'm going to tell you, first of all, a second elevator story. And that is a story about the MITRE Corporation. It's a very nice group. I've worked with them and they do wonderful work up there. This is just a story. It could have happened to anyone. It's nothing specific about MITRE. But the data scientist that you see here, pictured as a data whisperer, was trying to document a business rule, and the question was, could a project be owned by multiple departments? And they kept asking the question of who was the policymaker, the executive that was in charge, and the executive in charge said, that's technical stuff. I don't deal with that sort of thing. And so the data scientist correctly went to the person who designed the database that kept track of this knowledge and said, where does this come out? And they said, oh, we put it in there a long time ago. Multiple departments can own multiple projects. Well, that's an important distinction, and the executive was actually furious to find out that it had happened, because it was of course, just for starters, against policy. You can see here that the lack of data curiosity on the part of the executive was a problem. Let me say again, this story is at least 20 years old, so there's nobody still around doing that.
They're a much different organization. I don't want to say anything bad about a group that I've worked with very well over the years. So what are some other knowledge areas, data literacy components, that a citizen should have, and a knowledge worker in particular, not just a data scientist, which is a wonderful category, but just a knowledge worker in general? And yes, you should understand the concept of a data steward in there. It's very important. You should understand the importance of being able to demonstrate value on a reasonable basis in order to move forward. You should understand that keeping your skills and the data and the methods that you use current is absolutely key, and that there are certain fiduciary responsibilities that you already understand elsewhere in your life when you're dealing with matters of health and wealth and legal. These are areas where you expect a fiduciary responsibility. There are components around data here that we need to understand. Finally, the last concept, just as a citizen knowledge worker: if we don't understand that most of the data organizationally is swimming in the same swimming pool, that we are all sharing the same fate, we have a different challenge coming up in terms of how we understand this. I use the word here, though, and I said it earlier on: it's very important not to try and explain all of this to everybody. It just doesn't work. You've got all sorts of data things that you're doing of which the public will never hear. They might find out about your data governance program and your data architecture program, if nothing else to say, why on earth were you listening to Peter for an hour the other day on this? And the answer is, unfortunately, people don't understand this. That's the wonderful sound of Charlie Brown's teacher, if you remember that from the old days. They don't hear anything besides data. They don't appreciate this. So talk to them strictly about your data program in general. It's a much easier conversation.
You'll find that people will pay more attention to it and understand it well. So let's look real quick at what we've covered as we get ready to wrap up here. Again, the idea that data has these weird characteristics, that it is not being taught properly, and that people really don't understand what's going on there has led to this very uneven understanding of it. We've tried with the DMBOK to bring it all back into one place so we can start to use it. Data governance needs to be simple enough that executives will understand it and useful enough that they'll want to pay attention to it, so that they can tie it directly to the costs of things such as rising organizational debt. And the difference between adaptive and prescriptive types of approaches is important to understand, and to understand well enough to be able to apply the right technique. Similarly, data architecture is everywhere and not well understood. It's one of the easiest things to derive from your existing environment. And many of the CASE tools and services that are out there on the market do this. We'll have to ask Chris and see if they do anything in that area. I think I saw it on one of the slides that I was looking at. The key is to keep improvements in data architecture focused on strategy so that people understand how you're making a difference. Because if you didn't explicitly design your data systems to support strategy, using data in the best, most optimal way possible, it's unlikely that they just happened to end up that way. You can't use, of course, what you can't understand, and that is the most basic component of all of this. And pull these things together, not by looking at this as a traditional exercise, but instead by kind of upending it and saying, let's really look at it from a defensive driving perspective. Understand the importance of storytelling, but don't bore them with everything that's going on in order to do this. So just a couple of quick takeaways and we'll get to the Q&A.
The first one is, of course, that the need for data architecture and data governance is increasing. You'll see a lot of data architecture jobs actually advertised today as data engineering jobs, because the companies don't quite understand. It's good to understand the details, but you also need to understand the whole picture around this. Again, there's the increase in data volume, and the fact that data architecture has been around for quite some time, but data governance is still a relatively new discipline. And it's probably going to be one that is personal to the organization. We use the word bespoke, or custom fit, around that. It's got to conform to existing constraints, and there is no single best way of doing this at this point in time. Taking both of them together and focusing both of these activities on strategy gives you the opportunity to re-engineer, or what people now call digitization, opportunistically, finding the areas that can pay off on this and using it as a model so that you can fund future efforts from the savings you've made out of the existing practices, so that you can quantify strategic improvements, and so that you can implement all of this in a very concise, programmatic implementation. If it's just reacting all the time, you're just swatting at flies. If you don't take a disciplined approach, then I think the organization is really selling itself quite short. Next up on this is the idea that assured focus produces reusable shared results. It's the idea that if we approach this as a program, we expect there to be residual knowledge. One of my favorite CIOs that I ever worked for would tell me repeatedly, Peter, I don't understand what it is you and data modeling and data governance do, but I do understand that my team uses the diagrams that you produce all over the world, and they're not just pretty. They are using them because they understand what we're doing. So the idea is to gradually add ingredients, get good at this, build up endurance.
Nobody starts a journey of a thousand miles without a single step on this. And finally, of course, learn the value of an appropriate level of storytelling, because not everybody wants to hear the detail, but they love to get the Law and Order version, if you will, on that goal. Of course, improve the effectiveness and efficiency of data governance and data architecture activities over time. And the only way you'll do that is by building up a competent team so that you're able to do this. And then all of this is enabled by literacy. Now, the one fact that I have managed to discover in the last couple of years is that 80% of executives around the table will pay me under the table to make them smarter, more data literate. The problem is they don't want everybody else to know, because they think they're the only one. My answer is, of course, rest assured, four out of five of you would be paying me in order to do this. It's a sad situation, and we've got to do everything that we can to try and get not just our knowledge workers more data literate, but our executives data literate as well. And with that, we'll move ourselves over into what's coming up. Next, we're going to be looking at what's in your data warehouse. And between that and the December one, hopefully we'll see some of you guys at DGIQ. Certainly the Dana crew will be up there, and we're looking forward to seeing everybody and getting back in person to something. And then we have our sponsors coming back in December for data management best practices. And then we start off the new year with data strategy. And we are at the top of the hour, which means it's time for your questions. Let's see what we've got. Hi, Shannon. Peter, thank you so much for another great webinar as always. If you have questions for Peter or for Chris, feel free to submit them in the Q&A portion of your screen.
And just to answer the most commonly asked questions, just a reminder, I will send a follow-up email to all registrants with links to the slides and links to the recording of this webinar by end of day Thursday. So there are a couple of questions here that came in the chat while people are typing in their questions in the Q&A section. You know, is the data strategy an input for data governance or a deliverable for data governance? It should be a complement of both, but I shouldn't have just jumped in there. Chris, maybe you'd like to hit that one too and forget what I said. The jury will disregard, right? No, no, no, no. I agree with you. I think they're complementary. Obviously, I think a lot of what data governance does, from the processes and procedures that you want to build around it, needs to help enforce strategy, right? And make strategy execute and actually happen. I've gotten a lot of mileage out of this because it's intuitive. I say this, I'm showing the diagram. And the reason we show you these diagrams is for you to use them. You do not need to make up your own version of it when you're explaining this to somebody else; just show them this. Yes, exactly as Chris said, what should the important things be that we're doing in data governance? And they should be governed by a strategy. And that constitutes plans and progress for what's going on in the use of that data strategy, the implementation of it, the data stewards as well. So again, great question. Thank you for that. Why is a data model being called data architecture? I'm pretty sure that whoever is asking that is asking, perhaps, well, about something like this. And there are a couple of shortcomings of this diagram. And the questioner is absolutely correct. This is a picture. And most people use it as an icon to represent that kind of thing. It's a series of connections of data. And the collection of those connections is what constitutes most places' organizational data architecture.
However, it is equally valid to show it as a representation using, for example, Zachman framework notation. The reason I was going to say that this is only a picture is that it's always incomplete if we don't have definitions. Of course, I haven't even shown you a label on here, much less any definitions. The best we could come up with might be to say orange pertains to this and green pertains to that. But it is a reasonable question to say, well, it can't be the case that our data architecture is literally the sum, the union, of all of our data models in our enterprise. And the answer is, well, if you haven't done anything to it, then yes, that's where it is. And gosh, Chris, if that's not a spot for you to step in and talk about the legacy that people have. No, not at all. I mean, I think there is confusion there: what is being represented as a physical data architecture is quite often the, you know, aggregation of all the data models, right? Because that's that physical construct that's tangible, right? And the rest of it is a little more abstract. Obviously, when we look at architectures from the perspective of enterprise architecture and a data focus at that higher level, then there are also, you know, standards and policy and things like that, from an IT perspective and specifically a data management perspective, that would also really be part of your package, if you will, of a data architecture. Again, another visualization of what we briefly hit on was the idea that your architecture is going to be in one of the following states: as-is or to-be, validated or unvalidated, and then conceptual, logical, or physical. And the sum of all of those can be included in an organization's architecture. Most organizations do not manage nearly this much metadata in there.
And part of your decision as an architect, as a data professional, is to decide what is essential in order for your organization to run, and then whether or not it justifies additional investments in those areas. But I think, again, everybody will agree, if you don't know what it is (I did that, I made it blank, right?), it's very hard to make use of it. And so even if you're thinking about buying commercial off-the-shelf software, one of the things you can do is determine the fit of the proposed solutions with your existing processes and existing data architectures. In fact, it's considered best practice to require that from the vendors, to say, as a mandatory condition, if you want to sell us the software, you must show us the conceptual, logical, and physical models, whatever it is that you have of these things, so that we can determine their fit. Sorry, Chris, that's just a bit of a high horse for me. Any thoughts on the concept around that? No, no, you can't see me, but I'm shaking my head and trying not to interrupt you and say, yep, yep, yep. Go ahead. We're fine. Everybody likes to hear it. All right, Shannon, what you got? So, in my experience, architects tend to focus activities on principles rather than strategy, to remain neutral to strategy changes. Can you comment on this tendency? That is a great question. Chris, you want to start there? I'm very opinionated and I don't want to dominate it. Please go ahead. No, I've seen this in my practice too. You know, I haven't always worked for a software vendor. I was out in industry and healthcare around data management long before I ever came into industry. And I've seen this too, Liz. So, yeah, I think where I saw this in my experience really is that there are principles that architects are comfortable with, and a strategy they're not always aligned to or comfortable with.
And when I've seen that, it was more a matter of, you know, kind of lacking the vision, I guess, in some cases with some folks, to actually understand and, you know, appropriately associate the principles that they're invoking and upholding with the strategy those principles support. Not having the ability to connect those makes them unable to differentiate them in some cases, or at least it's harder. I mean, one of the things that we've done, I think, rather poorly from an educational and a rewards perspective is that, at least in the computer science area, we tend to reward newness. My office at VCU is literally at the corner of engineering and business. And my dear colleagues in computer science will come down the halls and they'll say to me, Peter, if you ever run across any data that looks like this, we'd really like it. And I say why, knowing, of course, the answer is, because I've got this great algorithm. And while they get rewarded for this, they will not be rewarded for it in the corporate world if that, whatever it is, doesn't match whatever you need to have in the real world. I realize I'm being very abstract here. But the concept of business value of the results is absolutely critical around this. I'll tell just one more quick story, Chris. Maybe you'll have one that you can toss on the end there. There's a data scientist who gets in the elevator at one point with one of the executives of the organization. And the executive says, so how's your project going? I hear you're making progress. And the answer comes back: yes, we've moved it from 72 to 76%. And unfortunately, the executive turned into an absolute crazy person. The steam was coming out the ears. The face was red. It was, listen to me. You don't know what company hired you. Sorry, I'm bashing on the table here. I shouldn't do that. We never do anything here less than 100%. Now, obviously, a complete miscommunication occurred here.
And any amount of, you know, good discussion would have headed that off. But of course, it terrified everybody that was involved in the process. And what was even worse was finding out on the follow-up that, while they had hit the 76% this year, they could have actually been making money on the 72% solution that they had two years ago. That was two years of revenue that was unearned because the individual who was working with it valued the results outside of the context of the business value. Chris, did I give you enough time to come up with one, maybe? Absolutely. I shared a little snippet in the chat while you were speaking, and there was some great conversation there. I've had this experience. I've actually lived it. I've given a really crappy elevator pitch around specifically master data management in my lifetime and had those similar experiences of myself not being connected to strategy. Strategy is a business aspect. Strategy is always about making sure we're most effective and efficient around reducing or managing risk better and preventing risk, making money, and saving money. Those are the three things that a strategy, at the end of the day, really cares about. Those are the end results and outcomes. We have a hard time in data management across the board. Whether you're an architect, whatever level you are, it's really challenging to attach to that business outcome, and a lot of it is because it's not as direct. You do have to take a couple of steps. At the least, you're taking two steps to attach what's happening in the space and within the practice of data management to how that's enabling or helping to deliver actual business results. I'm just popping up the diagram in the background. As you said, it's harder to understand what the impact is on the business, but if we don't practice at it, of course, we will never make that particular transformation. All that said, here's data things happening.
Here's our imprecise mapping to organizational things happening, and the better we get at that process of understanding that things are happening here and that's making this happen over here, the easier it is to prove our value. Now, the other component I would say is that absolutely a purist could say, I will not pay attention to strategy, but I don't find that most data architects have all sorts of time and resources on their hands. Most of them are pretty darn busy, and so the ability to do interesting things is limited by the ability to demonstrate things that show that at least they shouldn't be the next ones laid off in the next round of layoffs around here. It's sad to say, but it happens over and over and over again. Training goes, and then it's, we don't really understand this architecture stuff, so we'll get rid of it, and, you know, if things don't get any worse, it must have been the right decision. Boy, that's a scary thought, isn't it? Chris, you're probably not old enough to know those sorts of things, but we used to take reports, and if we weren't sure whether they were being used or not, we'd stop printing the report. If nobody complained, it must have been the right decision. How's that for running a railroad? I remember those days. I actually started out doing COBOL, so that dates me a little bit. Yep, COBOL was my first assignment, where I had a goal of not losing more than half the class. That was the low standard goal that we had in that. All right, we're reminiscing here. Sorry, Shannon. Okay. At the end there, you had a nice segue into the next question, which is: why is data literacy ignored? Is it a symptom of the emperor has no clothes, since it requires data governance and having strong data management practices? I don't know what they mean by, is it a case of the emperor has no clothes. Chris, maybe you can read that.
Yeah, I think it's in reference to your previous statement around executives not wanting to admit literacy problems and saying, hey, I'll pay you offline to teach me how to be literate. I think it's okay. That's only human nature. Come on. I mean, we really can't get mad at people for not knowing something we haven't taught them. What we've taught them for the past 30 years is that the relational model is the answer to all of your questions. We've left out important details such as, well, maybe there are other practices around besides creating relational data management systems, such as master and reference data. They are rarely, if ever, addressed in curricula. They form an important component of the technology landscape. And it's worse than this. Again, I've got another whole talk that I do on these areas, but we've taught for 30 years the one data course that you get, and Chris, correct me if you didn't get the exact same one, in which you're building a brand new database in the process. Am I right? Yeah, absolutely. Now, if you think about it, that's the only thing we taught people for 30 years. What did managers learn? I only need those people when I'm making a new database. Now, that's a problem. If they're connecting two databases, they don't think, I'm creating a new database, and therefore I need those people with that skill. They instead say, I'm not creating a new database, so I don't need those people. And while I've got tons of graduates out in the world who are making money helping organizations use their data better, it is still a problem to make people understand why you need this as a corporate capability in their very, very challenging environment. The second part of it is, though, if the only thing we've taught people how to do, how to operate, how to build, is a new database, why are we surprised that we have so many new databases around?
Why do we need products like what Chris has in order to address some of those issues? Because there's no way you're going to do it automatically. Again, I'll give you just a very quick statistic here, and then, Chris, I'll shut up and let you have at it for a while, because you can see it's a great, great topic. One bank, fairly major, had a count of several hundred thousand Access databases that were in production. Now, if you know anything about governance and things, you know that you probably don't want to trust your key data elements to production Access databases, especially when Microsoft is sunsetting the product over time, so generally not good. It took them 10 years of concentrated effort to be 100 percent certain that they had gotten 100 percent of those several hundred thousand production databases converted into less risky types of environments. Talk about an enormous self-inflicted black eye, an amount of data debt that is just incalculable, although if I told you which bank it was, you could probably go back and do some calculations to find it out. Just incredible, the amount of self-inflicted wounds that they had here. And again, remember the moral of this story before we jump off of the slide: architecture is an overlay that you can add to these governance activities, or the other way around. You can add the governance activities as an overlay to architecture to help you to size and shape these things. And going back to the other question, the reason it's important to incorporate strategy in there is because that's the only thing that's going to keep you in a job going forward. I hate to say it, but I've seen people who say, oh, that's okay. My CIO has given me five years to actually make a significant difference around here. Well, CIOs only last two to four years, so I'm not sure they're going to be around for that fifth year in order to see that. All right, I've babbled enough. Chris, your turn.
Go for it. Oh, no. The public story is here now. No, no, no. I think you've hit all the key points that I would hit. So let's go ahead and move on to the next question. Very good. So in a new role leading open data, I have the directive to show value back to the business, since it costs them something to publish the data and we are required to in government. The real challenge is re-involving the business in the value discussion. How do you start these conversations? I can remember I worked for something called the Defense Information Systems Agency, and one of the things they were doing in those early days, it was the early 90s, was introducing the concept of project management to government-run projects. Now, you can imagine that's just an improvement in general, adding some discipline around it instead of it being sort of willy-nilly, or the way everybody thinks about it, which is, by the way, the way data literacy education happens in today's environment as well. The idea was understanding your project and then, of course, as quickly as you could, figuring out whether something was on the critical path or not. And if it was on the critical path, that meant you had to make sure that it didn't slip, or it would slip the entire length of the project. That's the definition of it. It goes back to our projects versus programs type of definition. And the idea of having a critical path gave people the idea that there had to be some way of doing it correctly or incorrectly. If you were working on non-critical path elements when you had slack and needed to work on those critical path elements, it made a difference. And it gave sort of a sense of right and wrong, a scientific, robust discipline to the process. Similarly, we need to impress upon people as well that, as we're looking at these various components here, only when we get synergy are we really realizing this.
And again, I'm doing this based on these two for this particular one, because it's a very popular topic. But if we pop up here to our DMBOK, it could be just as easily, I said it before. There we go. Oops, I dropped it. So I'm a klutz with the cursor here today. And there we go. Okay. So again, what we were looking at from this perspective was to say that governance and architecture were going to be important to the process. It just as easily could have been the same topic focusing on reference and master data and data governance. Nothing would be different other than that we would be focusing on different aspects of it within reference and master data management. You have a different set of foci than you do inside of generalized data architecture. But I also, again, maybe Chris, this is a place for us to have a discussion, would tend to see reference and master data as an instance of data architecture, a further refinement of data architecture concepts. No, I would absolutely agree with that. I think each one of these concepts, from a data management perspective and in the tools and technologies that we want to have in place, as well as the actual data itself, of course, are what you want to consider in your strategy, and especially to have actual governance policy and standards and things like that written around that and within that perspective. I think the wheel here adds a lot of value in being able to categorize some of that and bring some of those concepts to something that is much more tangible to folks outside of the data management sphere. And I kind of glossed over my point of the blah, blah, blah, the Lucy or the Peanuts teacher talking around this, and just simply telling people, don't try and explain all of this to outsiders at first. It's overwhelming.
It's interesting to us as data professionals, but business professionals are going to need to actually understand that value proposition. That's sort of where we started off on this: how do we engage them in conversations that do this? Finding those critical paths was how we did it with government projects. If you're looking here, what you're trying to say is, where are the data sources that are going to cause some problems or impact my quality, or make it faster, better, cheaper, or prevent us from making it faster, better, cheaper, less risky, all the way around. That conversation is not one that most people have on a regular basis, but the sooner you start, the better you will become at it. Again, it's one of those things where you don't want to be having to have that conversation at a critical time when communication is at a premium, but instead have people be conversant already in that language, so that they are using it. I mentioned before several insurance companies. I should have it in this deck and I don't, but it's in one of the other decks that we use: one of these slides just shows major subject areas in the business and how they're officially used by all the systems, and therefore all the business practices that surround them. It is a very useful piece. It's a simple diagram. Everybody has one on the corner of their desk or somewhere they can pop it up as a digital copy very quickly when they need a reference, but they all learn the same set of constructs around this, and it makes them much more efficient overall and able to move to value-based discussions much more quickly than just sort of going, well, let's start with the A's, how valuable is... I'm being facetious there. Anyway, Chris, your turn. Any thoughts on the business part of the equation? How do we engage them? Yeah, exactly.
I think there are a couple of key concepts, and maybe I can just, Liz, answer with a few tips, things I learned by doing it the wrong way many times through my career. One, and this is just my experience, but it seems to resonate when I talk to folks, is that it seems to be a habit, and probably a bad one, of IT when we execute our projects as IT professionals: we tend to forget to bring the data stakeholders along. They open up the project, they're part of the excitement of the initial go-live or start of the project, and then they're gone while IT does the work. And then we need to, exactly the word that you used here was re-involve, and it's always really, really hard to bring them back into the fold. It's much easier if we keep them on board through the execution of whatever we're building out, as owners or contributors to the data. They're stakeholders from that data perspective. As for ways to do it after that, since you haven't, a couple of tricks. One is to follow the data. I literally mean, where did it come from and where is it going? And you can follow where it's going to get to where the business value is, because the reality is we do all this data management. We have data governance in place, we have data architectures in place, but where value is perceived is when data is consumed, and it's in that business sphere, when they're using the data and they're consuming the data, that the business understands value. If you follow the data, you'll get there eventually, to understand what you enable and what you make better and improve within a business process, and you can attach to the value that that business process supports within the organization. One of the easy ways to kind of get there and open the door to that conversation is issues. While we never really want to talk about defects or issues, things like that, especially when it's a data quality type of issue, these are great opportunities to see the white space and understand what's not being talked about.
You can understand the impact, so just ask about the impact. What can you not do, and what does that do? Okay, and then? And then? And then? You kind of just keep asking those and-then questions around an issue that's happening. Somebody's impacted, but what does that really do? And oftentimes what you get to is a point where you understand, well, when we don't have the data, the loan application process fails, and on average when that happens, we don't convert about 1% of those applications. Okay, well, how many applications do you get a day, and how many loans? A billion dollars a day, we had one customer tell us at one point, and immediately we were able to turn that impact statement into a value statement. Because if they can't do something because of data that you provide and things that you're providing, then it means that you're enabling them to be able to do that. So you own some of the value that that business process is bringing the organization. You just got there through a slightly different open door, right? And you flip those concepts of the way that bad data impacts by saying, oh, bad data impacts you; good data enables you. I am pretty much speechless, because it was 100% dead on and perfect. And if you guys didn't catch that, he gave you some really good advice, and probably saved you about three weeks of consulting time from one of the big guys coming in to give you the same amount of really good advice, in a very concise session there. So gosh, thanks for that. Really very specific ways of going in and finding out about the values. Luckily, these things are recorded. So you guys can all go back and play that. Wait a minute, I missed that. Could you do it again, Chris, if you needed to? Because it was perfect. Yeah, absolutely. Fantastic. All right, Shannon, we got time for one more. Should we quit while we're ahead? Let's see if we can slip in one more here with the couple of minutes that we've got left.
When we talk about data architecture, are we talking about existing systems or architecture in the data warehouse? Can we talk about architecture with all systems with our data in place? You said architecture can't be accomplished after implementation. So the key to that is, unfortunately, it depends on what you're focusing on. Yes, all of it is included in your organization's data architecture, but it probably follows that Pareto distribution. Again, Chris, you can tell me whether your experience confirms this or not, but it's where 20 percent of your entities take care of 80 percent of the functionality that you have in your organization. And by looking and focusing your efforts on those areas, that's probably your important data architecture as opposed to your data architecture around that. I don't know, Chris, thoughts on that? No, you said it. So I really don't have much. You're exactly right. 100 percent. Keep looking, though. Anything we can do to improve? All right, Shannon, we got another minute. So how do we incorporate data ops into the whole process? So you want to go first, Chris? No, go for it. This is a challenge. This could be a whole other hour and a half, right? Absolutely. And it's a very interesting topic. The key to understanding data ops is that it's about optimization of existing practices. And if you have existing practices that you're ready to optimize, data ops makes a lot of sense to start becoming good at. But if you haven't yet defined your architecture around these topics, it's probably a premature waste of effort at this point. Because until you have something that you can actually optimize, data ops is going to be much more about forming and storming than it is about optimizing. And the real value in it comes down to taking existing repetitive practices and trying to put them over and over again into the same areas. Any thoughts on that, Chris? I mean, it's a process, right?
So I think data ops is a great example. Across the board, philosophically, we think of IT as separate from the business. Data ops is a great way to visualize that IT is a business function. Data ops is part of that critical business function. And it does need to help impose and enforce policy and standards, as well as act on strategy and change and evolve to support strategy. All right. Well, Chris, I've enjoyed our conversation today. Look forward to the next one in December. Nice to chat with you, and Shannon, thanks so much. Thank you both for this great presentation. But that is all the time that we have for today's webinar. Again, just a reminder to everybody, I will send a follow-up email by end of day Thursday for this webinar to all registrants with links to the slides and links to the recording. Again, thanks to Reltio for helping to make this webinar happen. Chris, thank you for joining us. It's been great insight here. Thanks, y'all, for being so engaged. Hope you all have a great day. Thanks, everyone.