Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Officer of DATAVERSITY. We'd like to thank you for joining the current installment in the monthly DATAVERSITY webinar series, Real-World Data Governance with Bob Seiner. Today Bob will discuss "Why Is Governing Data Quality So Hard?" Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. And to note, Zoom defaults the chat to send to just the panelists, but you may absolutely switch that to network with everyone. For questions, we will be collecting them via the Q&A section. To find the chat and the Q&A panels, you may click those icons in the bottom middle of your screen to activate those features. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session and any additional information requested throughout the webinar. And to note, two business days means end of day Tuesday for this webinar, the final webinar of the 2023 season. And now let me introduce our speaker for the series, Bob Seiner. Bob is the president and principal of KIK Consulting and Educational Services. Bob specializes in non-invasive data governance, data stewardship and metadata management solutions. And with that, I will turn it over to Bob to get the final webinar of the 2023 season started. Bob, hello and welcome.
I believe it's December 21 now. I mean, this year seems to have flown by really quickly. And again, I just want to thank everybody for taking the time out of their schedule to attend either live or to watch the recording of the webinar. I just wanted to share with you something that I just learned as well. And I think we need to give a round of applause to Shannon, because this is the 899th DATAVERSITY webinar that she's done.
So it's not the 900th yet when we start the 2024 season, whoever's up first will be her 900th. But we're already planning for 2025 when she hits her milestone of 1000 webinars. So thank you, Shannon. This has been an incredible run. I love the way that you introduced this, this specific webinar, because the way you said is why is governing data so hard, and you can fill in the blank, so to speak as to whether it's hard. I see people say why is governing data so gosh darn hard, but that would be in the clean way of being able to say it. I've had conversations with clients recently where they talk about how they want data governance to really have a huge impact on on governing quality on improving the quality of the data within the organization. But they seem to think that that's one of the most difficult things to do they find the other aspects of or business outcomes that people are trying to get from data governance are easier to achieve than governing data quality. So we're going to talk today about why is governing data so hard. And I want to lastly on the first I just want to say happy holidays to everybody have a Merry Christmas. Happy Hanukkah. Happy whatever holiday it is that you celebrate and a happy new year to all of you so great to have you here. It's time to get started we only have an hour for these webinars and I've tried to pack this webinar with a whole bunch of information that I hope that you'll be able to refer back to and you'll be able to use in your area of employment wherever you are in the world or whatever role you play within your organization. So obviously you know about the real world data governance webinar series. We've been doing it for 12 years dating back to 2012, and we will be doing a phenomenal series, at least a phenomenal series of topics in 2024 as well. It's just to let you know so next month on the third Thursday of the month we'll be talking about data governance and data management and untangling those things. 
We're going to talk about optimizing governance with frameworks, we're going to talk about data governance roles, metadata management. We've got an incredible list of topics for the 2024 season, so to speak, and the first one, like I said, next month in January, will be on data governance and data management untangled, because a lot of organizations look at those two. And I even had a question the other day on LinkedIn: which one falls above the other, data governance or data management? So that should be a fascinating webinar. I'm looking forward to that one. There's a couple of upcoming events that I want to share with you. If you're on LinkedIn or you're receiving emails from DATAVERSITY, you probably learned that they've just announced their agenda for Enterprise Data Governance Online, the EDG event, which is taking place on January 24th, next month. I'm going to be talking about data governance and master data management, how they collide and how we can get them to work closer together. I'll also be in Orlando in March, speaking about activating data stewardship and data governance roles at the Enterprise Data World event. I talk a lot about non-invasive data governance. Just to let you know, if you weren't familiar, I wrote a book on non-invasive data governance back in 2014. And the second book, which wasn't a sequel, it was basically lessons learned from the first book, came out in May of this year. I have online learning plans that are available through DATAVERSITY, one on non-invasive data governance, one on non-invasive metadata governance. And the one that's most recent is on business glossaries, data dictionaries and data catalogs. Early in 2024, I hope to be adding to that list of online learning plans available through the DATAVERSITY Training Center. KIK Consulting and Educational Services is the name of my consulting business. KIK is not my initials.
My initials are BS and I didn't want to call myself BS Consulting, but KIK stands for Knowledge is King and I focus on knowledge transfer. And it is the home of non-invasive data governance, since I seem to talk about it so much. And in my spare time, I'm also an adjunct faculty member at Carnegie Mellon University, which is right here in my hometown of Pittsburgh, Pennsylvania, in their Chief Data Officer certificate program. So again, I keep quite busy. I'm really, really happy to have you with us today. There's a lot to talk about here. So, as I typically do, let me provide you an abstract of what we're going to talk about over the course of the next 45 minutes or so. I want to talk about, first of all, what it means to govern data quality. And I'm going to start by describing what it means to govern anything in general, and then what it specifically means to govern data quality. We'll talk about dimensions of data quality, and there are a lot of different resources to go to to learn more about dimensions of data quality. And then we're going to talk specifically about governing quality by the different dimensions that I'm going to share with you in that part of the webinar. And then we're going to talk about making the governing of data quality easier, because as you know, the name of this session is why is governing data quality so darn hard. I'm going to share with you some ideas that I've seen be effective within organizations for making the governing of data quality easier. And then the last thing I'm going to talk about is improving business outcomes through improved data quality within the organization. So we've got lots of things on the plate for today for the final DATAVERSITY webinar of 2023. If you've attended my webinars before you know that I like to start out just quickly with a bunch of definitions.
And my definitions are unique to the industry, at least I think that they are. I've seen a lot of definitions for data governance. I defined mine and put some teeth behind it: I say that data governance is the execution and enforcement of authority over data. And so at the end of the day, no matter what approach you take to data governance, whether you take a top-down command-and-control approach, or a traditional "if you build it, they will come" type of approach, or a non-invasive approach, we need to execute and enforce authority over the data. And we'll talk a little bit about how to go about doing that, specifically focused on data quality, in this webinar. I define data stewardship a little bit differently too. I say, and you've probably heard me say before, that potentially everybody in the organization is a steward of the data if they have a relationship to the data and they're being held formally accountable for that relationship. The people that define data, produce data and use data: that's pretty much everybody in the organization. So if they're being held accountable for how they define, produce and use data, it's not something they can opt into or opt out of. They are a steward of the data. And so my definition of a data steward is a person that's held formally accountable for whatever actions they take, whatever relationships they have to the data. My definition of metadata is that it's not data about data, but it kind of is. It's data that needs to be stored somewhere in an IT tool that improves both the business and technical understanding of the data and the data-related assets. What I want to focus on today is data quality. And I don't have a specific definition of data quality that I use. Those top four definitions can be attributed to me, because I think I'm one of the only people that use those as my definitions.
This definition was pretty general and it came from TechTarget: data quality is the measure of the condition of the data, based on factors such as the dimensions that we're going to talk about today: accuracy, completeness, consistency, reliability and whether it's up to date. That's the definition that came from TechTarget. And so there's lots of definitions out there. I'm sure that you can search the webinars and such from DATAVERSITY and they've got another definition, but we're focusing on really that measure of the condition of data. Like I said, there's lots of different definitions. Pick the one that makes the most sense to your organization. All right, let's jump into the core of the content today. The first thing I want to talk about is, well, what does it mean to govern data quality? What does it even mean to govern anything? Because I know that from my experience, I've had a lot of organizations that come up to me and say, okay, well, we understand data governance is important. Everything we read, everything everybody says tells us that data governance is important or that governance is important. What does it truly mean to govern something? So I'm going to spend a minute on that. And then, when that something that you're governing is data quality, how does that change what it means to govern anything? I think you'll see they're very similar. We'll talk about the results of governing or not governing the quality of your data. We'll try to attribute it to the expense of inadequate data quality, which would give us the reason as to why we would think about focusing specifically on governing towards data quality. And the last thing in this section is going to be that the data quality will not govern itself. Again, if you've attended these webinars in the past, some of my favorite sayings are the data will not govern itself. The metadata will not govern itself. The data governance program will not govern itself.
Well, certainly data quality is not going to govern itself. Somebody in the organization is going to have to be responsible for it. And, you know, one of the things I hope you'll take out of this, this entire webinar is that when it comes to governing data quality, it's going to require effort. It's not going to happen on its own. It's going to require people. It's going to require that people are engaged and we'll talk a little bit about who some of those appropriate people are, but it also requires a plan. And I'm hoping, like I said earlier that you'll refer back to this slide deck and think about it as you're putting together a data quality plan for your organization that there will be tidbits of information in this deck for you to be able to refer back to. So the first thing that I said I wanted to talk about what it means to govern anything, what it means to govern something. And I really look at there being five kind of core pieces as to what it means to govern something. It means, and you can think of it in terms of government. Just think about it as the rules of the road wherever you're driving somewhere. Well, we know in the US we drive on the right side of the road. If we drive on the left side of the road, there's going to be a penalty for that. There's rules that say in the US we drive on the right side of the road. So yeah, you're going to get in trouble if you drive on the left side of the road in the US if you don't run into somebody and kill yourself or somebody else by not following the rules. So when we think about what it means to govern something, it means that we need to formalize rules, formalize policy, formalize compliance. How are we going to make certain that the rules and the policies are being followed? Formalize decision-making authority when it comes to governing anything. Who's going to be the person in the organization to make that decision? Formalizing accountability and responsibility. 
Again, that was my definition of what stewardship means. So we need to identify and we need to recognize who the people in the organization that are the stewards are, and help them to be held formally accountable and responsible for the actions they take with data. And then the last thing of what it means to govern something is that we need to formalize improvement. Again, as I said, the data quality is not going to govern itself. It's not going to improve itself. I wish I had a can of pixie dust that I could send out to you, sprinkle it over the organization and fix your problems. It's not going to happen. It's going to require effort. It's going to require people. It's going to require a plan. So now let's talk about what it means to govern when that something is data quality. So actually with each of these five bullet points under the statement of when that something is data quality, you could add the word quality at the end. But I kind of pulled it back a level and say, okay, it's to, we need to formalize the rules and the policies, just like the previous slide said, but doing that specifically for data or doing that specifically for data quality, formalizing compliance for data or data quality. Again, it means that we need to pay attention. We need to put some rules in place. People need to understand those rules and they're expected to follow them and that there's some level of accountability and responsibility for following the rules that are being set forth for something like data quality. So I wanted to kind of break it down to be somewhat binary in the fact that, okay, well, what are the results of governing data quality, and then, okay, what are the results of not governing data quality. And if you'll notice they're just the polar opposite of each other. 
So if we were live, if we were in person, I'd suggest that you take a print out of this page and go down item by item on the first item for the two bullets, the second, the third, fourth and fifth and say, okay, well, what do we have in our organization. And if we're already governing data, what has been the result of putting our governing program into place? What has been the result of governing specifically data quality? Has it resulted in accurate data, consistent data, relevant, timely, compliant data? Or is it, are you not doing these things? And because of that, the challenges that people are talking about are that they have inaccurate data, or that the data is inconsistent or obsolete, or it's delaying their process of being able to make decisions, or it's not compliant. So it's really easy to be able to say, okay, well, if we govern the quality of the data, well, we know we can't do it for the entire organization at once. We're going to need to start small, do it incrementally. We need to focus on certain things. And one thing is to provide accurate data, consistent data, relevant, timely, compliant, because otherwise, like I said before, the data is not going to, the data quality is not going to govern itself. We can almost expect that if nobody is responsible for this, that we're going to have inaccurate data that we're going to have inconsistent and obsolete data. I love that the graphic that I added at the bottom of this slide, because it sounds like me in my rebellious days as a youth, just tell me something that I'm not supposed to do. In these days, you know, people say I will do it twice and I will take pictures of doing it. Now we want to tell people that we need to govern quality. We need to work them into the equation and make certain that they're participating in a role based on, as I said earlier with stewardship, based on their relationship to the data. 
If they're a definer of the data, okay, let us work with them to define data quality standards and those types of things. And that's just one example. So when we're talking about what it means to govern data quality, well, the first question often is, why do we need to do this? Why do we need to govern data quality? Well, number one, I already said that your challenges and your issues and your opportunities around data quality are not going to resolve themselves. But we oftentimes want to articulate in dollars and cents what the impact of inadequate data quality is on the organization. And I've listed several of those. Again, I'm just going to go through them quickly with you, but there's several pages of things that we can look at to justify the investment that we're putting into data quality and recognize that if we don't do it, there's going to be significant expense. One of those expenses, and I hear this all the time from organizations, is that there's operational inefficiencies: they're spending additional time cleaning the data. You may have heard the term wrangling, that people are wrangling the data. And oftentimes you hear about the 80/20 rule: people are spending 80% of their time just pulling the data together. And these folks that are data analysts, and again, these are just rough numbers, are spending only 20% of their time doing what they're good at. And what they're good at is the analysis of the data. So right now there's operational inefficiency. That's one of the expenses of inadequate quality. There's missed business opportunity. Now, think about getting the proper or the correct or the reliable information into the hands of the people that are making decisions for the organization. Without doing that, how do we expect that we're going to be able to take advantage of all these potential business opportunities that are coming our way?
So again, one of the expenses of inadequate data quality is the missed business opportunity. Another is customer dissatisfaction. I know a lot of organizations that are focusing on improving their customer satisfaction, and inadequate data quality can lead to errors in bills. I don't know if you're familiar with a friend of mine, Merrill Albert, who wrote Crimes Against Data. She provided an incredible list of data quality issues and episodes with organizations where data quality has led to customer dissatisfaction. Now, I did some research on some organizations where there's been direct impact of poor data quality on customer dissatisfaction. It's amazing that some of the largest businesses in the world have survived having poor data quality for so long. And a lot of them are now starting to focus their efforts on governing data quality within their organization, putting together formal governance, but focusing that governance on improving the quality of the data. Here's some additional items. One of the expenses of inadequate data quality is that there's going to be legal consequences in your organization. If the quality standards or the quality regulations, or just regulations in general, are not being followed, it's going to be an expense to your organization. So there could be legal consequences for failing to maintain high quality data. Reputation damage. I mean, there's a lot of organizations. If you recall back to 2013, Target took a big hit organizationally when they were having quality issues with their data and protection issues with their data. Somehow they recovered, and they're focusing significantly on improving the quality of the data within the organization. So as I said earlier, the data quality will not govern itself. And by that I meant, as I said, it's going to require effort. It's going to require people. It's going to require a plan.
Some of the things that require human intervention and interaction when it comes to improving data quality are the interpretation of the context of the data, and the quality assurance and the validation of the data. And as you can see here, I'm telling you that human intervention is essential to validate data. Yeah, we'd love to believe that we could use tools to do this for us, but ultimately it's going to come down to some level of formalized accountability for this, and there's going to be somebody behind it. To think that we're going to automate this completely? I'm telling you, the data quality is not going to govern itself. It's going to require those things: the effort, the people and the plan. Quality assurance and validation, decision making and strategy alignment, all of these things require a level of human intervention. The data quality will not govern itself, and without people and without an effort and without a plan, the chances are that your quality is not going to improve all that much. Even in a specific use case example, it's going to require people. It's going to require a plan. It's certainly going to require effort. And what is that effort? That effort really focuses back on governing the data quality. And here's two more. Actually, let me go back to this slide. You know, I talk about how the quality will not govern itself. So yeah, here's two more examples of where humans are essential, where we've got to engage people. A good friend of mine who is also a speaker at a lot of DATAVERSITY events, Len Silverston, once said to me that we shouldn't even call this data governance. We should really be calling it people governance, because it's the actions and the behaviors of the people within the organization that we need to focus on the most. And I think this backs it up. It says, when I say that data quality will not govern itself, there are humans, there's human behavior that's required to be altered.
If we're going to improve data quality and implement a program or a plan for consistently building quality data and building the level of confidence people have in data across the organization. So the next subject that I want to talk about is the dimensions of data quality. And it's interesting, because I have created, and I've shared in other webinars and other presentations for DATAVERSITY and in other places, a data governance framework that I'm working on, that I'm digitizing, that I'm doing a lot of things with. But it's really only two dimensions. It looks at the core components of a successful governance program as it's being viewed by the different levels or perspectives within the organization. It's very two-dimensional. And even when you look up what dimensions of data quality are, you're going to get a finite list of what data quality dimensions are. But the fact is that even considering the data governance framework that I just mentioned briefly, if you think about the roles component at the tactical level, yeah, those are your two dimensions and you're coming to a point where you're focusing on something. But then when you get to the point where you've got organizations of different sizes, in different industries, with different governing philosophies within their organization, there are a lot of different dimensions in general. So what we're going to talk about here is the dimensions of data quality and why those dimensions are important. We'll talk about some of the challenges associated with governing to the quality dimensions. And then I'll wrap this section up with demonstrating data governance value by those dimensions. So if you do a search on the internet or your favorite generative AI engine for a list of data quality dimensions, you're probably going to get a different answer depending on who you ask. But there's a lot of consistency between the definitions as well.
The ones that I've listed on the left side of the screen are the ones that you see quite often when people list out what the quality dimensions are for data: accuracy, completeness, consistency, timeliness. We're going to go through each of those here in a couple of minutes. There are others that pop up on the radar when you ask to see what the dimensions of data quality are. There's relevancy, granularity, all this. I'm not going to do as deep of a dive into those. I do want to focus on the traditional ones. And I think that if you're familiar with DAMA and the DMBOK and the CDMP, there's a pretty finite list of data quality dimensions that they'll talk about in that level of education and training that you might be looking for. So as we're talking about the dimensions of data quality, let's talk about why the dimensions are important. Well, in a nutshell, they're important because the dimensions provide a framework to measure and manage. Remember the definition of data quality from TechTarget that I talked about: to measure and to manage how accurate your data is, how complete your data is, and those types of things. So the quick answer to the question of why dimensions are important is that we need a framework to manage and to measure things within the organization. And so what are some of the things that we're trying to improve? We're trying to improve decision making, and we need to be able to connect how improving the quality of the data is going to improve decision making. And again, I don't want to read the blurb after improved decision making, but you can take the time and look through it. I wanted to highlight the part that says when data is accurate, complete, consistent, timely, all of those dimensions that I just talked about, it can be used. And I didn't even go on to underline the rest of that sentence.
It can be used by people for whatever they need to use it for identifying trends, understanding patterns, and those types of things. You know, the dimensions can help to identify where there's some lapses or where there's gaps in processes within your organizations to help you to identify and to address specific problems focused around efficiency and effectiveness of processes within your organization. The dimensions are important because you can use it to enhance the customer experience. They can have a few more of these. Increased compliance. I mean, having the information around being able to, in an auditable way, be able to demonstrate that the data is complete, that the data is timely, that it's accurate, that it's following the rules. Again, these are the dimensions of data quality and they can be used to help you to achieve or increase compliance within your organization. Again, I don't want to, I have a lot of slides that I want to make certain that I leave time to get through all of them, but they can improve governance, they can reduce risk, they can improve data sharing, and I hope this information will be helpful to you. I tried to highlight by underlining those things that you really want to focus on when you're thinking about why the dimensions of data quality are so important. And the last set that I want to provide to you is, you know, what can it do? You know, it can help to increase ROI by improving efficiency, increasing sales, all those types of things. Competitive advantage is something that organizations are trying to achieve also by governing to the quality dimensions of data. In the last session, the session that I gave at DGIQ East in DC a couple of weeks ago, I talked specifically about using data quality to build trust. I talked about something called a data confidence level and how it's important to be able to judge and manage what the confidence is that people have in data. 
Because if you can build that trust in the data, and you can do it through the data quality dimensions that we're talking about here, you're going to find that people will understand the data better and they'll be able to make better use of the data. All right, so let's move on to some of the challenges. There's a lot of challenges associated with governing these quality dimensions. The first is to define the appropriate ones for your organization. Then there's measuring and monitoring to those quality dimensions, and balancing competing priorities, because we all know that there's not just one priority. We'd love it if the only priority for the organization was to improve the quality of the data, but that's not true. That's never been true. So we need to balance what we're asking people to do with the competing priorities in the organization. There's also enforcing the quality standards and adapting to change. Again, I'm going to go back to something I said earlier. In order to address any of these challenges associated with governing to the quality dimensions, it's going to require effort, it's going to require people and it's going to require a plan. So let's talk about this next part, because I know a lot of you may be practitioners of data governance or be responsible for data governance programs or aspects of your programs within your organization. How can we use data governance to demonstrate value by governing to each of these dimensions that I just talked about? I'm going to share some examples of things that you can do. When it comes to accuracy, that means demonstrating that the data is accurate and conducting, you know, audits of that data against defined standards. But in order to know if your data is correct or incorrect, you have to have a definition of what's correct versus incorrect.
So you need to define these standards, you need to implement these standards when it comes to completeness, defining and documenting what data is required within the data sets, implementing quality checks. All of these things are ways that you can demonstrate governance value by governing the quality of the data associated with the different dimensions that we talked about, at least the traditional dimensions that I talked about. Talking about consistency, what are some of the things that you can do to demonstrate data governance value? Well, establish data formatting standards, implement data quality checks, use data reconciliation tools to see how data is the same or different across systems. And from a timeliness perspective, again, I don't want to read through each and every one of these. I hope there'll be a valuable resource to you. Because we're always being asked or at least a lot of the organizations I work with are being asked to demonstrate data governance value and data quality is one of the ways that organizations are asking for that to take place. And again, I also provide some bullets that go along with each of the last three dimensions that I was talking about. And that was the validity of the data, the uniqueness of the data and the integrity of the data. There are specific actions that you can take right now in your organization. I'm going to kind of sum up the webinar by talking about what some of those actions will be. And just kind of going back to something that I mentioned earlier. Any issues that you have around validity or uniqueness or integrity or any of these different dimensions of data, data quality. The issues around the challenges that people have around these things and the issues that they're causing in your organization. They are not going to correct themselves. 
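The dimension-by-dimension checks described here (completeness of required fields, validity against a formatting standard, uniqueness, timeliness) can be sketched in a few lines of Python. This is only a minimal illustration, not something shown in the webinar: the record layout, the field names, the email pattern and the 30-day freshness threshold are all hypothetical. In practice those standards would come from the people accountable for defining the data.

```python
import re
from datetime import date

# Hypothetical customer records; field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "ann@example.com", "updated": date(2023, 12, 1)},
    {"id": 2, "email": "", "updated": date(2023, 12, 10)},
    {"id": 2, "email": "bob@example", "updated": date(2023, 6, 1)},
]

# Assumed formatting standard for the email field (deliberately simple).
EMAIL_FORMAT = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(records, field):
    """Share of records in which the required field is populated."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def validity(records, field, pattern):
    """Share of populated values that match the defined format."""
    values = [r[field] for r in records if r.get(field)]
    if not values:
        return 1.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def uniqueness(records, key):
    """Share of distinct key values among all records."""
    keys = [r[key] for r in records]
    return len(set(keys)) / len(keys)


def timeliness(records, field, as_of, max_age_days):
    """Share of records updated within the allowed number of days."""
    fresh = sum(1 for r in records if (as_of - r[field]).days <= max_age_days)
    return fresh / len(records)


def quality_report(records, as_of):
    """Score a non-empty list of records on four dimensions (0.0 to 1.0)."""
    return {
        "completeness": completeness(records, "email"),
        "validity": validity(records, "email", EMAIL_FORMAT),
        "uniqueness": uniqueness(records, "id"),
        "timeliness": timeliness(records, "updated", as_of, max_age_days=30),
    }


if __name__ == "__main__":
    for dimension, score in quality_report(RECORDS, date(2023, 12, 21)).items():
        print(f"{dimension}: {score:.0%}")
```

A script like this only produces the measurements. Deciding what counts as acceptable, and who is accountable when a score drops, is still the human part of governing data quality that the talk keeps returning to.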
And that might be one of the reasons why governing data quality is so hard, because we need to make certain that we're focusing on things where we can demonstrate value for the organization. And again, that's why I'm hoping that some of these bullets on these last couple of slides will help you to hone in your data governance program on certain aspects or certain dimensions of data quality within your organization. We're going to run out of time, so I'm going to go relatively quickly through a bunch of slides. Again, like I said, I hope this is helpful to you when we talk about governing the quality of the data by dimension. If you've attended my webinars in the past, or you've seen me speak or read what I've written, I always try to break the actions that people can take with data down to three simple actions. People can define data as part of their job. They can produce data as part of their job. They can use data as part of their job. Chances are they define and/or produce and/or use data as part of their job. And if you go back to my definitions at the beginning of the webinar and how I defined a steward, if people are being held formally accountable for how they define, produce and use data, they're stewards. They don't have an opportunity to opt in or opt out. For example, if they use sensitive data, they're expected to follow the rules associated with using the sensitive data. You can't say, yes, I use sensitive data, but no, I'm not going to follow the rules. That won't fly in most organizations. So I'm going to go quickly through governing quality by dimension. But I also wanted to add metadata quality as a dimension as well, and we need to consider that. And then, as I've been saying throughout, the data quality will not improve on its own. We need to take action to do that.
So when we think about the data definition quality dimensions, accuracy, validity, uniqueness, completeness, consistency, all of those things are really dimensions of data quality that truly have a connection to the definition of the data. I talk a lot about things that I refer to as cheeseburger definitions. The definition of a cheeseburger is that it's a burger with cheese. The definition of a student account is that it's an account for a student. The definition isn't saying anything more than what the field name itself is saying. We need people to accurately represent the data. We need people to provide valid values. We need to make certain that that data is unique. If you're going to govern to the data definition, here are just some considerations of the data quality dimensions to focus on. And then the same thing holds true for data production and data usage. Again, I don't want to go through them in a lot of detail. But from a production perspective you're going to need to govern the integrity, the timeliness, making certain that data is being completed on time, the completeness of it, and all of these different production dimensions. If we're thinking about governing the quality of the production of the data, well, it's going to rely on the definition of the quality data, but then these dimensions are going to hold true primarily for the production of the data. And in the usage, again, the list is relevance, transparency, security, interpretability and efficiency. These are some of the definitions that I shared with you earlier in the webinar, tied directly to data usage. And the last one of these that I want to talk about is metadata quality by dimension: the completeness of the metadata, the accuracy of the metadata, the consistency, the timeliness, the accessibility.
So when we're governing quality by dimension, yes, we should be thinking about the definition, production and usage of the data. But we also need to be thinking about what information is going to be useful to improve the data confidence level, the DCL, for the people who use the data. Sometimes it's going to be the context, it's going to be the metadata, and the more that the metadata is complete and accurate and consistent and timely, and the more accessible it is to people, the more valuable it's going to be to the organization. And just one last slide here on governing quality by dimension. I've got a couple more slides to go through here, but I just want to drive that point home again: the data quality will not improve on its own. And so there are things that we need to do. We need to recognize that if, for example, organizations are focusing on AI and data centricity, they're going to be focusing on the quality of the data in those sources that are going to feed into AI and into becoming a data centric organization. So again, I know I've said it a lot of times, and I said I was going to say it a lot of times: the data will not govern itself, the metadata will not govern itself, and the data quality will not improve on its own. So let's spend a couple of minutes talking about some of the things that we can do to make the governing of data quality easier for our organization. Let's run through the first five bullets. Those first five bullets seem to make it sound pretty easy. But again, you're going to need effort, you're going to need people and you're going to need a plan in order to execute on all the actions associated with each of these. These actions are focused on making the governing of data quality easier, so let's run through them quickly, starting with recording data quality challenges.
We need to define the data quality dimensions that we're going to use, and I've based everything that I've talked about so far on those data quality dimensions, but we need to have a clear definition of what they mean to our organization and we need to start filling them in. We need to implement data collection mechanisms and implement a quality repository. In that quality repository you might want to consider tracking your quality issues and the progress of those. Get people to become accustomed to using a data quality repository or a database. It will provide an intake process for people to be able to report their data quality issues or their data quality opportunities to improve within the organization. Basically, empower the stakeholders to identify and address data quality issues. The second action was to prioritize data quality issues. Oftentimes in organizations there's the data governance council, the DGC, or the data governance committee, whatever you're calling your strategic level, basically the end of the line when it comes to making decisions in the organization, the ultimate decision makers. We need people to help us to prioritize data quality challenges and opportunities. So we need to be able to assist them to assess the impact and to analyze the feasibility of being able to solve the problems. Aligning with strategic objectives is never a bad idea. It's always good to be able to say that we're going to prioritize our quality challenges based on those things that are most important to the organization, and they could be the vision, they could be the mission, they could be the goals of the organization. Leverage quality metrics, collaborate and communicate. Again, all these things are going to require people and a plan, and they're going to require effort. I mean, these things are not going to self rectify.
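As a rough illustration of the prioritization step described above, here is a minimal sketch of a backlog of reported quality issues ranked by impact, feasibility and strategic alignment. The 1-to-5 scales, the weights, the class and function names and the sample issues are all hypothetical assumptions made for the example; a governance council would choose its own criteria.

```python
from dataclasses import dataclass


@dataclass
class QualityIssue:
    """One entry in a hypothetical data quality issue repository."""
    title: str
    dimension: str    # e.g. completeness, uniqueness, consistency
    impact: int       # 1 (low) to 5 (high) business impact
    feasibility: int  # 1 (hard to fix) to 5 (easy to fix)
    alignment: int    # 1 (weak) to 5 (strong) tie to strategic objectives


# Assumed weights; in practice the council would agree on these.
WEIGHTS = {"impact": 0.5, "feasibility": 0.2, "alignment": 0.3}


def priority_score(issue, weights=WEIGHTS):
    """Weighted sum of the three assessment criteria."""
    return (
        weights["impact"] * issue.impact
        + weights["feasibility"] * issue.feasibility
        + weights["alignment"] * issue.alignment
    )


def prioritize(issues):
    """Return the issues ordered from highest to lowest priority score."""
    return sorted(issues, key=priority_score, reverse=True)


# Made-up backlog, purely for illustration.
backlog = [
    QualityIssue("Missing customer email", "completeness", 4, 5, 3),
    QualityIssue("Duplicate supplier records", "uniqueness", 5, 2, 5),
    QualityIssue("Inconsistent date formats", "consistency", 2, 4, 2),
]

for issue in prioritize(backlog):
    print(f"{priority_score(issue):.1f}  {issue.title} ({issue.dimension})")
```

A simple weighted score like this is only one possible design; its main use is giving the decision makers a shared, visible basis for discussing which issues to resource first.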
Resource the action: this might be one of the biggest hindrances to success within organizations, that they're not allocating people and they're not allocating budget to improve data quality. Well, the question is, are we articulating to the people at the right level of the organization what the data quality challenges are, what they're preventing us from doing, and how there are things that we would like to be able to do that we can't do? You've got to crack that wall and help them to understand how important this stuff is, because if there's not budget, you're not going to have people, you're not going to have a plan and you're not going to have the effort. You need to resource the security and the technology tools, and build skilled teams. All these things are really important as actions that you can take to make the governing of data quality easier within your organization. Next, form a team to address the challenges: creating a governance team, establishing roles and responsibilities to implement the standards and the things that we defined earlier in this webinar, and promoting transparency. Again, I don't want to read through all these things, but making the governing of data quality easier is not going to happen on its own. And those organizations that have had the ability to demonstrate significant success have teams to address the challenges or the opportunities, or at least they build teams specifically to address opportunities and challenges along the way. You can measure and report the value of the actions: define the metrics, establish baselines. Again, these are actions that you can be taking right now to make the governing of data quality easier within your organization. So I just went through a series of maybe five different groups of actions that you can start taking right now. It all appears so simple. The question is, then, why is it so difficult? Why is governing data quality so darn hard?
Well, there are a lot of factors that we need to consider within our organizations. There's the complexity of the landscape within the organization: there are silos and fragmentation of data, and we've got a variety of different types of data within the organization. So we've got to recognize that we can't apply governance and quality control to all the different types of data in exactly the same way. We've got to recognize that there are silos, there's variety, and that the volume of data just keeps getting larger and larger. There's a lot of structured data in databases and tables and systems, but the statistics I'm reading say that, given the volume and the velocity of unstructured data, there's now more unstructured data than structured data within the organization. So focusing on the quality of unstructured data is also going to require people and effort and time and all those types of things. There are organizational challenges. A lot of organizations have a lack of awareness at the appropriate levels of the organization and a lack of buy-in. These are organizational challenges. These are the reasons why all of those actions that I just talked about are so difficult, because there's a lack of awareness and a lack of buy-in. There are competing priorities, like I mentioned earlier. There are resource constraints. I don't know of any organizations where there are people sitting around looking for things to do. That's one of the reasons why I feel non-invasive data governance is so strong, because we can start to empower the people that we have in our organization rather than trying to ask for tons of new resources, because in most organizations those resources may never come. There's a lack of clear roles and responsibilities. As I said, I'm going to be talking in Orlando in March about how we start engaging people. How do we define clear roles and responsibilities and start to empower and engage those people?
Technology and tools, again, more reasons why it's so difficult, why governing the data quality is so difficult is because tools are outdated. There's integration challenges. There's privacy and security concerns. There's a lot of these things. Why is it so difficult? Because of cultural and behavioral factors. There's resistance to change within an organization. People in organizations can be very defensive if you come to them and you talk to them about changing their process to improve the quality. There is resistance to change in every organization, a lack of literacy, the fear of blame. There's so many of these. There's a continuous process. Data quality, just like data governance, is not a thing that has a beginning and a middle and an end. I hate to say it, but data governance programs, they're called programs because they're not projects. They don't have a beginning, a middle and an end. When you implement data governance, the idea is to govern the data forever. And so it's going to be a continual process. If and when you have a list of all your data quality problems and you solve them all, if you think that no more data quality problems are forthcoming, then you're kidding yourself. It is a continuous process for data quality. We need to be able to adapt to change and, and this is really important. We need to have a long term commitment from our organization. Because again, these things won't happen on their own. We need the resources. We need the effort. We need a plan for how we're going to address these things. Alright, so the last section that I have today in the webinar before I flip it back to Shannon in her 899th webinar since 2012. I want to talk about the ways to improve the business outcomes through governing data quality. And I'm just going to provide a quick definition for each of these items that are bulleted here to enhance decision making, increase operational efficiency. 
These are the ways that we as data governance and data quality practitioners can improve business outcomes. Like I said, through enhancing people's ability to make decisions, increasing their efficiency, improving their satisfaction, regulatory compliance, all of these things. So let me just run through each of these 10 items quickly. And then I'm going to summarize with one slide that talks about how you can go get started today. I'm guessing you've already started, but hopefully I gave you some ideas that you can grab on to and move forward within your organization. So just real quick definitions, starting with enhanced, excuse me, enhanced decision making. It's a good thing my voice is going away now, because this is my last webinar of the year, obviously. But enhanced decision making is certainly important, and I wanted to provide, like I said, a definition of what enhanced decision making means. Then what it means in a nutshell to increase operational deficiency, efficiency, not deficiency. Well, we're going to increase deficiency as well too, I guess, but really it streamlines the process to reduce our errors and get rid of those operational bottlenecks within your organization. And you could do that by analyzing whether the data is on time, looking at the different dimensions and seeing whether we can focus on those dimensions to increase operational efficiency. Improved customer satisfaction has to do with data quality, my friends. Regulatory compliance: data quality aids in meeting compliance. Cost reduction, trust and credibility. All these things are things that we can focus on to improve the outcomes through data quality within our organization. Effective risk management, business intelligence, innovation enablement, competitive advantage: all these things can result from improving quality in our organizations through applying formal governance for data quality.
So one of the last things I want to share with you here is how we get started today. Some of the things that you might want to take back with you to your organizations are: how do we go about building awareness and gaining buy-in for data quality? How do we get people to realize the challenges that they're having, and that they're not going to be able to address those challenges unless we put the effort towards improving data quality? Then there's defining data quality goals and priorities. I already mentioned defining the dimensions of data quality for your organization, conducting a quality assessment and implementing a data governance framework. One of my favorite topics these days has to do with the data governance framework that I'm putting together, but implement a framework, whatever that means to your organization. It may include roles and responsibilities. It may include processes, however you define a governance framework. A way for you to get started today is to define a governance framework and, if you need to, focus it on data quality itself. Foster a data driven culture, monitor and track. Again, all these bullets are things that you can go out and get started on and get active on right now. And so hopefully the last 45 minutes have been helpful to you. Just to quickly summarize the things that I talked about today: first, I talked about what it means to govern anything, before I started getting into what it specifically means to govern data quality. And then I talked about the dimensions of data quality. There are dimensions of a lot of different things, basically different directions that you can take things. Those dimensions of data quality are directions that you can take as a data quality or a data governance practitioner, and you can use those to demonstrate how data governance is adding value through quality based on the different dimensions.
We talked about governing quality by dimension and making the governing of data quality easier. I suggested some steps that you can take to do that. And then we talked about improving the outcomes through data quality. And unbelievably, Shannon, I took us right to 10 minutes before the hour. I'm going to flip it back to you to see if there are any questions from today's webinar. Holiday miracle. No, just kidding. Thank you, Bob, so much for such a great webinar and for ending the webinar season on such a high note. Lots of questions coming in, and just to answer the most commonly asked question: I will be sending a follow up email by end of day Tuesday for this webinar with links to the slides and links to the recording. So diving in here, Bob, this is really just a comment, but I want to see if you have anything to add to it. They say data quality remediation is not just about fixing the data but fixing the problem of how the bad data got there in the first place. Wow, that's a great statement. And, you know, if you have bad water in your house, you're not going to go to each spigot and fix the quality of the water coming through that specific spigot. You're going to address it from a whole house perspective, right? So yeah, trying to solve individual problems is one way to be able to demonstrate value, but we need to look at this more holistically for the organization. So I like the idea that we can cover this with pinpoint solutions only so far. We need to develop a program, and that program should be tightly related to our governance program within our organization. And Bob, data governance is people governance, and perhaps we need to begin to see clients and users as equally willing partners.
Again, it's a statement more than a question, I guess, but yeah, what Len said to me was really important, because I know Len is a human interaction type person, but he talked mostly about people's behavior, and we need to focus on people's behavior because the data is not going to govern itself. As I mentioned before, I don't know how many times, it requires effort and it requires a plan. And as I said, in order to be successful in achieving data quality or improving quality within an organization, it requires people too, and they need to know what the right actions are to take. Take the people that are producing data on the one end. If you recall, I talked about the definers, producers and users. They've got to be held accountable for how they're producing the data. We've got to create systems that prevent them from entering data that is incorrect. One statistic I heard, and it's very apt for today's discussion, is that there are more people born in the world on 12/31/99 than any other day. Why is that? Because there are systems that default to 12/31/99 as the date that they were born. Even though we're providing defaults, we can control the quality of the data on the front end, and we need to engage people to do that. Great. So I do have a question for you, Bob. A real question. A real question. I love the comments too, so keep them coming. But, okay, so how do I quantify the impact of poor data quality in an organization? How much time do they have? How much time do we have left on this webinar? You know what, I'm going to direct you to other sources. We obviously don't have the time here to be able to focus on quantifying it. The resource I'm going to direct you to is a gentleman by the name of Larry English, who had written about the cost of poor quality data to the organization. And I highlighted some of those things, not from him, but some of the things that I've experienced.
Those are the expenses associated with poor quality data. Boy, you know, you could focus on time, you could focus on confidence. There are things that you can measure. I would certainly focus on the quantitative side of things. Ask people what they can't do because the data stinks or the data is not suitable for their purpose, ask them what they would be able to do, and be able to measure and quantify those types of things. One last thing I want to add to that is that I typically suggest that organizations don't focus on ROI specifically from data governance. I say focus on how we're going to measure success, how we're going to quantify the business value to the organization, by all the other initiatives in the organization that are really the ones where organizations are investing their money, whether it's building a data platform or an analytical platform or master data management or building your data warehouse. If organizations are spending a lot of money there, they should recognize that they're not going to get the value out of those resources unless people trust the data and have confidence in the data, and there's high quality in the data. And if you don't believe that to be a fact, ask people within your organization that have not demonstrated ROI from specific initiatives, and almost certainly they're going to point back at the quality of the data in the organization. So, four minutes left. I'm going to try and get to as many questions as I can here, Bob. Can you expand on regulatory issues with poor data quality? I thought it would fall under the data compliance requirement under data governance. Well, I'll just use a simple example: data is missing. If it's incomplete, or it's missing or it's inaccurate, and you're trying to comply with regulations.
There is a connection there. I kind of anticipated that there would be a question about that. The only problem is I didn't anticipate a great answer to the question. There needs to be that connection that is made. Regulatory compliance has a lot to do with people understanding the rules associated with the data, understanding the classifications and how that relates to how they handle the data. That may or may not be directly associated with data quality. But if you don't have the information that you can provide to the people that are using that data, the chances are that they may break compliance, they may not follow the regulatory management that's being imposed upon the organization. So I think there is a direct connection there. I'm glad you called it out. I'd be curious as to what other people think too, so please provide that in the chat if you've got thoughts on it. Keep the questions coming, because we will get answers to the questions that are outstanding in the follow up email. So, are there any ideal tools to assist in the data quality initiatives, to support the data stores, etc.? Have these folks attended the data management days from Data Diversity to see what tools are available to them? I suggest that you do that, because there are tools that are specifically focused on this. I don't want to name any specific vendors. Certainly to improve the quality of the definition there are all the data catalog tools. There are a lot of tools that you can use to profile your data. There are tools that are parts of suites that provide both the catalog and the profiling capabilities. So are there tools? There are a lot of tools. I would certainly suggest looking to some of the presentations that are made by some of the vendors in the data days, I'm sorry, in the data management days, and in the sessions that Data Diversity holds throughout the year, because I would almost venture to say that there is a data quality component in each and every one of those tools.
And yes, you're referring to the demo days, and we have two days next year. Yeah, coming up next year on data quality. So, yeah, there are a lot of good options. I love it. So, I've got just a couple of minutes left, but I'm going to try to slip in one more. I'd love to hear your take on data observability and AI-based anomaly detection and synthetic data to resolve data quality issues. Shannon, we can't make it through a webinar or a presentation or anything without discussing AI, can we? No, it's such a hot topic. It's so hot. Yeah, it is. And so repeat the end of that question again, because there was a term that they used that I wanted to kind of jump on. AI-based anomaly detection and synthetic data to resolve. Synthetic data. So the question is, how synthetic is it? The same rules apply for the data that's being produced coming out of your large language models and your AI. Typically the same quality needs to be checked for that data as for the data that goes into the large language models. So I would say that you need to focus your quality efforts there. I wrote a chapter on this in my second book about some of the data governance challenges that are associated with large language models and artificial intelligence. And you'll see that a lot of those are data quality challenges, data management challenges. It's just a new front end to it that's creating this synthetic data based on these large language models. I would say it adds complexity by the fact that you've got to govern the data going in and you've got to govern the data coming out. Perfect, Bob. Well, thank you so much for this great webinar, and it's been so much fun ending the 2023 webinar series and season. Looking forward to 2024. Again, just a reminder, I will send a follow up email to everybody with links to the slides and links to the recording of this webinar by end of day Tuesday. So I hope you all have a happy holiday season. Happy New Year, everybody.
And happy holidays to you Shannon and happy New Year to you as well and everybody out there looking forward to seeing you on the flip side 2024. See you then. See you soon. Thanks all.