Hello and welcome. My name is Shannon Kemp, and I'm the Chief Digital Manager at DATAVERSITY. We'd like to thank you for joining the current installment of the monthly DATAVERSITY webinar series, Real-World Data Governance with Bob Seiner. Today, Bob will discuss using data governance to achieve data quality. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the upper corner of your screen for that feature. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #RWDG, Real-World Data Governance. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Now let me introduce to you our speaker for today, Bob Seiner. Bob is the President and Principal of KIK Consulting and Educational Services and the publisher of The Data Administration Newsletter, TDAN.com. Bob has been a recipient of the DAMA Professional Award for significant and demonstrable contributions to the data management industry. Bob specializes in non-invasive data governance, data stewardship, and metadata management solutions. And with that, I will give the floor to Bob to get today's webinar started. Hello and welcome.

Hi, Shannon. Hi, everybody. As I always say, thank you for taking time out of your busy schedules to sit in on a Real-World Data Governance webinar. Today, as Shannon mentioned, we're going to be talking about using data governance to achieve data quality. Oftentimes, I get questions about the webinars, and people ask me about the subjects that are coming up.
And I've gotten a couple of suggestions from people to talk about specific ways in which data governance applies to the real world. This is real-world data governance. It applies to the real world within your organization. And we're going to talk today about using data governance to achieve data quality. And I can tell you from firsthand experience that a lot of the organizations that I'm working with are focusing right now on data quality. So you might be one of those organizations that's focusing on those things. Before I get started, I always like to share a couple of the things that I'm working on and things that I'm involved with. I think if I added one more item, I couldn't fit them all on one page. But for now I wanted to point out that the next couple of subjects that we'll be addressing in the Real-World Data Governance series are going to be, first, using data governance to protect sensitive data. Again, another real-world example of how organizations are applying data governance. And in June, we'll be talking about using data governance to improve data understanding. Again, a real-world example of how organizations are using data governance and demonstrating value in their data governance programs. I always mention the non-invasive data governance book. I don't talk too much about non-invasive data governance during the webinar, but if you're interested in learning more, there are a couple of places where you can get that information. One is through the book that's available through your favorite booksellers. And another way is through the online learning plan that's available through the DATAVERSITY Training Center. So there's a non-invasive data governance learning plan. Just to point out to you, I will be speaking at several DATAVERSITY events coming up. In fact, the first one is actually next week, Enterprise Data World in San Diego. So I'll be speaking on building and using data governance maturity models in that session.
In June, I'll be back in San Diego speaking at the Data Governance and Information Quality Conference, which is a DATAVERSITY and DebTech International event. And then just this week, the agenda was announced for the Data Architecture Summit in Chicago. And I'll be giving a couple of presentations at that event too. So I hope to see you at these events. Please stop by, stop me if you see me, say hello. I'd love to meet you in person at one of these events. As Shannon mentioned, I'm also the publisher of The Data Administration Newsletter. A new issue was published yesterday. It comes out twice monthly, on the first and third Wednesdays of the month. So please check out tdan.com if you haven't before. And last but not least is KIK Consulting and Educational Services, which I describe as the home of non-invasive data governance. So if you're looking for more information, please visit kikconsulting.com. There's all you could ever want to know about non-invasive data governance at that site. In this webinar, I'm going to be discussing five specific topics about using data governance to achieve data quality. First, we're going to talk about defining data governance, and then about defining data governance in terms of data quality. And I'll give you some examples of how some organizations have done that. We'll talk about delivering roles for data governance specifically for improving data quality. Then we'll talk about selecting appropriate data quality processes to govern, and about using working groups within your organization to focus on these things. And last but not least, we're going to talk about measuring quality to demonstrate governance performance within your organization. So the first topic that I said I was going to talk about was the definition of data governance. And I often share the definition that I use for data governance in my webinars.
The other fact is that a lot of organizations look at my definitions and they cringe a little bit because they're worded somewhat strongly and there's a reason for that. And organizations that focus on really executing and enforcing authority through their data governance programs, that's really what data governance is all about. So I talk about data governance as being the execution and enforcement of authority and in the case of focusing on achieving data quality in your organization, we're going to be executing and enforcing authority to make sure that we're improving the quality and improving the value of the data within our organization. So I like to word it strongly, although some organizations look at it and say it's a little bit too strong and they temper it a little bit and I'll give you an example of that in one second. My definition of data stewardship is that it's the formalization of accountability over the management of data. And the reason why I say that is that if you've attended my webinars in the past, you might have attended the one where I say that everybody in the organization is a data steward. Anybody that has relationships to the data really need to be held formally accountable for those relationships to the data. So that's why I say stewardship is really the formalization of that accountability to the point that anybody who defines or produces or uses data as part of their job should be held accountable for how they define, produce and use data. My definition of noninvasive data governance is it's actually the practice of applying that formal accountability behavior. You know, and using noninvasive roles and responsibilities, which I'm going to share with you during this webinar, at least a model of what I consider to be the most important roles and responsibilities. In the noninvasive approach, really the idea is to apply governance to existing or new processes and not redefine all your processes for data governance. 
And so I have a pet peeve that I mention often, which is that I don't like calling things data governance processes, because it points at data governance and says that's the reason why we're doing these things. So rather than defining all these new processes and calling them data governance processes, the idea of being noninvasive is to apply governance to existing processes, or to new processes as they're being developed, and to assure the definition, production, and usage of data, as I said before, in support of whatever it is that you're focusing your data governance program on. So it could be regulatory compliance, security, privacy, quality, protection, any of those things that you're focusing your data governance program on. And really, noninvasive describes how governance is going to be applied to make sure that it's non-threatening, although we know data governance can be a culture-changing event. We're going to try to do this in a way that's non-threatening to people. We're going to identify who's doing what with the data, recognize people for that, and label them as data stewards. Really, the overall goal of being noninvasive in your approach is to be transparent, supportive, and collaborative. And as I mentioned earlier, there's a lot of information out there about noninvasive data governance. If you're looking for more information, please reach out to me. As I told you, oftentimes the definition that I use looks like it's worded a little bit strongly, or maybe for some organizations it's worded a lot strongly, and they like to temper it a little bit. And so here's an example of how a client took my definitions of data governance and stewardship and put them together to make sense for their organization. They talked about data governance as being the formalization of accountability and repeatable behavior for managing data as it progresses through the data lifecycle, basically to support different things in the organization.
And in the example of defining data governance in terms of data quality, it could be to support data quality initiatives, such as the ones that are listed on your screen. It could be for MDM, master data management, or for analytics. A lot of organizations are looking to improve their analytical capabilities, and they know that they have to have good data in order to do that, so they focus on improving the quality of the data around analytics. It could be information asset management, or responding to an audit. I've had a lot of clients tell me that they were audited and were told that they need to focus on putting more formal governance in place. So all of these things are different initiatives that might be associated with improving the overall quality of data within your organization. There are other types of data quality initiatives that might not look on the surface as being about data quality. Certainly the security and the protection of sensitive data is one, and that's what we're going to be talking about in next month's webinar, using governance to focus on those things. So if you've got PHI, protected health information, or PII, personally identifiable information, a lot of organizations focus on improving the quality of their ability to protect the sensitive data by documenting the business rules, the user rules, and the sharing rules around data. The other items there are regulatory compliance, business intelligence, data warehousing, data lakes (or data swamps, as some people are calling them), and data integration. Whatever your data quality initiatives are focusing on, you can take your definition of data governance and change it so that you're really defining the point of data governance and what you're trying to do, how you're trying to improve quality through your data governance program. Some people have shared with me that when you define something, you should do it in a SMART way.
So basically, the SMART acronym is specific, measurable, achievable, results-driven, and time-bound. What I always suggest is that you give your definition of data governance some teeth, meaning give it something that people might react to. Oftentimes you're going to get a reaction, like I said, from those people, and they're going to give you an opportunity to describe how governance can be non-invasive and why it's very important to make sure that we formalize accountability and that we execute and enforce authority over the management of data. So it's really important not only to have a definition of data governance that you can explain well to the organization and that is measurable, but also to give it some teeth. Give people some reason to respond, to sit forward in their chair, and to really listen to you about what data governance is and how it can be applied at your organization. So I had mentioned five different subjects I'm going to talk about. The first one was the definition of data governance and applying it to achieving data quality. The second one was delivering roles appropriate for improving data quality. And if you really want to know why some of these roles are important in improving data quality, well, the truth is that things don't get done unless somebody is made formally accountable for getting them done. So somebody in an organization has to be formally accountable for recording or collecting data quality issues and understanding how those issues are impacting the business. So we need to make certain that there are people within the organization, maybe it's your data stewards, that have the responsibility for recording data quality issues and then logging them through whatever format you give them to record them.
So that being said, obviously we need somebody to record these data quality issues, but we also need to have somebody on the receiving end and on the processing end of those data quality issues. So we need to make certain that there's a role associated with that in our program, that somebody, maybe your data governance team or your data governance office, has that responsibility for receiving and then accepting and then processing your data quality issues. Certainly somebody needs to be formally accountable for taking your data quality issues and prioritizing them in such a way that resources can be assigned and that certain issues can be addressed in the organization. So we need to make certain that we have a role associated with prioritizing the quality issues, and then we also have to have a role that decides which issues are going to receive resources. These days a lot of organizations are pretty resource-strapped and they don't necessarily have an overabundance of resources, so when resources are applied to data quality issues, somebody needs to make that decision that we're going to apply these resources to solving the problems. So just to expand on these four different things that I just mentioned: somebody needs to have the formal accountability to record the issues, and that's typically the data stewards, the people who are defining and producing and using data as part of their job. As I've said often before, anybody or everybody in the organization potentially could be a data steward, so we want to make sure that we give them the ability and the knowledge that there's a way for them to record and report data quality issues.
Somebody needs to have that formal accountability to receive the issues. As I mentioned before, oftentimes it's a data governance office or a data governance team or a data governance manager, but somebody has to have that responsibility for receiving them and taking them through a process to make sure that they get prioritized. That being said, we need to have somebody in the organization that has the formal accountability to prioritize those issues, and typically it wouldn't be the data governance office if they're just the facilitators of the program. We want to have what I've been known to call data domain stewards or enterprise data stewards, or subject matter experts around data, to help us understand what the impact of the data quality issue is and to prioritize them accordingly. I'll talk a little bit more about that in the upcoming slides. And then ultimately somebody needs to have the accountability to decide which issues are going to be resolved and who will work on those issues, and that would typically be the data governance council or some similarly named body within your organization.

I'm going to share with you here real quickly, and it's not going to be the subject of this webinar, but there is a webinar later in the year that's going to talk about this non-invasive data governance framework. The reason why I've kind of crossed out the non-invasive data governance portion is that this governance framework could be applied in any approach that you use to data governance. I have written a white paper about the framework. If you're interested, I'll share it with Shannon and she'll make sure that you have access to it through the email that she sends out. But the first component of this framework is a set of roles and responsibilities. It's important. Actually, it's the first foundational component.
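As an illustration of the four accountabilities just described (record, receive, prioritize, decide), here is a minimal Python sketch. It is not taken from the webinar: the `Issue` class, the `advance` method, the stage names, and the sample names are all hypothetical, and the only thing drawn from the talk is which role is typically accountable at each point.

```python
from dataclasses import dataclass, field

# Stages in the order the talk walks through them, each paired with the
# role the speaker names as typically accountable for it.
# The stage labels themselves are assumptions made for this sketch.
STAGES = [
    ("recorded", "data steward"),
    ("received", "data governance office"),
    ("prioritized", "data domain steward"),
    ("decided", "data governance council"),
]


@dataclass
class Issue:
    description: str
    history: list = field(default_factory=list)  # (stage, person) tuples

    def advance(self, role: str, person: str) -> str:
        """Move the issue to its next stage if `role` is accountable for it."""
        if len(self.history) == len(STAGES):
            raise ValueError("issue has already been decided")
        stage, accountable = STAGES[len(self.history)]
        if role != accountable:
            raise PermissionError(
                f"'{stage}' belongs to the {accountable}, not the {role}"
            )
        self.history.append((stage, person))
        return stage


issue = Issue("Customer birth dates missing in the billing extract")
issue.advance("data steward", "Ana")
issue.advance("data governance office", "Raj")
print([stage for stage, _ in issue.history])
```

The point of the sketch is only that each step has exactly one accountable role, so an issue cannot move forward unless somebody in that role acts on it.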
In fact, a lot of organizations that are getting started with data governance focus first on defining the roles and responsibilities around their program. And oftentimes the way that you go about defining those roles, how they operate, and how you identify or recognize people to participate in these roles are predictors of the effort that's going to be required to govern the data. So if you take a non-invasive approach, assuming that you need to formalize people's responsibility based on their relationship to the data, then that might be easier than if you're going to assign people who are already busy 150% of the time to be things beyond what they already think they are. So really the approach that you take and the way you define your roles are going to be a significant predictor of the effort that's going to be required to govern your data. And the roles are represented in an operating model I'll share in a second. The pyramid is represented in the first column of the framework. Basically, the operating model describes all those different roles that I mentioned a second ago and some more roles. In fact, let me show you the framework right now. This is the overall framework for non-invasive data governance. But what I really want to do is focus on that first column. See if I can get that to highlight. Here we go. It's overlaid where it said non-invasive data governance framework, but it's just a data governance framework that could be used by any organization. And those core components across the top are really important. We need to define the roles. We need to define the processes, communications, metrics, and tools. And we want to look at it similar to how John Zachman did in the Zachman Framework, from a bunch of different perspectives and different levels within the organization. So we need to have executive roles, strategic, tactical. All of those roles are going to be important in how you set up your data governance program.
These are the governance roles that are going to be necessary to help you improve data quality in your organization. So the data governance office, as we mentioned before, is typically the facilitator of the program. Those are the people that are communicating about data governance and that are facilitating working sessions and working groups, which I'm going to talk about here in a minute as well. The data governance office plays a vital role. The steering committee, which is typically the executive level, is the one that gives the data governance office and the data governance council the ability to do what they do. I've always said that senior leadership must support, sponsor, and understand what the heck it is we're doing with data governance. So they do play a role. They may not play an active role in day-to-day improvement of data quality, but without their support, sponsorship, and understanding, the data governance program becomes at risk immediately. The data governance council, again, helps you to prioritize and helps to identify resources. The enterprise data stewards are the subject matter experts, and we're going to get them involved in the data quality issues associated with data in their domains. There's something that I've added to this group that I don't usually talk about, and I think I'm going to start talking about it moving forward, which is the use of working groups of people that cross the tactical and the operational levels of the operating model. I'm going to talk a little bit about what it takes to create the working groups and what some of the things are that you need to know in order to be successful with working groups focused on data quality. The data stewards obviously need to be involved when we need their information, when we need information about issues that they're having, and so do the data governance partners.
It can basically be anybody else in the organization, from your legal team to project management, communications, supply chain marketing, any of those areas that have an interest, even HR, that have an interest in data governance could be supporters of the data governance office and the activities that they're taking. So on this slide, I'm showing that operating model of roles and responsibilities, and I think if you go back and you look at the existing real-world data governance webinars from over the years, there have been multitude of them focused on roles and responsibilities. Again, I'm not going to spend the time going through this in a lot of detail, but we want to make sure that as a core component of our program, we're defining the different types of roles within the organization that are going to be associated with data governance. And the first reaction that people often give me to seeing this diagram is that there's too many levels and that they need to be cut back. And the fact is that many people think that this is very bureaucratic setup, but it's really not because it really mimics how a lot of organizations look to begin with. There's an executive level, a strategic level, tactical, operational and support levels within the organization. So again, I'm not going to describe this in detail if you want to see more information about it. Please visit a previous webinar or go to tdan.com. There's a lot of articles that I've written on the different roles and responsibilities associated with putting a governance program into place. There's another diagram that I show often when I'm showing this diagram, and that's the thing that I call the common data matrix, and some of you may have heard of it. 
It's a way of being able to inventory the different types of data and the different subject areas of data within the organization or sub-subject areas, or should I say domains and sub-domains, and who the subject matter experts are, the data domain stewards are for that data, you know, who are the stewards within the business areas that define, produce and use that data, and then things like the data governance partners, which IT is one of the biggest partners, typically of data governance. If your program doesn't reside in IT, certainly you're going to be two in a box working with the folks in IT. So you want to know where the data from each of those domains exist in different systems, and if you can use a spreadsheet like this or a common data matrix like this to record who the people are that define, produce and use data and who the subject matter experts are around the different domains or sub-domains of data, you're going to go a great way towards filling the appropriate roles associated with data governance, and you're going to be able to use those roles and engage the appropriate people that are necessary to solve some of your data quality issues. So those two tools kind of go hand in hand, and when Shannon sends out her follow-up email, there's links to those pictures if you can't just get it from the slide deck, and there will be information about those things in the follow-up email as well. So that's the next thing that we needed to do is talk about the different roles and responsibilities and how they might be engaged in your data quality initiatives. So the next thing I wanted to talk about was selecting the appropriate processes associated with data quality, the appropriate processes to govern. And I'm just using here as an example, and there may be several more than what I'm listing on the screen here, but I want to mention what are some of the data quality processes that you might want to consider governing. 
And so I mentioned earlier that the data quality issue submission and resolution process is something that's really important to the organization. There's data quality issue follow-up and reporting. There's data quality issue prioritization, and putting a cost on what it's going to take to resolve a specific data quality issue. There's data quality certification, whether it's for your applications or your data warehouse or your data marts or your data lake, making certain that the data follows certain rules before it gets loaded to that data resource. And then metadata management is another process that you might want to apply governance to, to improve the understanding and the quality and the knowledge about the data within your organization. So there's a handful of different data quality processes that you might want to consider governing within your organization. So let's go through each of those real quickly. Data quality issue submission and resolution: well, what I've suggested to many organizations that I've worked with is that we want to make certain that data governance follows best practice. And in order to get data governance to follow best practice, we want to make sure that we have certain processes around different things. In this example, we want to have a formal and repeatable process around submitting data quality issues. So what, you may ask, goes into making something a formal and repeatable process? Well, I would suggest that you create at least three different types of artifacts that would be associated with formalizing repeatable processes within your organization. The first one would be a data and workflow diagram showing how that process is going to work. Oftentimes organizations create swim lane diagrams that say, for each step, who has the responsibility for those things. And they go along with the RACI matrix.
You may be familiar with RACI matrices, which basically cross-reference the different steps of a process with the different roles associated with your data governance program, perhaps the ones that we just spoke about. For each step, we identify who's responsible, who's accountable, who we need to consult with, and who's going to be informed about these steps of the process. And the last thing is, in order to make something really a repeatable or a formal process, we need to provide a detailed description of what goes on during each step of the process. Another thing that you might want to consider around this is to have formal training for the people in your organization who are identified as the data stewards. Teach them that there is a process for submitting issues. Teach them what the process is, and provide them with a data quality issue submission form or something like that. But we need to train them. First of all, we need people to realize that since they have a relationship to the data, they have some level of accountability for that relationship. If they see something, they should say something. If they see an issue, these are the people that are the eyes and ears within your organization. So during your formal training, you want people to walk out of that training understanding that they really have a level of accountability for reporting data quality issues and following the process that you've defined for them. We also need to make certain that we have a cadence of communications and awareness with the data stewards to let them know that they are very important to the improvement of data quality within the organization. And again, somebody has to have the responsibility for carrying out that cadence of communications and awareness, and oftentimes that's the data governance office or the business analysts that are working with the data governance office.
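To make the RACI idea described above concrete, here is a small Python sketch of a matrix that cross-references process steps with governance roles, plus a check that each step is covered. The step names and the letter assignments are hypothetical examples, not ones given in the webinar, and the rule of exactly one Accountable party per step is a common RACI convention that is assumed here rather than something the speaker states.

```python
VALID_CODES = {"R", "A", "C", "I"}

# Hypothetical RACI matrix for an issue submission process.
# Each step maps role -> string of RACI letters ("RA" means both).
RACI = {
    "submit issue": {
        "data steward": "RA",
        "data governance office": "I",
    },
    "receive and log issue": {
        "data governance office": "RA",
        "data steward": "I",
    },
    "prioritize issue": {
        "data domain steward": "RA",
        "data governance office": "C",
        "data steward": "I",
    },
    "assign resources": {
        "data governance council": "RA",
        "data domain steward": "C",
        "data governance office": "I",
        "data steward": "I",
    },
}


def validate_raci(matrix):
    """Return a list of problems; an empty list means every step is covered."""
    problems = []
    for step, assignments in matrix.items():
        letters = [letter for code in assignments.values() for letter in code]
        unknown = sorted(set(letters) - VALID_CODES)
        if unknown:
            problems.append(f"{step}: unknown code(s) {unknown}")
        if letters.count("A") != 1:
            problems.append(
                f"{step}: needs exactly one A, found {letters.count('A')}"
            )
        if "R" not in letters:
            problems.append(f"{step}: nobody is Responsible")
    return problems


print(validate_raci(RACI))
```

A check like this is one way to catch a step that nobody owns before the process is rolled out to the stewards.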
And then somebody also has to have the responsibility for formally reporting data quality issues, how many have been collected, how many have been closed, how many have been worked on, how many have been prioritized, and those types of things. So we want to make sure that we have a formal method for reporting data quality issues to the people in the organization that want to see the value that data governance is adding to the organization. Another process to govern would be the data quality issue follow-up. And again, some of these things might seem to be things that would be obvious, things that you want to do, like having an immediate email thanking some of the stewards for submitting issues, you know, having a response or following up with the stewards to clarify understanding of what they have recorded as being a quality issue and all the emails that need to go to them. Once the issue has been prioritized or once the council has made a decision about if they're going to apply resources or even about the application, about getting people who have been assigned to address these data quality issues, you want to make certain that the data stewards are getting feedback and that they understand. So all of these things seem pretty obvious as things that we know that we need to do. But again, the fact is that there has to be a role within your data governance program to make sure that these things get done. So oftentimes, again, as I mentioned before, that that's the data governance office that has that responsibility. For data quality issue reporting, as I mentioned, there's different status that you can report for the different data quality issues. The issues that are open and collected, closed, and why were they closed is the issue no longer an issue. Was it really just a question rather than an issue? How many issues have been reviewed or prioritized and which ones have had resources assigned to them? 
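The status reporting described above can be sketched as a simple count over an issue log. This is only an illustration: the status labels, the `closed_reason` field, the `status_report` function, and the sample issues are all assumptions made for the sketch, loosely following the categories the speaker lists (open, reviewed, prioritized, resources assigned, closed and why).

```python
from collections import Counter

# Hypothetical status labels, loosely following the categories in the talk.
STATUSES = ("open", "reviewed", "prioritized", "resourced", "closed")


def status_report(issues):
    """Summarize an issue log: totals per status and reasons for closure."""
    unknown = {issue["status"] for issue in issues} - set(STATUSES)
    if unknown:
        raise ValueError(f"unknown status value(s): {sorted(unknown)}")
    by_status = Counter(issue["status"] for issue in issues)
    closed_reasons = Counter(
        issue.get("closed_reason", "unspecified")
        for issue in issues
        if issue["status"] == "closed"
    )
    return {
        "total": len(issues),
        "by_status": {status: by_status[status] for status in STATUSES},
        "closed_reasons": dict(closed_reasons),
    }


issues = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "prioritized"},
    {"id": 3, "status": "closed", "closed_reason": "was a question, not an issue"},
    {"id": 4, "status": "closed", "closed_reason": "no longer an issue"},
    {"id": 5, "status": "resourced"},
]

print(status_report(issues))
```

In practice the same counts could come from a spreadsheet or a ticketing tool; what matters is that somebody is accountable for producing the report on a regular basis.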
And people want to know when their issue is going to be addressed, or the status of the issue, and that really helps to encourage the data stewards to get more actively involved in submitting data quality issues. Prioritization. A lot of organizations will take the data quality issues that they have and prioritize them. And here's how one organization I've worked with has done it. They basically created three categories: critical, medium, and low priority issues. Typically, a critical issue would be an issue that has resources already assigned or that will have resources assigned to solve the problem. A medium determination, as far as prioritization is concerned, would be that you're considering the assignment of resources but you haven't necessarily assigned resources at that time. And if it's low, oftentimes it's worth keeping on the list, but the assignment of resources is not necessarily being considered yet. So again, that's another of these data quality processes that we need to make certain we govern in order for our governance initiative to focus on data quality. Data quality issue costing, well, that's a difficult issue for a lot of organizations. How can we identify what it's going to cost to solve a problem? And oftentimes I suggest that we look at it in the opposite way as well: what's it going to cost us to not solve the quality issue? And here are some examples that I've used repeatedly within organizations: whether this data quality issue is causing us to lose revenue or miss specific opportunities, whether we're going to lose business because of it, or whether we've got inefficient and ineffective processes that are unnecessary and it's costing us too much to do things.
Whether a data quality issue could lead to some catastrophe within the organization or just increase the risk associated with the data in your organization, or maybe the data quality issue is making it such that we can't share data between processes in different parts of the organization. So when we are focusing on applying governance to this, we want to make sure that we can answer two questions: what's it going to cost for us to resolve this issue, and what's it costing us to have that quality issue, whether we're working on it or not working on it presently within our organization? Data quality certification. I mentioned that data quality certification is typically focused on data that's feeding the data warehouse or your BI environment, your data marts, your data lakes. So you might want to be able to answer these questions when you're going through the data quality certification process. Is the data accessible to people? Is the data accurate? And oftentimes it's hard to determine whether the data is accurate unless there are some standards that are created around the data. And I'll talk about that in a minute as well, which is why it's important to have groups of people that are working on developing data quality standards for your data. Is the data complete? Does it conform to the standard? Is it consistent? Is it timely? All those things that we see as typical dimensions of data quality, we want to make sure that the data that makes it into these resources that people are going to count on is all of these things: accessible, accurate, complete, conforming and all of those types of things. So look at typical layouts of data quality dimensions and use those when you're focusing on improving your data quality certification process for data that's making it into these types of resources. Metadata management, as we all know metadata is data about data. 
It's also data about the definition and the production and the use of the data. So anything that helps to answer the questions of definition, production and use are things that we should consider as part of our metadata processes, and maybe it's something to collect within our data governance tools or our data dictionaries and glossaries and catalogs and those types of things. But the one thing that people don't mention a lot is that some of the metadata also includes stewardship data. So everything that was collected in the common data matrix that I shared earlier is stewardship data. It's data about who has formal accountability for the different aspects of data and data quality within the organization. And the truth is when it comes to metadata that somebody has to have the responsibility for that, for collecting the metadata, validating the metadata, and making the metadata available to the people that need to use it. I'm going to be talking about metadata governance at the data architecture summit later this year. But that's a very important subject for a lot of people because, again, the metadata won't manage itself. The metadata won't manage itself. It really requires that somebody in your program has the responsibility for collecting, validating and making that metadata available. And here's some client examples of data quality activities. And you can see they really cross a whole lot of different areas. There's the development of requirements definitions. There's standards definition. There's quality assessment, usage, profiling and cleansing. All of these different things are activities within the organization, and in order for them to be done, we need to have resources that have the responsibility for doing these things. 
So again, I know I'm going through some of these slides relatively quickly, but I hope that when you go back through these slides later, some of these lists are helpful to you to see what other organizations have done around improving the quality of their data. So you might want to also talk about creating working groups within your organization to focus on data quality projects. And one thing that we need to do is we need to know where data quality problems often occur. And if we can target those different places or those different times and we can apply resources to collect issues around the data, that's going to be very beneficial to our program. So typically the data stewards, the people that define, produce, and use data in the organization, potentially everybody, when they recognize that there's a problem with the data that they're using for their everyday job, need to have an outlet. They need a way of being able to report that as an issue. And so first of all, I'd go to the stewards and say, what issues do you recognize? And then I'd prioritize them so that we can address them in the appropriate manner. Oftentimes data quality problems occur when data is being requested by somebody from somebody else in the organization, or when it's being merged or integrated for business purposes. Oftentimes data quality issues take place when data is being created as part of a new or an evolving process within your organization. Or even when data is acquired in your organization from an outside source. We would need to make certain that the data that we're bringing in is of high quality and that it matches the standards and the business requirements that we have for that data within our organization. We also need to recognize the need for the working groups to solve a business problem. So here's some examples of what I would consider to be a data quality problem. 
That the data is not consistent or it's incorrect. There's people that have the ability to change specific data that shouldn't really have their hands in changing the data, or the data is not locked down the way that it needs to be, and there's people that have the ability to change data to make it inconsistent or to make it not follow the standards we've set up within our organization. There's also oftentimes known and documented data issues: data is not linked across systems, there's data that should be eliminated, there's too much data, or there's old data that can be retired. Hierarchies of data within organizations, I see this quite a bit, are not synchronized, or there's institutional knowledge about how the data is used that's not being documented appropriately. So all of these are potential needs for setting up working groups to solve a specific business problem associated with your data. So again, I know this is a... I went through that list kind of quickly, but those are things that I have seen from my experience as being times that we can recognize that, hey, maybe we can't solve this problem by ourselves. It would make sense for us to bring in or build a group of people associated with solving the issue. So the next subject I want to talk about is using these working groups to focus on the data quality project. So on the next couple of slides, I'm just providing a couple of things that, again, might be common sense to people, but they are steps to convene a working group to develop or to deliver or to improve data quality. So the first thing that we need to do is recognize and communicate who in the organization needs to be involved in this working group. And we need to be able to specify that to them. They're not going to volunteer unless they're voluntold to be a participant in the working group, but we need to have them understand what we're going to need from them. So what's going to be the ask of the people in these working groups? 
Estimate and share how much of their time is going to be expected. And certainly develop a planned approach to how you're going to achieve the results in improving the data quality. So, again, a lot of these things might seem like common sense, but when you're convening a working group, it would make sense to be considering these things as you're starting to bring that group together. There's also steps to gain approval of the working group. So oftentimes people are not going to be allowed to participate or be able to participate in the working groups unless there's been some level of approval for them to do so. So we need to identify early on who will need to approve the fact that this working group is going to be convened and that it's going to focus on solving specific data quality issues. We need to formalize this, whether it's within one single department of your organization or whether the data quality issue is going to impact people across different parts of the organization. We need to formalize the different departments' involvement in the data quality effort. And we need to gain the approval of all those different departments as we start to move forward with the usage of the people within the working group. And finally, some steps to get the data quality effort underway. Well, these are things that, again, might seem to be somewhat obvious, but the things that you might want to consider when you're creating the working group are to determine who's going to be the person that's responsible for the effort, and to set agendas for meetings, which is very difficult to do, especially with some of the organizations I've worked with where a lot of people are just up to their eyeballs in meetings. So we need to schedule the meetings. We need to set an agenda for the meetings and what the meetings are going to focus on. So again, somebody has to have the responsibility for doing these things. 
Scheduling people's time, arranging meeting logistics, whether it's going to be at a certain place or whether it's going to be through web meetings or WebEx or any of the different tools, Skype, that are available to organizations. Somebody needs to take care of arranging the meeting logistics. And they need to, again, there needs to be somebody who distributes the invitations and tracks the responses and that the people that have been approved and that have been willing or have stated that they're willing to participate in the efforts are going to be able to attend the meetings as necessary to help us to use the working group to resolve the data quality issue. So let's also spend a minute talking about the methods that are directed at improving data quality. Let's talk about those three primary actions, the definition, the production, and the usage of data. Data working groups, or data quality working groups focused on improving the data definition and the standards, improving how the data is being produced and collected, how the data is being used, and getting people to improve their understanding of the business rules associated with that data. Improving quality through data documentation and metadata, I want to spend a brief second on each of these as well before we wrap up here today. So the first one was improving data definition and standards. So you might consider having these working groups focus on efforts to build and approve and maintain and to share a business glossary of terms. So a lot of organizations are building business glossaries which are related to the data, but they're not really about the data. They're the terminology that we use within our organization. So when we talk about somebody as being a customer or a client or a student or something like that, we have a shared understanding of what these terms mean to our business. 
There's also efforts to build, approve, maintain and share things like data dictionaries of critical data elements. Typically data dictionaries, from my experience, have been associated with a specific data store. So organizations can have lots of different data dictionaries in lots of different places. That's not the ideal approach. The ideal approach would be to have a centralized data dictionary, but since a lot of these organizations have grown through acquisitions and mergers and things like that, they may not have a single centralized place for recording their data dictionary definitions of those critical data elements. And there's the efforts to define standards for the critical data elements. Again, that's another use of the working groups to solve problems, to improve the quality of data within our organization. And by improving the data definition and the standards, we can look to improve our data production. If we understand how the data is going to be used and the systems are locked down to collect and receive quality data, it's going to improve the data production. It's going to improve data usage if people understand the data and understand the rules associated with using that data. And certainly, if we improve data definitions and standards, we're going to see improved data shareability in the organization, within a department, across departments, across subject areas, however you share data within your organization. So let's focus on, you know, what's the result? So what types of efforts can we put in place to improve data production and collection? We can have efforts to assure that data entry meets the standards that have been defined for the data. There's the efforts to assure that the data acquired from external sources is matched up with, again, the standards and the data requirements that we have within our organization. 
And make sure that we can even see organizations that have had working groups that focus on specific information systems and making certain that the data they collect within those systems is of high quality. Oftentimes things like free form fields make it very difficult to follow standards for data across your organization. And so what are the results of improving data production and collection? What we're going to see improved quality from all those perspectives from the accessibility, the accuracy, completeness. We're going to improve data quality. We're going to improve data usage if the data, if people trust and have confidence in the data. So that's going to be a result of improving data production and collection. And it's also going to improve our ability to share data either within the department or across departments within our organization. Another way to use the working groups is to define efforts to assure that data entry, this is for data usage. I meant to update these, but to improve data usage, I think this actually shows what was on the previous slide. So I'll update those. But efforts for people to understand the data better, to understand the rules associated with how they should define, produce and use data, how they share data. We want people to understand what data needs to be protected by the organization and the handling rules associated with data that's classified a specific way. And one of the things that we should see as a result of improved data usage and understanding of the business rules is that we're going to have improved data compliance. We're going to have improved data protection and improved understanding, which will lead the people who are using the data to make better decisions. Again, the better they understand the data, the better they're going to be able to use that to make important decisions about the organization. 
Improving data quality through the documentation and metadata, I see a lot of organizations that create these working groups to focus on building those business glossaries and those dictionaries and the data catalogs that can record any type of information about the data that you so choose. So I've heard the term data catalog used within a lot of organizations. There's not necessarily a specific way that organizations define what a catalog is, but oftentimes it's glossaries, dictionaries, and then these catalogs of reports and available data and those types of things that are important. And the creation of these working groups can often help us to go a great way towards assuring that these things are complete and they're usable within the organization. And we can use these groups to assure that data definitions meet business requirements. One of the things that I hear as the biggest complaint in a lot of organizations about their information resources is that they feel like they've got the data but they don't know where it is. They don't know how it's defined. They don't know how it's used, or how it can be used. So we can improve the definition, production and usage of the data through improving the documentation and the metadata. We can improve the capabilities that are required in order to use data as a cross-functional asset. We can improve the understanding of many of the information resources. So there's a bunch of different ways that we can leverage these things that I call working groups to help us to achieve data quality within our organization. So the last thing that I want to talk about before I turn it back over to Shannon for questions and some answers is measuring the quality to demonstrate governance performance. So a lot of organizations use different dimensions of data quality, which are basically the adjectives for the aspect of data quality that we're addressing and we're measuring. 
And they want to answer the question, so what do we consider to be quality data within our organization? Quality data is the data that's trusted and fit for intended purposes, whether those purposes are operational in nature or whether the data is used for decision making and planning. Define for your organization what you mean by quality data and then target your working groups on improving those aspects of the quality that are going to help you to reach some of the goals that you've set up for your data governance program. The data quality processes control the usage of data, and data quality improves when people focus on making sure that they're improving how the data is defined, produced and used. So that's pretty simply put, but those are the things that we need to focus on as organizations to achieve data quality. If we can improve how data is defined, that's going to lead to better data production and that's going to lead to better data usage, so a lot of it starts with the definition of the data. Some organizations start by focusing on data modeling as their way to provide the kind of definition that they need for their data. The last thing I'm really sharing is, well, what are some of those dimensions that organizations use to describe the different aspects of data quality? There's the accessibility, which is the ability for people to get to the data that they need, the required data. There's the accuracy, which is about how closely the data matches its original intent. There's the completeness of the data, the conformity, the consistency. These are all aspects of the data. These are the adjectives that I mentioned, the data quality dimensions that are to be defined across the organization. There's the integrity of the data, the timeliness, the uniqueness, and the validity of the data across your organization. 
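As an illustration of how a few of the dimensions just listed could be turned into numbers, here is a minimal sketch that scores completeness, uniqueness and validity over a handful of records. Everything in it is an assumption made for the example: the field names, the sample records, and the deliberately simple email pattern used as the validity "standard". Dimensions such as accessibility or timeliness would need other inputs and are not covered.

```python
import re

# Hypothetical sample records; None marks a missing value.
records = [
    {"customer_id": "C001", "email": "ann@example.com"},
    {"customer_id": "C002", "email": None},
    {"customer_id": "C002", "email": "bob@example"},
    {"customer_id": "C004", "email": "dee@example.com"},
]

# Assumed validity standard: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def present_values(rows, field):
    """Return the non-missing values of a field."""
    return [r[field] for r in rows if r.get(field) not in (None, "")]


def completeness(rows, field):
    """Share of rows where the field is populated."""
    return len(present_values(rows, field)) / len(rows)


def uniqueness(rows, field):
    """Share of populated values that are distinct."""
    values = present_values(rows, field)
    return len(set(values)) / len(values) if values else 1.0


def validity(rows, field, pattern):
    """Share of populated values that match the agreed standard."""
    values = present_values(rows, field)
    if not values:
        return 1.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


scores = {
    "completeness(email)": completeness(records, "email"),
    "uniqueness(customer_id)": uniqueness(records, "customer_id"),
    "validity(email)": validity(records, "email", EMAIL_PATTERN),
}

for name, score in scores.items():
    print(f"{name}: {score:.0%}")
```

Scores like these only mean something once the organization has agreed on the standard behind each one, which is why the definition work comes first; recorded periodically, they give a simple way to show whether a dimension is improving.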
When you're looking to measure the quality to demonstrate that you're improving the performance of the data through your governance program, then it helps to have data quality working groups that are focusing on improving all of these different dimensions or improving a specific dimension. For example, I've seen a lot of organizations that are focusing on the accessibility of the data. Do people have access to the data that they need to perform their job function? Well, you can measure that and then you can measure the improvement of that, so oftentimes it makes sense to define kind of a benchmark when you're getting started, but then to also identify how we've improved how the organization has functioned in association with each of these different dimensions associated with data quality. I hope that, in a relatively short period of time, these lists of things that I've provided will be of benefit to you. We talked about using a definition of data governance, but then focusing it in terms of data quality for your organization. We talked about defining different roles appropriate for improving quality, selecting appropriate processes to govern, using these things that I'm calling working groups, and you may call them the same within your organization, I guess I'm not too original, to focus on data quality projects, and then measuring quality to demonstrate governance performance across your organization. So those are the things that we talked about today. What I'd like to do is I'd like to turn it back over to Shannon and see if we've had any questions from today's webinar. Bob, a lot of great questions coming in already. Of course, to answer the most commonly asked questions, just a reminder, I will send a follow-up email by end of day Monday for this webinar with links to the slides and anything else requested throughout. That includes the framework that you were mentioning. So what is metadata management in a data quality process? What is metadata management? 
We know what metadata is. We know that metadata is the data about the data. But you've got to take a look at how the metadata is going to improve the quality of the data. So when it comes to definition of the data, you want to make sure that people... that it's defined in a way... I always use the term... don't use cheeseburger definitions. Don't say that a cheeseburger is a burger with cheese. Don't say that a patient account is an account of a patient. But use business terminology so that people improve the understanding of the data. And the standards for what the data should look like. And you can use those definitions and those standards to measure the existing quality of the data and then to measure the improvements of the quality of the data. So if only half of your critical data elements are defined properly or defined well or defined in such a way that the business areas can use those definitions, we can measure how we've improved on the definition of the data. We can associate that with people's understanding of the data, which then relates directly to the quality of the data. So metadata seems to be the backbone of everything. I mean, the information that people have about the data will help them to define, produce, and use data in a more high-quality manner. All right. And organizationally, is it more effective to have the Enterprise Data Governance function be accountable to the IT function or the risk office? You know what? That's a great question. And you know what? It's different in different organizations. The first thing that I always say, it's got to reside somewhere. So, I mean, if it resides in IT, but it's not necessarily being sold as being an IT initiative for IT's sake only, then you have a lot better chance of succeeding than putting it within a business area. Then, you know, if you don't really want IT to run it for IT's purposes, you want to have heavy business involvement. 
You know, the idea of putting data governance under a risk management part of the organization oftentimes makes sense. It really depends on what you're trying to achieve with your data quality initiative or with your data governance initiative. If you're looking to improve compliance and regulatory concerns or looking to improve how you're protecting sensitive data and the privacy of the data, then maybe it makes sense for it to fall under the risk management part of your organization. There is really not one answer as to where it should reside. I will share this with you, though. I'm finding more and more often that data governance tends to reside under a shared services part of the organization because data governance is not necessarily specific to IT or specific to the business or specific to risk management. If it's in a shared services part of the organization, the likelihood that it can address all of those things is much higher. So I suggest you at least consider that: if your organization has things like shared business services, data governance could reside under that umbrella. How do you get the organization to raise their hand and say they have data quality issues, let alone want to fix them? Well, that's part of the purpose of your data governance program. And so how do you get them to submit issues? Well, that's a good question. If it's causing them enough of a problem in their daily operations and their daily routine, oftentimes they'll feel like they need a way to be able to document what the data quality issues are. So you can provide incentives to people, you can recognize people for submitting data quality issues that are being addressed. There's a lot of different ways that you can kind of incentivize people who are data stewards to submit data quality issues. Those are just a couple of the different ways that I would suggest. 
But, you know, we want to make sure that there's no negative repercussions from entering data quality issues, so that people don't get that idea. You can have them do it in a way that's anonymous, so that you don't necessarily know who the person is that submitted the issue, but that's not really helpful because oftentimes you need to go back to that person and get clarification of what the issue is that they're documenting. There's a lot of different ways that you can kind of incentivize people to submit issues. Just make sure that you're constantly communicating with them that there is a way for them to submit issues. And, you know, you don't want to open the flood gates or open Pandora's Box, but oftentimes in the early phases of a data governance program, you collect a lot more issues than you're going to be able to address. So to me, it makes sense to provide that channel and, again, to let people know that it's okay for them to take their time to document the quality issues that are causing them problems. Moving on to the framework a little bit, Bob. You know, our organization is fully entrenched in the Agile framework. We have Agile come up a lot. Have you seen process or product owners serve as data stewards in an organization that is largely focused on process and technology? It's a challenge getting a share of mind for the data. Thoughts on this? Yeah, you know what? That is an issue for a lot of organizations. I think I shared with you earlier that I have the book on non-invasive data governance. If there's going to be another book, it's going to be on bridging the gap between data governance and Agile. You know, one of the things that I see in organizations is that the Agile groups and the data people aren't necessarily sitting together in the lunchroom. They're not necessarily talking. 
You know, what I've always suggested is that if you can get a data person involved, even as a fly on the wall in the Agile meetings, and they can document what the data issues are without slowing down or stopping the meetings, that's going to be a very important step for integrating data governance into an Agile effort. So having people from data governance or from the data management part of the organization involved in the Agile efforts helps to make certain that there's no issue that's swept under the rug. And that's what I would suggest, is if you can get your data people involved in the Agile scrums and the meetings, then at least there's somebody there to document potential issues around data quality. But that's an issue that a lot of organizations are running into, because they are seeing that Agile is the way that organizations want to move, but they also recognize that data is a valuable asset, and they need to include data people in the conversations, even if it's going to be following an Agile process. And you said that metadata won't manage itself, but there are tools for DG. Which features do you recommend? You know what, I like the tools that aid people with their relationships to the data. I like tools that will help you to improve the quality of the definitions. I like tools that can be used to provide collaboration with people in the organization. So tools that have those types of aspects to them become very important. And the ones that collect the important relationships between people and data, that becomes very important as well. I know some people say that if everybody's a data steward, then nobody's a data steward. I don't necessarily agree with that. 
There's different types of stewards that I mentioned earlier, but we want to make sure that we know how data is being defined, produced, and used across the organization, and if the metadata collected in the tool can do those things, then I say that it's a valuable tool for you to use. And if anyone has examples of pulling out, do you have any examples? This is asked to everybody in general, really. Do you have examples of pulling out the institutional knowledge into artifacts and documentation? One more time, is that a question? It's more of a request. If you have any examples to pull out the institutional knowledge into artifacts and documentation. You know what, I think that's a great question, actually, or a great comment. And so if you look at the artifacts that are provided as part of Shannon's follow-up, you'll see that a lot of the information that's collected in those artifacts is metadata. So if you have any questions about those, please reach out to me through Dataversity, or reach out to me directly, and I'll be glad to talk to you about ways that organizations have collected that knowledge and used it to improve the quality of the data within their organization. And I got so excited about all the great questions coming in. I'm afraid we are a little over the top of the hour there. But do keep the questions coming in. If you have more questions, I will get them over to Bob, who will write up answers for you for me to include in the follow-up email, which will go out by end of day Monday with links to the slides and links to the recording. And I may send it a little early, so don't be surprised if you get it a little early, since we'll be in San Diego next week for Enterprise Data World. We'll try and get that done before the conference. But we'll for sure get it out by end of day Monday. No worries. So I hope everyone has a great day. I hope to see you next week. And thanks for everything. Thanks, Bob. All right. Thank you, Shannon. 
See you again, folks. Enjoy. Have a great day.