Hello and welcome, my name is Shannon Kempe and I'm the Chief Digital Manager of DATAVERSITY. We'd like to thank you for joining the current installment of the monthly DATAVERSITY webinar series, Real-World Data Governance, with Bob Seiner. Today, Bob will discuss using data governance to improve data understanding, sponsored today by Couchbase. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Click the chat icon in the upper right-hand corner for that feature. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share our highlights or questions via Twitter using hashtag RWDG. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now, let me turn it over to Allie for a word from our sponsor. Allie, hello and welcome. Hi there, Shannon. Thanks a ton. Welcome, everybody. Thanks for joining us today for the webinar. So I just wanted to quickly talk about the Couchbase Data Platform and specifically how to create really amazing interactions to drive your customer transactions. So our topic today is data governance. And this kind of falls into, you know, you have all this massive amount of data that's being generated in many different places. Well, how do you wrangle that data? How do you manage that data? And how do you get it into a place where you can actually drive insights into that data? So I always like to start off with this kind of interesting fact, right? Customers are now the most powerful, disruptive force facing business. You may have seen the statistic that about 52% of the S&P 500 has disappeared in the last 15 years. So that's a lot of churn.
And the most interesting part of that is the reason that's happening is because of customers. It's not really about pricing. It's not really about products or competition. It's about customers. And it's about how customers are influencing other customers. That's the most powerful force disrupting businesses today. And it's all driven by digital transformation. That's really the key to success here. But we have a problem because 90% of those digital transformation projects fail and they're failing today. And that's because they're not meeting the business case that justified that project in the first place. And behind that failure rate, interestingly, there's another statistic. And that's that a large part of the problem is what happens to the data that underpins everything you're trying to achieve for your customer. Because what makes your customer experience stand out is when you can deliver something better than everyone else can as a result of using data intelligently. So here's why that's a problem today. Most likely your legacy infrastructure is focused on one part of the customer journey, the last part. And that's the part where it becomes a purchase or it's a video to be played or a flight to be booked. So we've built systems to house that data and those systems are called systems of record. And most likely your database environment today is largely based on your investment in that area. But now the majority of activity is seen before actually getting to that transaction. There are now vast numbers of interactions and associated experiences that shape a customer's view of an organization. And those occur across multiple touch points. We look a lot more than we book. We browse a lot more before we buy. So when your customers are spending more time interacting than actually transacting, well then every interaction and experience they have with your brand counts.
A familiar statistic that we cite quite frequently: it's about 1,000 interactions to drive one single transaction. So that's a lot going on in the back end. So what does that mean? Well, we argue that old world systems aren't going to support this successfully; a new kind of database is needed. And we call this an engagement database. A database that's designed to support those customer interactions and all the massive amounts of data that's being generated. And that's what we've built here at Couchbase. So the engagement database co-exists with two other databases that most likely exist today within your ecosystem. Transactional databases exist for a reason. They need to provide that ongoing permanent record of what's occurred. But they're inflexible, they're rigid, and they can be really expensive. Analytical databases are also super important. They provide those insights. They sit behind those data lakes that you might need and more. But they're not fast, they're not like millisecond fast. And when they're not as fast as you need, you can't service your customers because they want data in real time. So today's topic is of course on data governance. And when I think about data governance, I think about this idea of many types of data being generated in many different ways and through many different channels. And how hard it is to manage all that data as a company. Let's talk about a specific example, something that we here at Couchbase see a lot within our customer base. So this idea of delivering a data solution that solves for the customer 360, or in other words, a single customer view. Well, this can be really challenging. Sure, it's easy for us to talk about what we need to achieve the ultimate customer experience. Of course, we need a real time single view of a customer that includes all integrated customer data in one place. We want to provide personalized experiences across all channels, devices, and business units all the time.
And we want visibility into those customer interactions across our business units and across our customer touch points. But it's really hard to achieve these things behind the scenes. Namely because, of course, of the legacy IT systems that only give us partial views of customers. Those rapidly changing customer expectations are placing great amounts of pressure on the legacy IT systems. And consolidating all that massive amount of customer data from multiple sources is hard. And many times it's almost impossible. That's where we fall short. And this is just one example, of course. But one of our customers, Comcast specifically, was having a really hard time with customer support within their call center. So specifically they were collecting lots of different kinds of data like customer comments, trouble call comments, work orders. But this data was sitting in different buckets. So agents did not have that holistic view of their customers when they called in. And as you can well imagine, this made for a really poor customer experience. But with Couchbase, we made it possible to connect each of these point solutions, these disparate systems, into one platform, into the Couchbase data platform, to provide that unified customer view. So now those call agents have a holistic, real-time view of their customers when they call in. So by making it easy to capture, to manipulate, and to retrieve the data involved in every digital transaction, and every digital interaction, I should say, the Couchbase data platform allows our customers to build enterprise grade, customer 360 solutions that deliver an accurate, real-time, unified view of customers across all channels and devices. And of course it's not restricted to one type of customer or one type of industry. We serve a wide range of companies, a wide range of industries, as we have a very broad technology.
So with that engagement data platform, at the end of the day, we help our customers leverage that massive amount of data that exists in the world, and apply that data to change people's experiences for the better. Our focus has been on helping our customers support their customers, and our data platform provides that solution. So with that, I'd like to hand it over to Bob, to dig more into data governance and what it means for you. Thanks. Allie, thank you so much. And before we turn it over to Bob, just let me give a quick introduction here and switch it over. So this is so great, and I love the splash at the end there of the final slide, and on your website, it's just, it's so cool. So, and thanks again to Couchbase for sponsoring. You can talk to Allie in the Q&A as well at the end of the presentation. Now let me introduce to you our speaker for today, Bob Seiner. Bob is the president and principal of KIK Consulting and Educational Services, and the publisher of the data administration newsletter, TDAN.com. Bob has been a recipient of the DAMA Professional Achievement Award for his significant and demonstrable contributions to the data management industry. Bob specializes in non-invasive data governance, data stewardship, and metadata management solutions. And with that, I will give the floor to Bob to get today's webinar started. Hello, and welcome. Good afternoon, for most of you maybe, and good morning to the rest of you. Happy summer, since we just started summer yesterday. It's really good to have you on this webinar, on this instance or this session of the webinar. Today we're gonna talk about using data governance to improve data understanding. I'm telling you, I don't think that Allie's conversation or presentation could have been any more timely, because look at all the things that organizations are trying to do with customer data.
You heard the expression they used, powerful disruptive force, and that really goes to explain what the information that we have about our customers means to our organization. And you know what, one way that we can really improve how that data is being used, especially to support analytic types of endeavors, should I say, within organizations, is to really get people to understand the data. So that's what we're gonna talk about today. We're gonna talk about using data governance to improve data understanding. I hope the information is helpful to you. Please stay interactive through the Q&A and through the chat, and we look forward to taking questions from you at the end of the session as well. Before I get started, I typically talk about some of the things that I'm involved in. I just wanna run through them real briefly with you. As you know, I do this real world data governance webinar series. I just wanted to highlight the topics for the next couple of months. Next month we're gonna be talking about improving data analytics with data governance. So we've talked about improving data protection and improving quality, or achieving quality, and now that we're talking about improving understanding, next we're gonna be improving data analytics. And that's really similar in a lot of ways to the conversation we're having today. In August I'll be talking about the non-invasive data governance framework. For those of you that are familiar with it, I look forward to presenting that topic every couple of years. So just on that topic, I also wrote a book called Non-Invasive Data Governance. You can find it at a lot of places. I will be speaking at a DATAVERSITY event coming up in Chicago, the Data Architecture Summit. I'll be doing two half-day sessions there. There's an online learning plan. If you're interested in learning more about non-invasive data governance, go to the DATAVERSITY Training Center.
I provided the URL that you could go to, and if you use the TDAN code, you can save some money too. So please go there. A couple other things real quick. I am also the publisher of the Data Administration Newsletter. If you're not familiar with that, please check it out. It is free and there's lots of great information on that site. And then there's KIK Consulting and Educational Services. And as I call it, it's the home of non-invasive data governance. So enough about that. Let's move on to the things that we're gonna talk about today. The way that I structure these webinars is that there's gonna be five subjects that I'm going to hit on. And these are really, really important when it comes to using governance and improving the understanding of data. And just think about it again in terms of what your company's trying to do with that data through insights and analytics, through customer 360, whatever it is that you're doing. You know, the metadata becomes really important. The basic understanding, the context of the data is critical to people. So we're gonna talk about metadata associated with improving understanding. We're gonna talk about, out of all the different types of metadata there are, selecting the appropriate metadata to help to improve the understanding of data. We're gonna talk about the different processes that are available to be governed that would focus on improving understanding. Everybody wants to talk about improvements in project ROI. So I'm gonna give a couple different examples of ways that you can do that. And then finally, we're gonna talk about measuring data understanding and whether or not it's really possible to be able to measure people's understanding of the data within the organization. I like to get started by talking about a couple different definitions that I use. I can't really spend a whole lot of time on it here, but it generates a lot of conversation.
The words that are underlined are really the key. And I talk about non-invasive data governance, but I talk about data governance as being really the execution and enforcement of authority over the protection, I should say, over the understanding of the data. I didn't change it from last month's slide. But the same thing holds true. If we're going to improve the understanding of the data, and we're gonna do that through the metadata that we have available to us, or that we need to have available to us, we're gonna really need to be able to execute and enforce authority. A lot of people think that's worded too strongly. Maybe you wanna consider formalizing behavior or something like that, or guiding behavior rather than executing and enforcing authority. And I think of stewardship as really being the formalization of accountability. And so typically if people have a relationship to the data, whether they define, produce, or use data as part of their job, there's accountability that goes with each of those different relationships that you have to data. And so we need to be thinking about that. As I've been known to say, everybody in the organization potentially could be a data steward, or I basically in fact stated that everybody is a data steward, because everybody either defines, produces, or uses data as part of their job. The quick definition of data governance, I don't wanna read through the whole idea. You can see it really stresses some of the points I just raised. The goal is to be transparent, supportive and collaborative wherever we can be. And again, there's typically governance taking place within the organization, but it's very informal and it's very inefficient and ineffective. And if we want to formalize that, we can do that by taking an approach to governance that really formalizes accountability versus handing it to people as feeling like it's something that's brand new to them.
So there's a lot more information, as I said before, about non-invasive data governance. There's the book, there's the learning plan through the university. Please take a look at those if you're interested in learning more. You know what, when it comes down to it, there's really one single asset that we really need to improve on within our organizations if we're gonna improve the understanding of data throughout the organization. And that is metadata. In fact, it's metadata, metadata, metadata. So you get the idea. We need to improve on the metadata that we have available to improve the understanding of the data. So again, if we're gonna use our data in new and innovative ways, one of the key success factors is going to be people's understanding of the data. And a lot of times that focuses on metadata. So you'll see a couple of the topics I'm gonna talk about are related to metadata. And it's a big process of identifying what's the appropriate metadata, how do we get it into the hands of the people that use it? Is it of the right quality? All of those things we're gonna address in the webinar today. And we're gonna leave time for questions at the end as well. And I think I should define what metadata is when we're getting started. And the industry definition that everybody knows is that it's data about data. That's not the definition that I use; the definition that I've been using in webinars and presentations and with clients for many years falls under data about data. It's data that's stored somewhere; it's no longer just a technical tool, it's a business tool, and it improves both technical and business understanding of the data. So that's what we're focusing on here: we're focusing on improving understanding of data. So if you take each of those sections of that definition of what metadata is and you dissect them: it's data, we know that.
So there needs to be governance around the metadata just like there is around the data itself. But it's gotta be stored somewhere. Even if it's drawn on a paper napkin, that may be your IT tool, let's hope not. But there's ways to store the metadata in different tools and it's gonna be used to improve business and technical understanding, as we said, of anything that you're recording information about. So one of the things that we should think about doing is talking to our end users about how they're gonna use the data and what information they could use, or what, if you made it available to them, would make it easier for them to be able to use the data better through that improved understanding of the data. So as I mentioned before, there's lots of different types of metadata to be covered, and we'll go through some of them. We'll talk about maybe some of the considerations or the factors that might come into play as you're selecting which metadata is going to be most helpful for your organization to improve the understanding of the data. So we're gonna talk about how to pick the right metadata, like I said, go through the different tools and what type of metadata resides in those tools. But you ought to be asking yourself the question of which metadata will improve understanding the best? Out of all these different types of metadata, which ones are going to improve the understanding the way that you need to improve understanding within the organization? And which metadata will most likely be used by a greater number of people in the organization? Basically those two questions are very similar. Which ones are gonna improve understanding the best, and how many people can we get that in front of so that they can take advantage of understanding the data better? So as I mentioned, there's lots of different metadata to choose from.
And Shannon mentioned TDAN.com, the data administration newsletter that's been published out there now for 21 years. And actually tomorrow will be the 21st anniversary of the publication. So the article that I published in the most recent issue of TDAN, or should I say actually the one from before today's publishing, was called Questions Metadata Can Answer. So I think you can find that pretty easily on the site. But it goes through a list of different types of metadata and places where metadata is stored to improve value. So that's your database, your data model, your ETL tools, your business rules, ownership. All of these things are repositories of information that, if only we could get it into the hands of the user or the stakeholders in the data, they could make better use of that information. So let's look at the different types of metadata in each one. And again, I don't wanna spend too much time on this slide, although I think a lot of conversation could be had around it. But there's metadata in your database tools. And think about that. So in DB2 or Oracle or Sybase or whatever different databases that you're using, the catalog of information is metadata. It's data about the data within the database tool. So if you're talking about physical metadata, about the columns and the tables and the views and those types of things, you're gonna find that within your database. Is that gonna improve understanding of data? People may tell you that that might be what only the power users need to see. Or there's the data model metadata about the logical definition and the design of the data. Information about how data got from one place to another. So basically the ETL logic and the data movement. Is that gonna help people to understand the data better? And I don't think I can answer that question for you in this webinar, but I can at least kind of point out to you that there are lots of different places that metadata is stored that you can go to.
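As a sketch of the catalog metadata Bob is describing here, the snippet below pulls column names and physical types straight out of a database's own catalog. Python's built-in sqlite3 stands in for DB2, Oracle, or Sybase purely for illustration, and the customer table is invented; in those systems you would query the system catalog or information_schema views instead.

```python
import sqlite3

# SQLite is used only because it is self-contained; real catalogs live in
# system views such as information_schema.columns or SYSCAT.COLUMNS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (account_number TEXT, opened_date TEXT, balance REAL)"
)

def physical_metadata(conn, table):
    """Return (column_name, declared_type) pairs from the database catalog."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]

print(physical_metadata(conn, "customer"))
```

This is the raw physical metadata Bob mentions: useful to power users as-is, but usually needing translation before business users can act on it.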
Data stewardship metadata and application metadata and access metadata. Go to all these tools to get that different information that will improve the understanding, or people's understanding, of the data that they're using for all these innovative purposes. Exactly like the one that Allie talked about earlier, which is the customer engagement data. A lot of organizations are focusing on governing that data and identifying the metadata and the information that they can use to improve people's understanding of that data. So where are some of the places that we can look for that metadata in our organization? So first of all, maybe you're one of those who will say that we don't have any metadata in our organization. Well, maybe you don't have any management of it, but I can pretty much assure you that in different tools that you're using there's metadata that would improve the value of the data and the understanding of the data. So look into your data modeling tools and your data movement tools, your repositories, all of these types of things, all of these places that you're using and that you're implementing across your organization. The information that you enter into those tools and the information that is there to enable people to do these innovative things with the data, they're all repositories and storage places for metadata. So look in the tools, look in the software, and there's a lot of different places that that metadata can exist. Then you might also want to look in places that you've created, so not necessarily the software itself. So remember before I said that it's data that's recorded in IT tools. Well, those tools could be of many different sorts for your organization. It could be spreadsheets, it could be RACI matrices, it could be questionnaires, it could be the tool that I talk about a lot in this webinar series, the Common Data Matrix, which really inventories the data and provides the people aspect of the metadata.
And what's really critical to improving understanding is who owns this data. Sometimes I hate the word owns because it implies things that I would prefer not to imply; maybe who stewards the data, as a replacement for the term owner, potentially. So there's metadata in your organization, I guarantee it. The question is unlocking that metadata and putting it somewhere where people in the organization can use it to gain value from the data. So I just wanted to kind of identify for you different places that the metadata exists. And so we know that in order to improve the understanding of the data, we need to focus on the three aspects that I also talk about quite a bit, which is breaking down the different activities that people can take with data into definition activities, production activities, and usage activities. And so what we ought to do is be thinking about, well, what information in the data definition metadata can we use to improve people's understanding? Or the same would hold true for how data is produced or where the data came from. And certainly, certainly, the no brainer of them all is on data usage when it comes to protecting sensitive data. And so we need to be able to share information about how data can be used. And not only the definition and where the data came from, but we need to enforce rules, execute and enforce authority, so to speak, over the data and make sure that people are following the rules associated with this specific data that we're giving them improved understanding about. So the next subject that I wanted to talk about was different factors that are associated with selecting metadata. And I'm gonna go through each of these, and I think they're very important. But the item that I highlighted in the middle of the slide is really important, and you need to think about that. Don't be afraid to ask people what information they could use to improve their understanding of the data.
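The definition/production/usage breakdown and the Common Data Matrix idea above can be sketched as a small inventory that ties each data element to the people accountable for it. The structure, element names, and role names below are invented for illustration; an actual Common Data Matrix would be richer than this.

```python
# A minimal sketch of the "people aspect of the metadata": for each data
# element, record who defines it, who produces it, and who uses it.
# All names here are illustrative assumptions, not from the talk.
common_data_matrix = {
    "customer_email": {
        "definer": "Marketing Data Steward",
        "producers": ["CRM intake form", "Call center app"],
        "users": ["Campaign team", "Support agents"],
    },
}

def accountability(element, role):
    """Look up who holds a given accountability (definer/producers/users)."""
    entry = common_data_matrix.get(element, {})
    return entry.get(role)

print(accountability("customer_email", "definer"))
```

The point of the sketch is that stewardship becomes answerable questions: for any element, who defined it, who produced it, and who is allowed to use it.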
I mean, you're gonna get a lot of ideas from conversations with these people. You might wanna come to them with a list of the types of metadata that are available and those types of things, but you gotta make sure that you're talking to your stakeholders about what information they need in order to improve their understanding. So what are some of the factors? Well, first is to just think about what metadata, what information you can provide that you think will improve people's understanding, but then also ask them what they think would improve their understanding or what they feel would be the appropriate metadata. You need to look at where that metadata is, and you need to identify which of those types of metadata are available to you and your company. And what's the condition of the metadata? And I'll talk a little bit about, and I joke about, something that I call a cheeseburger definition, which is the account number is a number of an account. It doesn't really tell you much. But you gotta look at the metadata itself and assess its quality before you start to share it with people, because you wanna make sure that if you go through the effort to make the metadata available, you have usable information to provide to people. So the other factors are who's responsible for that metadata and keeping that information up to date, what it takes to get the metadata from where it resides now into the hands of the people that can use it, and what it will take to get people to start to embed data governance, or to embed metadata, into their everyday jobs. And that last question might be one of the most difficult: how do we get people to really use this information that we're providing to them? So the first factor was what metadata do you think is gonna improve their understanding, and then what do they think is going to improve their understanding? So you might wanna consider what types of questions people are asking you.
You know, if they wanna know where the data came from, or if they wanna know what the business term is that's used for some specific type of data, look at the things that people have been asking you and use that as a basis for at least coming in early on with a list of the different types of metadata that are available to you and that you might wanna consider selecting as part of improving understanding of data within the organization. And then go to them and ask them what they think will help to improve their understanding. And my suggestion is, when you go into those conversations, don't necessarily use the word metadata, but really talk to them about what information could be provided to them that will be valuable to them. But go into the meeting with a plan and with a list, and use that. List the different types of metadata that are available and which ones are better than others, things that can be used, things that might be easy, or as a lot of people call it, low-hanging fruit. And then take a look at the effort that's gonna be required to get that metadata into the hands of people. And, you know, again, judge the value that they're going to receive against the effort that it's gonna take to provide them that metadata, and that will really tell you where to focus in an organization. So oftentimes I use a question, and I have a client using it right now, that you would ask the senior executives or senior people within different functions of the organization: in an ideal world, what information could they use to do the things that they want to do, not necessarily that they have the capabilities of doing right now, but in an ideal world, what data would they use? Well, we can ask that same question about what information about the data would be helpful to people as well, again, to improve their understanding. So look at what you think and look at what they think, and kind of combine those to build out your list of metadata.
The second question was, well, what metadata is available and where is that metadata located? So we know that we've got information about the database in the database catalog. It's usually pretty locked away unless you've got a tool that sits on top of it where you can provide that information to people, but most of the business community may not have an interest in learning how to use a database catalog tool. That information is there, though; that metadata is there in your database catalog. In your data modeling tool, all of this information about logical models, maybe conceptual models, maybe physical models, is stored. And then look at your reporting tools; there's a lot of metadata that's associated there. Information security, data access mapping, dictionaries and glossaries, certainly a very popular topic these days. All of these different places have metadata. To be honest with you, the dictionaries and glossaries that a lot of organizations are using are more of the spreadsheet, SharePoint, Word type documents, and those become relatively easier than some of the other metadata to be able to provide. So think about what it would take to extract metadata from your database catalog and put it into the hands of users. There's a lot of different steps that would be required to do that. If you build those types of steps, then you might want to consider building repeatable actions that you can use based on wherever the metadata is located, but have kind of a reusable approach to how we're gonna be able to take advantage of metadata to improve understanding that's locked away in different places within the organization.
So another one of the factors that we have to think about is the condition of the metadata that's available, and I mentioned that before. And truly, the item that's in red at the bottom middle of the screen, that's a very true statement: the condition of the metadata will oftentimes dictate the effort that's gonna be required to make that metadata available to people. So look at your metadata. If we're gonna focus on metadata to improve people's understanding, then we wanna make sure that it's high quality, that it's really going to answer the questions that people have about the data. And there's dimensions of data quality that you've probably seen in a lot of different places. The metadata itself has to be accessible, accurate, complete, consistent, all of those types of things. So if you apply these different characteristics, or these different dimensions of quality, to data within your organization, you can do the same thing with the metadata. Take a look at the metadata in the tools and figure out what's gonna be of value and what's not, and where we need to focus our effort for cleaning up that metadata. Another one of the factors is who's responsible for the quality of the metadata? And I mentioned it kind of briefly before, where when we talk about data governance we also should be talking about the term metadata governance. I think there are people that are writing and talking about it, me being one of those people, but we need to make certain that somebody has responsibility for the metadata. It will not govern itself. So we need to make sure that somebody has the responsibility. So when we talk about who's responsible for the quality of the metadata, as I had broken down earlier, I talked about definers, producers and users. Well, we need to focus on the metadata definers, producers and users.
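One way to apply those quality dimensions to the metadata itself is to automatically flag circular, "cheeseburger" definitions like the one Bob jokes about ("the account number is a number of an account"). This is a hedged sketch of that idea: the stop-word list and the containment rule are simplifying assumptions, not a production-grade quality check.

```python
STOP_WORDS = {"a", "an", "the", "of", "is", "for"}

def is_cheeseburger_definition(term, definition):
    """Flag definitions that merely restate the term without adding meaning."""
    term_words = set(term.lower().replace("_", " ").split())
    # Keep only the words in the definition that carry content.
    def_words = set(definition.lower().split()) - STOP_WORDS
    # If the definition contributes no words beyond the term itself, flag it.
    return def_words <= term_words

print(is_cheeseburger_definition("account number", "the number of an account"))
print(is_cheeseburger_definition(
    "account number",
    "unique identifier assigned when a customer opens a banking relationship",
))
```

A sweep like this over a glossary gives a rough, automatable signal of where the metadata cleanup effort should go before the metadata is published to stakeholders.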
But once we've decided that we're going to make certain metadata available, the ownership of it, and keeping it up to date, is going to be very important to the organization. So you need to figure out who the author is, who's going to write the metadata, who's going to approve it, who's going to keep it up to date, and who will ultimately be the end users of that metadata. I also want to talk about whether there's a formal process associated with metadata. I mentioned before that we want to create repeatable processes to be used within organizations. A lot of those processes are going to focus on the information about the data that helps improve people's understanding: processes around ownership of the metadata, change management of the metadata as I mentioned before, and providing access to the metadata. Are there certain people that should be able to see certain metadata and other people that should not have that same privilege, or is it pretty much an open book for anybody to see? A lot of organizations want to keep that under wraps; there needs to be information security associated with the metadata they're using to improve data understanding. One of the big questions, again, is: what will it take to get the metadata into people's hands? I mentioned that briefly, and I want to talk about it a little bit more. There are basically two different approaches that organizations use to get metadata into people's hands: what they used to call push technologies and pull technologies. I don't know if those are still the same technology names, but the push approach is to push the metadata out to people. That becomes difficult because then you have multiple copies of the metadata and everybody has their own version of the understanding, because if things change, you need to re-push the information out.
So some organizations follow the push approach and some organizations use the pull approach. And what do I mean by the pull approach? Well, if you have an intranet site, or somewhere within your organization that you use for your data governance program or for data management, you might want to store your metadata somewhere accessible from that intranet site and make it available to people. You can push the idea out that there is information available in the repository or on the webpage, but people need to pull it out of wherever you are storing that metadata. So consider the effort to extract the metadata, and the effort to transform the metadata into something that's meaningful. You know, if you have a COBOL PIC X(6) field and people don't understand what that means, it's not very useful to them. If you say that it's six characters in length, that's a lot more meaningful to people. So take a look at the metadata and make sure you understand the effort that's going to be required to transform it into something meaningful to people. Then there's the effort to move the metadata from its source to the stakeholder, and my feeling is that in a lot of organizations it makes sense, if they have a SharePoint site or an intranet site, to link to that metadata, link to that information. So there are a lot of different ways to put that metadata into people's hands; we need to pick the one that makes sense for our organization. The last factor I mentioned was: what will it take to get people in the organization to use metadata as part of their job? Well, one sure way of doing it is to formalize it and build it into process in the organization. Certainly a lot of organizations, especially those that follow an agile method, have heard the term technical debt. There's data debt, too.
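The transformation step above, turning a cryptic PIC clause into plain language, can be sketched as a tiny translation function. This handles only a simplified subset of COBOL picture clauses (real clauses are far richer), so treat it as an illustration of the idea rather than a full parser:

```python
import re

def describe_pic(pic):
    """Translate a simplified COBOL PIC clause into plain language.
    Handles only X(n) and 9(n); anything else is passed through unchanged."""
    m = re.fullmatch(r"PIC\s+([X9])\((\d+)\)", pic.strip())
    if not m:
        return pic  # leave anything we don't recognize untouched
    kind = "characters" if m.group(1) == "X" else "digits"
    return f"{m.group(2)} {kind} in length"

print(describe_pic("PIC X(6)"))   # 6 characters in length
print(describe_pic("PIC 9(10)"))  # 10 digits in length
```

Small, reusable transformations like this are exactly the "repeatable actions" the talk recommends building once and applying wherever the metadata lives.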
There are things that really need to be addressed. We know from some of these efforts that we can't just take whatever information we've collected about the understanding of data and make it available to people; we have to build it into the process. We can include data-focused people in agile meetings, as one example, to make sure that we're building definition, production, and usage into each of the different processes. At the end of the day, you may want to be quick and respond rapidly to market demands and organizational demands, but the understanding of the data is critical. So we need to make the metadata part of the process if we are looking to improve the understanding of data. On making metadata easy to access and use: there are some tools where, right in the query tool, you can right-click and it'll bring up a box that links to information about the data. Things like that. Think about innovative ways to make the metadata, the understanding of the data, available to the people that are using the data as part of their job. So make the metadata valuable, and communicate the availability of the metadata to the stakeholders; that's key. We need to include that in our communications and awareness plan around data governance. I've spoken about communication plans before, but the availability of documentation is critical, and people need to know where it is, how to locate it, and what type of metadata is available to them. Communicate the availability of metadata to anybody who has a need to use it, and teach the stakeholders how to use that information. One of the challenges that a lot of organizations face is that people don't use the metadata if they don't really understand how to access it.
So again, as part of this factor, which is what it will take to get people to use the metadata as part of their job, we need to teach the people that are going to use it how to use it and how to access it, and get them to participate in the definition of the metadata. That goes a long way towards being able to do innovative things with your data: insight, analytics, improving the customer experience, all of those types of things. So let's talk real quickly about the processes associated with governing the data and governing the metadata. I typically break it down into definition, production, and usage, and I challenge audiences all the time, whoever I get in front of, to tell me an action that doesn't fall under one of those three categories. If we're looking to simplify understanding, we want to make sure that we do it in a way that's understandable to other people as well, not just to the data management population, who have our own terminology. There are three actions people can take with data: definition, production, and usage. If we can govern the definition, production, and usage of the data, that goes a long way towards the success of governance in our organization. If you've listened to my webinars before, I have a pet peeve: I don't call things data governance processes, because what I don't want to do is point at data governance and say that's the reason why we're doing something. Rather, we're really talking about governing data processes, and it makes a big difference. Data governance has a hard enough time getting people to understand it and to think it's important to their organization; if we start to call things data governance processes, we're making a big mistake, because we're giving people reason to point at data governance and say, well, we're doing this because governance tells us we need to.
And that's not usually shared in a very positive light. So let's go through the definition, production, and usage of data, starting with the different processes associated with data definition. I have them listed for you here: modeling, requirements, glossary, dictionary, rationalization, resolution. The only one I really want to focus on out of that list is the rationalization process. When we talk about the information we have about the data in the organization, a lot of organizations talk about glossaries, dictionaries, and metadata repositories of physical metadata. Rationalization is the matching and mapping of one to the next. You might have a business term that's related to a whole bunch of data dictionary standard elements, and those elements might reside in different databases within the organization. Those links between things, between the business term and the metadata in the data dictionary, and even within the dictionary itself, where you're linking standards to what things are actually called in different systems, that's all data rationalization. We want to focus on making sure we create those maps, because again, to improve people's understanding of the data, they're not just going to want to know the definition. They're going to want to know the definition and where the data came from, or the last time it was updated, or the degree of confidence they should have in that data. So here's what I suggest when it comes to governing the data definition process, and I talk about a data governance bill of rights from time to time: involve the right people at the right time for the right reason. It's not a bill of stewards' rights; it's a bill of the right things to do within the organization.
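The rationalization mapping described above, a business term fanning out to standard dictionary elements, which in turn map to physical columns in different systems, can be sketched as a simple set of links. All the names here are hypothetical, invented for illustration:

```python
# Business glossary term -> dictionary standard element -> physical columns.
# Every name in this structure is made up for illustration.
rationalization = {
    "Customer Identifier": {
        "CUST_ID": [
            ("crm_db", "customers.cust_id"),
            ("billing_db", "acct.customer_no"),
        ],
    },
}

def physical_locations(term):
    """List every physical column a business term rationalizes to."""
    return [
        location
        for element in rationalization.get(term, {}).values()
        for location in element
    ]

print(physical_locations("Customer Identifier"))
```

Even a toy structure like this makes the point: once the links exist, answering "where does this business term actually live?" is a lookup rather than an investigation.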
And that's really what we're doing when we apply governance to a process: we're making certain that we get the right people involved at the right time for the right reason. Again, provide standards for what must be included in the governed definitions, and don't provide cheeseburger definitions. Defining a cheeseburger as "a burger with cheese," or an account number as "a number for an account," doesn't really help. Providing a true business definition requires a process that involves the business community and gets their feedback on the definition of the data. If the people providing that feedback are business people, we can be sure that we're not creating cheeseburger definitions and that we're improving our business definitions. And why do we do that? Because we're doing it to improve data understanding within the organization. The middle one is the data production processes. Oftentimes data comes into the organization from the outside, and there might be limited metadata that comes along with it; in fact, in a lot of situations that's the case. We need to make certain that as the data is being produced, we're providing people with an improved understanding of what that data means to the organization and what impact it has. When you've got tellers at a bank who are very quickly entering information and not necessarily paying attention to quality, well, that quality comes back to haunt you in a lot of ways. The same thing holds true for customer data. We need to make certain that we're looking at how that data is being produced, where we're getting it from, and the metadata associated with improving people's understanding.
I feel like I've said that a million times during this webinar: what we're really trying to do through the metadata is let people know the definition, the production, and the usage of the data. So validate data production through process. I've seen many organizations create certification processes for data in their data lakes to prevent them from becoming data swamps. Data's not going to make it into the data lake unless it goes through this process, and one of the steps of the process is to record the metadata associated with that specific data. So it turns away from being a swamp where people don't understand the data and there's just a mishmash of it. I understand that's part of the purpose of a lake, but to really make it usable to the people that are users of that information, improving the understanding through metadata goes a great deal of the way towards getting you there. The last of the three actions is data usage, and as I said before, that's a no-brainer. We know that the people that use the data need to understand it: where it came from, how it can be used, how it can't be used, how it can and can't be shared, all of those types of things. It really becomes important to have that information about data usage, and that could be handling rules, data classification, and things like that. So there are a lot of different types of metadata associated with how the data can be used within the organization. We need to focus on the processes associated with data usage and make certain that people have access to the appropriate understanding of that data. Governing these processes will assure quality use, protection of the data, successful integration, quality reporting, and proper use of reports; all of those things are really important for the end-user community.
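A certification gate like the one described, where data only lands in the lake once its metadata is recorded, might be sketched like this. The required metadata fields are assumptions for illustration, not any organization's standard:

```python
# Fields a dataset must document before it is "certified" into the lake.
# The field names are illustrative assumptions.
REQUIRED_METADATA = {"source", "owner", "definition", "ingested_at"}

class DataLake:
    """Toy lake that refuses datasets whose metadata is incomplete."""
    def __init__(self):
        self.certified = {}

    def ingest(self, name, rows, metadata):
        missing = REQUIRED_METADATA - metadata.keys()
        if missing:
            raise ValueError(f"cannot certify {name}: missing {sorted(missing)}")
        self.certified[name] = {"rows": rows, "metadata": metadata}

lake = DataLake()
lake.ingest("orders", [{"id": 1}], {
    "source": "erp", "owner": "finance",
    "definition": "Confirmed customer orders", "ingested_at": "2018-06-01",
})
print(list(lake.certified))  # ['orders']
```

The design choice is that the metadata check is enforced at the only entry point, so a swamp of undocumented data can't accumulate by accident.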
They need to know these things as well as some of the other things that I mentioned. So we want to look at definition, production, and usage processes and apply the appropriate governance to them, to make certain that the information, the metadata we've been talking about in this session today, is available to the end users to improve their understanding. I've got just a couple of minutes left before I turn it back to Shannon, so let me talk quickly about how improved understanding leads to improvements in project ROI. There are really two ways. The first is value from project-related data, and that's what most organizations look at. They look at the data collected within an effort and ask: what's the value of that data? Are we improving our customer base? Are we improving our customer experience, as Allie talked about, really taking the 360-degree view of the data that's required these days to keep you one step ahead of your competition? If you weren't here at the beginning of the webinar, please go back and look at the first five minutes, because the things Allie talked about are very important; she spoke about the value of the efforts taken to build the data the way it's being built. So there's value through the project-related data: increased or better use of the data in that initiative. The second is improvements in ROI associated with project management. A lot of organizations are applying project managers to their governance initiatives, at least when they're getting started, because they want to make sure that they deliver on time, that they're successful, and that the initiative follows the same methodology as other projects within the organization. Now, at some point, data governance turns into a program.
So it's not a question of when it's going to end, because you really don't want the quality of the data to ever end. You don't want the understanding to ever end, or the protection to ever end. When you put data governance in place, you're really putting together a program rather than a project, but we can still focus on project management. By governing the processes associated with project management, we look for results in on-time delivery, maximizing the usage of resources, reduced rework, and, overall, improving the customer experience, which, as Allie mentioned and as I keep restating, is really critical to organizations these days, because they want to know their customer better than their competition does. So the last subject I'm going to address, before we turn it over to Q&A, is: how do we measure data understanding? There are kind of two camps in this regard. Some people don't think it's possible to measure data understanding, or at least they think it's very difficult. They say that people have to want to understand the data in order to utilize the resources available to them, that people only speak up when they have trouble understanding the data, and that they're unwilling to quantify their understanding of the data. But the truth is that it is possible to measure data understanding. The one pointer I would give you is to start with benchmarks, because at some point you're going to need to compare what your future state looks like to your present state. Benchmarks are extremely important, and you want to look at things like the time spent learning the data and learning about the metadata; you can use that as an indicator of data understanding.
When more people are involved in these efforts, the time spent massaging the metadata, locating the metadata, these are ways for you to measure data understanding within your organization. I don't have a whole lot of time to jump through all of these things, but the truth is that there are ways for you to articulate to people in the organization that there's an improved understanding of data across the organization. One way is simply to ask people: how good is your understanding of the data now, on a scale of one to 10 or one to 100? Then ask again in six months, or however long it takes you to get going, and see how it compares to before. There are other measures of data understanding tied to what I call key data resources. You might have a data warehouse, a master data management solution, a data lake, an ERP system. You can look at the percentage of these key data resources that have detailed understanding provided along with them. That could be an overall percentage of the data that's documented; the percentage of the personal databases and spreadsheets, the things that still sit on people's desktops, that are documented; KPIs that are documented; mission-critical data assets that are documented. All of those are other ways of measuring data understanding, along with the percentage of each of the resources we spoke about that have detailed understanding provided along with them.
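The coverage measure described above, the share of key data resources that carry documented understanding, can be sketched in a few lines. The resource names are made up for illustration:

```python
def documentation_coverage(resources):
    """Percentage of key data resources that have documentation attached.
    `resources` maps a resource name to True/False (documented or not)."""
    if not resources:
        return 0.0
    return 100.0 * sum(resources.values()) / len(resources)

# Hypothetical inventory of key data resources and their documentation status.
key_resources = {
    "data_warehouse": True,
    "data_lake": False,
    "mdm_solution": True,
    "erp_system": False,
}
print(documentation_coverage(key_resources))  # 50.0
```

Tracked against a benchmark taken at the start of the program, a number like this gives the before-and-after comparison the talk calls for.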
So that could be the warehouse, the lake, the key information systems or applications, insights and analytics data, the data behind your customer information system, dashboards and reports. Looking at the percentage of that data for which understanding is provided along with it goes a long way towards measuring how people understand data within your organization. So there were really five subjects that I covered relatively quickly here. We talked about the metadata associated with improving the understanding of the data, the factors associated with evaluating metadata, and the processes to govern. We looked, very quickly, at how improved understanding leads to improvements in project ROI, and then we talked about measuring the understanding of data across the organization. With that, I am going to say thank you, and I'm going to turn it back to Shannon to see if we have any questions for today. Bob, thank you so much for another great presentation, so much information, and it paired really nicely with Allie's presentation as well. I will open the Q&A for both Allie and Bob; just enter your questions in the Q&A in the bottom right-hand corner of your screen. And to answer the most commonly asked question: I will send a follow-up email for this presentation by end of day Monday with links to the slides, links to the recording, and anything else requested throughout. So, diving right in here: are the stewards also the owners of all types of metadata?
Allie, do you have something to say on that? I can address it. I think there are different types of stewards; there's not one type of steward across the organization. There are certainly the operational data stewards, who can be definers, producers, or users of the data, and they can be, again, I don't necessarily like the word owner, but they can be the stewards of the metadata associated with defining, producing, and using the data. Oftentimes I see another level of steward, a subject matter expert or a domain steward, and they're there to make certain that the metadata is being collected, that it's of the right quality, and that issues associated with the specific data whose understanding they're improving get resolved. So I wouldn't say that all stewards are the owners of all metadata, but I would say that people's relationship to the data is going to dictate the specific metadata that they focus on. Allie, I don't know if you have anything to add; just a reminder, you've got a mute button to the right of your name there. Nope, I think Bob addressed it. Perfect. So, a lot of questions here on metadata: what's the best approach to manage data lineage metadata? Again, Allie, do you want to comment, or should I? Wait, repeat the question once again? Okay, go ahead and repeat the question again, Shannon, I'm sorry. So, what is the best approach to managing data lineage metadata? Ooh, that's a very good question. Oftentimes that information is stored within a data movement tool or an ETL tool, so being able to identify what metadata is locked away under the covers in your data movement tool is really important if you're going to get that information out and use it to improve people's understanding of where the data came from.
Oftentimes organizations will use spreadsheets and similar tools to map data from one place to another, so that could be considered data lineage metadata in some organizations. And what does it take to manage that? It takes somebody who's responsible for it, a process to make certain that it's being collected and reviewed, and a process to make that information available to people. Lineage metadata is tricky because it's not always straightforward, so it might take, as I mentioned earlier, some manipulation or massaging to make it useful to people. Next question: how would metadata for unstructured data differ from that of structured data? You know what, that's a great question too; lots of great questions. Unstructured data, at least in my definition, is data that's not housed in a database or a structured file. It's documents, recordings, audio, video, all the different ways that you have data coming at you. So unstructured data may not be as straightforward as what I presented in this webinar; I really focused on the data that's in tables and columns. But for unstructured data, you still need to know the author, the format, where it came from, and who's responsible for it. A lot of the same things that we need to govern around structured data are the same things we need to manage around unstructured data; however, the terminology that we use to describe them might be somewhat different.
But if you have a good handle on structured data and its metadata, try to apply that wherever you can to unstructured data as well, because as we all know, unstructured data is increasing in size, volume, and speed, all the things they talk about with big data. Unstructured data certainly needs to be governed the same way, to improve people's understanding of what unstructured data is available to them. And Bob, this is Allie, I'll add on to that. Specifically, when we talk about unstructured data: CouchBase is a class of database in the NoSQL, non-relational sphere, if you will, and this is where we really see the NoSQL characteristics perform super well. To the point I made at the beginning, all these unstructured types of data that customers generate as they interact with businesses are really, really hard to capture. Non-relational systems make it a little bit easier. For example, in CouchBase we store data in JSON documents, so adding new schema on the fly is easy. For those use cases where you're seeing explosions of data, especially at the edge, where you may not have decided at the beginning of building your application to capture that data, well, let's say you decide you want to capture a new type of data; we make it really easy to do that. So to add on to what you said: you should still use the same mechanisms that you've built to capture the governance metadata associated with that data. You can do it the same way; we just make it easier from an unstructured-data perspective.
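As a sketch of what Allie describes, a JSON document in a document store can carry governance metadata alongside the data itself, and a later writer can add a new field with no schema change. The document shape and field names here are my own illustration, not CouchBase conventions:

```python
import json

# A hypothetical document: business data plus an embedded governance block.
doc = {
    "type": "customer_interaction",
    "customer_id": "C-1042",
    "channel": "mobile",
    "_governance": {  # illustrative embedded metadata, not a CouchBase feature
        "definition": "A single customer touchpoint event",
        "steward": "data-office@example.com",
        "classification": "confidential",
    },
}

# "Schema on the fly": a later writer simply adds a field; no ALTER TABLE.
doc["sentiment_score"] = 0.87

print(json.dumps(doc, indent=2))
```

The same governance processes (who defined it, who stewards it, how it may be used) travel with each document, which is one way to keep unstructured or semi-structured data from escaping the governance net.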
And you know what, I think that is a great example for the people listening to the webinar today: when the tools start to consider the fact that a lot of the data you have is unstructured, and they give you the ability to capture that information and make it available to people, we will be able to do things that we couldn't do before. That's really the whole point of what Allie talked about: innovative ways to do insights and analytics around customer data, and around other types of data as well. Being able to make that unstructured data available to people becomes critical, and I think that's the direction the world is really going. Absolutely. Well, we have time for just one more question, so this is perfect: is data governance significantly impacted by the General Data Protection Regulation in the European Union? So, GDPR; it's a hot topic. Well, the answer to that question is a resounding yes. It is very much involved in protecting sensitive data. The webinar I did last month, if you go back and look for it, was about protecting sensitive data using data governance. Yes, if you're an organization that shares data with or has interactions with people in the European Union, and that's a lot of organizations, then it's important to know what rules are associated with protecting that data. I've often said that protecting data is probably one of the easiest ways to demonstrate the value of data governance; in fact, the article in this week's edition of TDAN talks about using data governance to protect sensitive data. Yes, GDPR and the rules associated with it have to be followed. It's not a "do you want to do this?"; it's that you have to do this.
And so it becomes very much the focal point of a lot of organizations' data governance efforts: making certain that the rules associated with protecting the data of people in the European Union, and that's just a start, are being communicated and enforced. Alright. Well, thank you so much to Bob and Allie, and thanks to all of our attendees for being so engaged in everything we do; we just love it. Just a reminder: I will send a follow-up email by end of day Monday with links to the slides and to the recording of this session. Bob, thank you so much, as always. And Allie, thank you for joining us, and thanks to CouchBase for sponsoring this month and helping to make these webinars happen. We really appreciate it. All right, everybody, thank you so much, and I hope you have a great day. Thank you. Thank you, everybody. Thank you, CouchBase.