Hello and welcome. My name is Shannon Kemp and I'm the Executive Editor of Dataversity. We would like to thank you for joining the current installment of the monthly Dataversity webinar series, Real World Data Governance with Bob Seiner. Today Bob will be discussing governing metadata: vocabulary, dictionaries, and data. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of the screen, or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag RWDG, Real World Data Governance. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now let me introduce to you our speaker for today, Bob Seiner. Bob is the President and Principal of KIK Consulting and Educational Services and the publisher of The Data Administration Newsletter, TDAN.com. Bob has been a recipient of the DAMA Professional Award for significant and demonstrable contributions to the data management industry. Bob specializes in non-invasive data governance, data stewardship, and metadata management solutions. And with that, I will give the floor to Bob to get today's webinar started.
Hello and welcome. Thank you very much, Shannon. Thank you, as usual. And as always, thank you everybody for taking time out of your busy schedule to attend this webinar. I've been excited about this webinar ever since we put it on the calendar. I think it's a great topic for people that are interested in data governance, people that are interested in metadata management, and how the two are related.
One of the really neat things about this installment of the webinar series is that I'm going to use a case study from a client of mine, not necessarily sharing exactly what I did with them, but giving you some thoughts on what has been successful for them. Hopefully you will find throughout the webinar that bits and pieces really apply to you and your organization. So we're going to talk about governing metadata. We're going to talk specifically about governing vocabulary, dictionaries, and then the data aspect itself. So let's get started. Before I get started, what I typically like to do is announce the upcoming webinars in this series. In November, we've got another fantastic subject, which is agile data governance: how to apply governance to agile efforts. A lot of people seem to be interested in that subject, so I'll be glad to be talking about that. And in December, we've got another interesting subject. Everybody is starting to hear about this thing called the Internet of Things. We want to talk about the relationship between data governance and the Internet of Things and how it might impact you at some point, wherever you are working at the present moment. The Internet of Things is a hot topic, and we're going to talk about it in terms of data governance in the December webinar. Also, I wanted to share with you something that I'm a little excited about, that Shannon is excited about, and that Dataversity is excited about: the reintroduction of The Data Administration Newsletter, TDAN.com. I've partnered up with Dataversity. Dataversity is producing the site, and I am publishing it. In the October issue, there are lots of great articles about agile data documentation, about the role of the database administrator, about slowly changing dimensions, and about data quality and data excellence.
Please, if you get a moment, take a look at The Data Administration Newsletter at TDAN.com and register to receive notices and updates about the new and refreshed material each month. Also, a quick note about kikconsulting.com, the website for my business. If you want to learn about non-invasive data governance, that website was just recently refurbished, so please take a look at it if you've got a moment. A couple of other real quick plugs. One is for the book on non-invasive data governance that was published a little bit more than a year ago. The other is for some upcoming Dataversity events that I will be involved in. We've got one coming up in just a couple of weeks in Chicago, Illinois, the Enterprise Dataversity 2015 conference, where I'll be talking about a strategic data framework based on data governance and governance best practices. And then in December, I'll be down in Fort Lauderdale speaking at the Data Governance Winter Conference 2015. That is going to be a subject relatively similar to the one we're talking about today, but since we'll have more time, we'll have much more of an opportunity to dig into a little bit more depth on the case study and on what it means to govern metadata within your organization. So I hope to see you at either one or both of those events. They are coming soon, so please put them on your calendars. The abstract for today's session is quite interesting. We're going to talk about governing metadata and we're going to talk about governance metadata. So what metadata in your organization will be extremely helpful to you when it comes to implementing your governance program? I think that governance metadata is really easy to understand when you address it in three easy levels, and I'm going to talk about those three easy levels throughout the webinar today.
The first of those levels is the semantic layer, the vocabulary layer, the business terminology that is used by your business. I'm going to share some examples of how meaningful that is to a couple of the clients that I've worked with, and how important it is to get the organization to speak the same language. One of the things that I have found to be extremely frustrating to management in companies is when they ask questions and get different answers depending on who they ask, because we're not all on the same page. Terms may be understood differently in different parts of the organization. They may be defined differently in different parts of the organization. That semantic layer, that vocabulary layer, is extremely important in helping to develop a data strategy and to improve the understanding and the cooperation around data in the organization. The second level is the business metadata layer, and that's pretty much the data dictionary. In the example that I'm going to share with you, we're going to talk about data dictionaries in relation to data warehousing and business intelligence environments, and about making certain that all of the data, whether or not it sits in something like a master data management solution, is well defined and that we have business descriptions of that data. And then the third layer is the technical metadata layer, which is the actual data layer itself, where we're talking about columns and databases and tables and the relationships between pieces of data, lineage, and those types of things. If we look at the data and the metadata that we want to govern from these three perspectives, it gives us a good overall picture of things that we can start doing immediately within our organizations to manage our data and our information better. And the truth is that these three layers kind of stand alone, but they also have a lot of things that relate them to each other.
So there are certainly relationships between business terms and other business terms, between data dictionary entries and other data dictionary entries, and between data and other data, and there are relationships between the vocabulary and the dictionary level and between the dictionary and the data level. We're going to get into all of those as we proceed through the webinar today. In this webinar I'm going to lay out for you an overall structure, plus a structure for each of those tiers that I just discussed, both standing alone and as they interact with the other layers. I'm going to provide a simple schematic that you can use to draw this out for people within your organization, and hopefully you will feel that this hour is well spent and that we really address some of the topics that you want to talk about when it comes to governing metadata. So we're going to talk about the three-tiered approach to mastering metadata. We're going to talk about a description of metadata at each of those layers. Then, planning for the purchase of governance and metadata tools: I have just been through an episode, or should I say an experience, with an organization that was looking at both governance and metadata tools, so there are some things from that which I'd like to share with you. Then, processes for metadata change management. If we recognize that the three different tiers or levels of metadata are important, we want to make certain that we have change management and policy around when you can change vocabulary, when you can change the dictionary, and when you can actually change the data itself. And last but not least, we'll talk about the role of communications in mastering metadata. So that is our agenda for today. Real quickly, at the beginning of each webinar I like to start with definitions. In case you're new to this webinar series, I at least want to share with you my definition of data governance and my definition of data stewardship.
And since we're talking about metadata itself, let's give metadata a definition that goes beyond just "data about data". So let's start with data governance. Data governance, in my opinion and in my terminology, is the execution and enforcement of authority over the management of data and data-related assets. A lot of people tell me that that definition is worded quite strongly. I like the fact that it's worded that strongly. In fact, at the end of the day, no matter how we define data governance for our organization, we need to make certain that we do execute and enforce authority over the management of data. Some of my clients will take that definition and use it exactly the way that it's written there. Others will take it and tweak it and lighten it up a little bit so it doesn't sound so threatening. But the truth is we need to make certain that we execute and enforce authority over data. We can communicate, we can harmonize, we can orchestrate people and process and data, but at the end of the day, we need to make sure that we enforce that authority over the data. My definition of data stewardship may be a little bit different from other people's definitions of data stewardship. On data stewards, we've done webinars in the past on the fact that everybody in the organization is a data steward, and I told people to get over that and understand that it doesn't necessarily make the program much more complex. The fact is that people in the organization that have a relationship to data, whether it's defining data, producing data, or using data, need to be held formally accountable for how they define, produce, and use data. So I typically talk about data stewardship in terms of formalizing accountability over the management of data and data-related assets.
And last but not least in the definition category, we've got metadata. Most people know metadata to be data about data, but the definition that I use is that metadata is data that's collected in tools of information technology that improves the business and technical understanding of data and data-related assets. So let's break that down real quickly before we jump into the session. Metadata is data. It is data that's collected in tools, whether that tool is a paper napkin that you draw a model on, or a data modeling tool, or a spreadsheet where you store definitions, or a dictionary tool. Those are your tools of information technology, and in order for something to really become metadata it needs to be documented; it needs to be recorded somewhere. So it's really data that's collected in tools of information technology that improves the business and technical understanding of data and data-related assets. Why is that important? Why do we have to have a definition of metadata that's clear to people within the organization? Well, certainly when we're talking about the business terminology of the organization, whether you call it your semantic layer or your vocabulary, we need to make sure that we write these things down and that we make them accessible to people, especially if we want people to use the same terms with their customers or their clients or their providers or their vendors or whoever it is. We want to make sure that people in the organization speak the same language, so let people know and share it. It's very important that the metadata itself is collected somewhere, and we'll talk a little bit about some of the places where we can store it and what we should be considering when we're looking at different tools for that.
We're not going to talk about non-invasive data governance a whole lot in this session today, but there are webinars that I have done on that subject, and I'll be glad to talk to you about it if you have questions about how we can stay non-invasive with data governance. In my opinion, non-invasive data governance is the practice of applying that formal accountability through a non-invasive framework of roles rather than being kind of in your face with data governance. We really want to be less threatening. We want people to understand that their accountability for the management of data is really related to their relationship to the data, whether they define it, or define and produce it, or define and produce and use it. For different data, there is a certain level of accountability that our management will tell us these people have to have. We can enforce authority over the management of that data, but we want to do it in a way that's not threatening, so that people are willing to adopt it and absorb it into the culture of what they do on a day-to-day basis. If we're going to start talking about governing this metadata at these three different layers that I've laid out, and that I'm going to lay out a little bit further, we want to look at it from a high-level architecture and consider a couple of things when we're getting started in governing metadata. One of the things is that we want to be realistic. We want to make sure that we're not going to try to boil the ocean and that we're not going to try to solve every metadata problem that we have all at once. We want to make sure we take an incremental approach and that we learn from the mistakes in what we're doing. We also want to make sure that we don't focus only on the technical side. More and more, as organizations want people to speak the same language and to understand the data, where it comes from, and how it can be used, we want to focus on the business aspect of the metadata.
We actually ask the business people in our environment what type of information they can use, or could use if it was available to them, that would help them to understand the data, get better use out of it, and make better decisions based on it. We want to focus on business metadata. In order to do that, we really need to involve the stakeholders: get them into a room, or speak to them individually. I've had opportunities to try to get metadata requirements out of business people in a meeting. I actually found that it was more effective if we didn't use the term metadata when asking them what their requirements for metadata were. If you ask "what are your metadata requirements?", you're going to get a lot of blank stares. Instead, ask them what information about the data would help them do their job in a more efficient and effective way, without even using the term metadata. Oftentimes it's like a spigot that somebody opens, and all of a sudden they'll start telling you: "I can't find the data. I spend 80% of my time looking for the data and 20% of the time doing what I'm paid to do." So we need to involve the stakeholders as we're defining what it means to have metadata in the organization, and then to govern that metadata as part of the initiative. We also want to look at the scope and the level of complexity when we're starting a metadata governance initiative. What I mean by scope and level of complexity is that you may have a specific problem that you want to solve, but some people may be looking at a long-term vision. The reason why I put the word vision in quotes here is that a vision could be in one person's head or in a steering committee's head, but it works only as long as it is written down and there is agreement on and approval of the direction of the organization.
We want to make certain that we go after that low-hanging fruit, things where we can solve some problems immediately, but we also want to look at our vision of where this fits into our overall culture and our overall business of managing the data as an asset of the organization. We want to look at the immediate needs. We also want to keep our eyes on the long-term vision of the organization and what it means to them to have information about the data and to govern the data as an asset of the organization. We need to look at the resources that we have available to us as well. Oftentimes we can define the world as far as what we want to do with metadata and data governance, but unless you have resources associated with what you're doing, you're going to find that it can be quite difficult to make certain that you're governing the metadata the way that it needs to be governed. If you purchase a tool, do you have resources to support it? If you're going to develop a business vocabulary for your organization, do you have people that have time to pull together all the different business terms and terminology for the organization? So resource requirements are something that we need to look at when we're getting started as well. We also want to look at tools. I'm going to spend a little bit of time at the end of the session today talking about the different tools of the trade. Some organizations like to go after best-of-breed tools. They want to go after their data governance tool and their metadata tool as separate entities. Other organizations want to look at it as more of a combined or single platform. So when we're looking at the architecture around how we're going to govern the three levels of metadata that I've talked about, we want to look at it from a tool perspective too.
Are there templates and things that you have seen in webinars, like other webinars that I've done, or in other organizations, and are there tools available on the market? Certainly one of the things that we want to do is demonstrate value quickly. I'm going to share with you a funny slide on low-hanging fruit here in a minute. The idea is that if we can demonstrate value in such a way that people in the organization are using the metadata, they trust the metadata, and it's helping them to do their job, that's really the end result that we're looking for. If we can do that quickly by using things that are already available to us, like vocabularies or dictionaries that potentially already exist, we want to leverage those things moving forward. I'll share that funny slide with you in just one minute. But real quickly, before we jump into everything, here is the high-level schematic that I mentioned earlier in the slides. It's got those three layers that I talked about. Let's see if my highlighter here works. It's got the vocabulary layer. We talked about that: the true business terminology that the management of our company wants us to use when we're talking to our customers and to our vendors and to our partners and those types of things. We want to make sure that we're looking at the vocabulary, the semantic level, the terminology, and really underscore that it's the business language we use, and that it is important for us all to be on the same page. Then, when somebody asks a question about something specific in the organization, we all have the same understanding of what it is that they're asking. The second layer of the high-level architecture is the data dictionary. The data dictionary is something that's very common in a lot of organizations.
In the case study that I'm going to walk through real quickly, the data dictionary was for all of the data in their data warehouse. For you it could be all of the data in your data lake or all of your big data. Whatever it is, we need to have a definition for the physical data that people are going to access, and we need to have it in business terminology. So that is the data dictionary: the business metadata, the business explanation of the data that is in the systems within our organization. And last but not least is the data layer. We need to make sure that we know where the data resides, as far as in what systems and in what reference code tables. What is the technical metadata that we have about our data? Just think about it. Suppose we went to our organization and gave people the capability to start at a definition, at a vocabulary or business term level, to dig down to see what data is available that helps answer questions about this business term, and then to physically dig down into the data itself and see what rows and columns and views and databases have that information. That would be a very big benefit to business people in our organization as they move forward with, again, governing and managing data as a valued asset. We certainly need to start with governing the metadata before we can get to the point where people have confidence in the data itself. That's the high-level schematic. I'm going to share with you another view of it in a minute that's actually taken from an organization, showing how they blew it out to include the relationships between things at each of the levels, and we're going to talk about those in more detail coming up in a minute.
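A minimal sketch of the drill-down just described, from a business term through the data dictionary to the physical data. It is an illustration only, not the client's implementation or any tool's data model: the class names, the `drill_down` function, and the sample "Provider" data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalColumn:
    """Technical metadata: where a piece of data physically lives."""
    database: str
    table: str
    column: str


@dataclass
class DictionaryEntry:
    """Business metadata: a business description of a physical data element."""
    name: str
    definition: str
    columns: list[PhysicalColumn] = field(default_factory=list)


@dataclass
class BusinessTerm:
    """Vocabulary layer: an agreed business term and its definition."""
    term: str
    definition: str
    dictionary_entries: list[DictionaryEntry] = field(default_factory=list)


def drill_down(term: BusinessTerm) -> list[str]:
    """Walk from a business term through the dictionary to physical locations."""
    locations = []
    for entry in term.dictionary_entries:
        for col in entry.columns:
            locations.append(
                f"{entry.name} -> {col.database}.{col.table}.{col.column}"
            )
    return locations


# Hypothetical example data, invented purely for illustration.
provider = BusinessTerm(
    term="Provider",
    definition="A person or organization that delivers healthcare services.",
    dictionary_entries=[
        DictionaryEntry(
            name="Provider Type Code",
            definition="Code classifying the kind of provider on a claim.",
            columns=[PhysicalColumn("claims_dw", "dim_provider", "provider_type_cd")],
        )
    ],
)

for location in drill_down(provider):
    print(location)
```

The links are stored on the higher layer only to keep the sketch short. A real repository would likely need navigation in both directions, so that a change requested at the data level can be traced back up to the terms it affects.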
Here's the funny slide. I actually saw it because somebody had tweeted it, and it's really funny when you think about it, because a lot of people do talk about going after the low-hanging fruit first. The definition here is that it's a misguided notion that it's possible to achieve something worthwhile with little or no effort. I wouldn't necessarily say it's little or no effort, but we want to make certain that it's technically feasible to achieve what we're setting out to do, and that we're not picking things that are so difficult that it takes a long time for us to be able to demonstrate value. The thing on this slide that really became funny to me was that instead of using the term low-hanging fruit, they suggest we should use low-hanging snake, which is a voracious creature that eats low-hanging fruit and is certain to develop a taste for human flesh if you don't deal with it as a matter of urgency. Well, I wouldn't necessarily take it that far, but I think it's important to think about it in terms of what already exists in the environment. Business terminology may already exist somewhere: if you're in the healthcare business, it might be in your benefits handbook or in your provider handbook; if you're in another insurance industry, it may be in your claims handbook or other things. In most organizations there are business terms that are commonly used across the organization, where we're already speaking the same language. If we can gather a group of people to create a first cut of a vocabulary and we can get it approved at the highest level of the organization, that's a real benefit to the organization. So we'll talk about what we do with that information once we've collected it, and how we relate it to the other metadata that we care most about.
So we're going to walk through each of the three tiers. The first level is the semantic layer, the vocabulary or the business language that we use. What I wanted to do is share some examples of the specific metadata you might include in your business vocabulary, in your semantic layer. That would be the business term itself and a business definition of the term. I've always joked about things that I call cheeseburger definitions. What's a cheeseburger? It's a burger with cheese. What's a patient account? It's the account of a patient. No, we need to have true business definitions, especially for the items at the vocabulary or semantic layer, if we want people in the organization to get to the point that we're all speaking the same language. We may also want to know what domain or subject area the term relates to in the organization. Again, if you're in the healthcare industry, you may want to see all the terms about claims or providers. If you're in manufacturing, you may want to see all the business terms that are associated with raw materials and production. Besides the subject area that the business term is related to, we also want to know who's responsible for that term, who's responsible for validating or even providing the initial definition of what that business term is. Whether that's a business steward or a data owner, or however you define it in your organization, we need to know who the person is that has accountability for making certain that the term is the way that it needs to be. If that means going back to the handbook and validating that it's correct and that we're using the right terminology, that could be necessary in your organization. But if we collect the business term, the definition, the domain, and when it was last updated, that in itself is just a few pieces of metadata that we could collect about the business terminology at the semantic layer. So let's do that same analysis on the business metadata layer, the data dictionary layer.
So really, in this example, the one that I'm going to share with you, it's the data that's in their business intelligence environment. In your organization it might be the data that's in different systems, or your master data environment, or your big data environment, but we need to have that data well defined. Well, what metadata do we need at the business metadata layer? You may need to know: what do we call this piece of data or this element of data? What's the definition of that piece of data within the context of the database or the system that it's being used in? We want to know, again, what domain or subject area that piece of data belongs to. We want to know who the person in the organization is who is the domain steward or the subject matter expert, the one that's making decisions about the definition of that piece of data across the organization rather than within a specific silo. I refer to them as data domain stewards. I've seen other organizations define them as data owners or data custodians or subject matter experts, the people that are truly the subject matter experts for each of the subjects of data across your organization. I've shared in the past that the most difficult piece to fill in when we're defining roles and responsibilities around data governance is that tactical, domain-steward-level position. Let's also talk about the technical metadata, and that's down at the data layer of the three layers that I just talked about. That's the physical data structures, the reference data, and the reference data values. Some of that metadata may include the database name, table name, and column name. These are real physical attributes of the data itself, along with the view names and the reference data. In one organization that I worked with, as they put policy in place to govern the data itself, it was not just about the database structures. If somebody wanted to add or change a value in a piece of reference data, they also needed to go through a review that asks: do we need to add these values?
If we need to add these values, perhaps there are some changes that are also going to take place at the vocabulary layer. Say that we've now got new types of providers within an organization, and because of that we're creating new values in the reference code table. We may find that we're also changing the vocabulary of the organization, because these three new types of providers are something that hasn't been defined in the past. So we want to make sure that when we're looking at changes to the different types of metadata, we look at the entire picture across all three layers rather than just at where the requests for changes are being made. So those are the three tiers: the vocabulary tier, the business metadata tier at the dictionary level, and then the technical metadata that's down at the data level. The schematic that I showed you basically has all three of those components in it. Let's take a look at that in a different way here, and let's look at an example that one organization used to define how they were governing their metadata, governing their governance metadata in fact, around those three layers. In this organization they started with the vocabulary level and the business terms exactly the way that we've described it so far, with data dictionaries for each of the databases that were in their data warehouse, so that we could link between the business terms and the data dictionaries themselves. And then there was the data layer down below. The one thing that's really nice about this diagram in comparison to the other diagram that I shared is that this one also highlights the relationships between the different types of metadata at the different layers, and the relationships within the same layer. So it gives you an overall picture. It's very easy to talk to a diagram like this and say that this is the governance metadata that we need to collect, and that we need to put it somewhere that we can get it into the hands of business people and business end users. So I hope a
diagram like this is helpful to you. It takes that early schematic that I had drawn and fills it out a little bit more, and it shows that there is not only metadata at those layers but that there are also relationships between them, which are really important in helping people to understand where the data came from, what data relates to a business term, and all of those things. I'll lay some of those out for you here in a couple of slides as well. In fact, I'm going to lay it out for you right here. First, the relationships between vocabulary and vocabulary. That was one of the things that was not shown on the previous slide, but in one of the organizations that I worked with recently they decided that they wanted to group their vocabulary business terms, so there was a vocabulary-to-vocabulary relationship. They were actually mapping business terms to other business terms so that people could get a general picture of all of the different terms that were related to a single subject matter within the organization. If people are looking for all information about material, and they want to see all the places in data resources where information about our raw materials is available, we've got to relate the vocabulary to the data dictionary itself. So what we're really doing is relating the term to the business data, but we're relating it to the business data through the metadata. We also want to talk about the relationships within the data dictionary. I don't know about your organization, but in a lot of the organizations that I've worked in, they use the same term in different ways depending on what application that data is stored in. They may call it the same thing, or they actually may call it something completely different. We need to make certain that when we're talking about a certain term in our business metadata layer, we are able to look at the definitions depending on the context of that piece of data. And we also look for a relationship between the data dictionary and the data itself, where once we've
identified where all the pieces of data, or all the pieces of business metadata, are about that specific business term, we may also then want to be able to dig into the data itself and say, show me the physical tables where I can go into the data. It's not just a matter of having these three different layers; it's a matter of having these three different layers but also having the linkages between them, so that when people go into whatever tool it is that you're providing to them, they can navigate between the vocabulary and the dictionary, between the dictionary and the data, between the data and the data. That's all of these things that I've laid out here, not only the three layers but the relationships between them, if that makes sense to you. The tool decision. So when we get to the point where we're governing our metadata, and we need to take a look, and we know that we've got the glossary, excuse me, the vocabulary defined, we know that we've got the dictionary defined (and those two can both be works in progress, so don't expect them to get completed quickly), and we know where our data is and we can pull off data definition language and structures, we start thinking about where we are going to store this information that's going to be useful to people within the organization. So what are some of the things that we should consider when we're looking at these tools to govern our metadata, to govern our vocabulary, dictionaries, and our data? The first thing that we want to do is make sure that we're collecting the metadata that's required to support the governance of those things, and that the organization has developed policy and procedure around changing those things. We need to know what metadata is required to support governance, but we also need to know the workflow required to support governance. So in an instance where somebody would be making a request to change something at the data level that might actually impact something at the vocabulary level, we want to make sure that we know that we can involve people in a way that the
tools on the market these days have excellent workflow features, where we can even incorporate it into our e-mail, so we can send notices to people and approvals to people through our e-mail. So the workflow required to support governance is extremely important. The placement of the tool within our existing environment is another really important consideration. Are we going to try to... you know, the truth is that if you're using a repository tool, it's something like building a data warehouse, where you've got disparate metadata coming from different places. You want to make sure that you know what metadata you need from other tools, so that placement within the environment becomes very important to you as you're defining what the overall metadata picture is, not just the governance metadata picture. Do we need to know transformations and those types of things? Is that metadata available to us within our environment? We need to make certain that there's a commitment to funding to purchase and sustain a product. It's one thing to purchase a product, but that's only part of the total cost of ownership for a product. There's resource cost, there's renewal license cost, and those types of things. Just make sure that there's a commitment if you're starting to look at that. Also, there's information that's available from the industry analyst perspective that can help you in your tool decisions, so whether it's Gartner Group or some of the other groups, it's very important that you have access to information from the industry analysts. There's also the resource requirements: make sure that you have the people that you need in order to implement the tools that you're talking about. So these are just a bunch of considerations for tools for your environment. Tool decision best practices may be: do we have an overall data strategy defined? Do we know where the metadata and the data governance tool is going to fit into the overall architecture? Are we packaging together requirements in such a way that it
makes sense to the vendors when we're putting out an RFP to them? And that's the next step: to define and validate a request for proposal that takes these items from our business community and packages them in such a way that it makes it easy for the vendors to understand when they're responding to the RFP. And then also to identify and target specific tools and vendors. There's a lot of tools on the market, and we need to make sure that we have a plan of attack when we're looking to purchase a product that we will use to govern our metadata at the vocabulary, dictionary, and data layers. Some more considerations: you want to select the tool and make sure you have the funding. You want to engage in a proof of concept before you purchase a product. A lot of the vendors will tell you that they can do everything that you ask, but the only way that we're really going to find out is if we put together a solid plan for our proof of concept and we engage the vendor in the proof of concept before we make the decision to purchase the product. Those give you some considerations when you're putting best practice steps together for analyzing the tools that are in your marketplace. When you get to the point where you create that RFP to get information from the vendors to make a tool selection, I wanted to share with you some metadata categories and some data governance categories that you may want to consider, including information about their software releases. It's really good to know about the data about the data, if that makes sense to you. It's really the metadata that comes from the metamodels. As a former repository administrator in an earlier life, I know that I studied the metamodels. I needed to know what information was going to be stored in the tool, and where it was going to sit in the metamodels in the back end of the tool. We want to make sure that we have extensibility as a category for an RFP.
Is the tool extensible to collect things that don't come out of the box? We want to make sure that we can create self-defined loads, meaning that if we have a list of domains and definitions of what the domains are, we can put it into a spreadsheet and create a self-defined load. Make certain that when you're looking at metadata tools you look at their ability to do self-defined loads. Almost all of them have that capability, but make sure that during your proof of concept you focus on that, because that's going to be one of the easiest ways to get your metadata into the tool. We also know that we want to have a specific role for people that are requesting changes, or for people that have the authorization to change pieces of metadata within the metadata tool. We want to see how well the tool integrates with the different processes and procedures that we have in place in our environment. These are just several metadata categories for an RFP. A couple more of those categories: first of all, can we control changes to the items that are in the repository? Can we control versions? If we have two things that are called the same thing that mean something different, then we want to make sure that we can have different versions of the same thing, or at least the same name, within the product. Change control and versioning is very important. One thing I wanted to throw out there is that if you look in the layers of metadata categories for an RFP, the change control and versioning is important. Communications is important. The end user requirements, training and education: what type of training and education do these vendors provide for their tools, and what are the resources that are required to make these tools operate the way that we want them to within our organization? That's just a handful, or maybe two handfuls, of metadata categories here real quick. There's the usability of the tool as a business glossary.
There's the ability to create custom attributes and relationships between things. For example, if you use a schema similar to the one that I shared of the three different layers, we want to be able to create custom attributes and relationships between things that, again, might not come out of the box. We want to make sure that we can store information about our stewards, however you define steward within your organization. We want to make sure that we can put approval workflows into the tools, and that we can provide for allowable values for information in the tools. These are all different categories of metadata about data governance that we need to collect in the tool and make available to people, again, to have a governed environment specifically around the metadata governance initiative. In some more data governance categories: can we store information about data lineage and impact analysis? Do we have a way of creating a hierarchy of data artifacts, or profiling diverse data sources across the organization? We want to make certain that we can record issues in data logs, track them through to resolution, and report on them in dashboards or status reports to show people that our governance and our metadata initiative is having an impact on the organization. Support for internal audit and data governance metrics: we need to have a way to demonstrate that while we're doing all these things to govern our organization, to govern our data, to govern our metadata, we have a way of measuring the success within the organization. What business value is it bringing to people that said they were spending 80% of their time looking for data? Maybe now they're only spending 40% of their time looking for metadata or looking for data.
The point isn't the fact that you saved them X number of minutes or X percentage of their time; the question is what they can be doing with the time that they have now saved because this metadata is now available to people. There are different categories from a data governance perspective and different categories from a metadata perspective that kind of relate to each other, so when we're putting together an RFP, we want to make sure that we're asking the vendors about their tools and how they handle these kinds of things. One of the last two subjects I want to go over real quickly is, hey, it's one thing that we collect all this information and we put it in the tool, but it's not good if people have the ability to change it or add new entries without going through a process. In some organizations, they'll put together add/change policies around corporate vocabulary, and they'll put together separate add/change policies around changes to the data dictionary or even at the data layer. Take the interesting example that I shared before, where somebody wants to make a change in the data layer that's going to actually impact things on the vocabulary layer. We need to be able to navigate throughout our governance metadata to make sure that when we're doing something as simple as adding a couple of different values to a reference code table, we can reflect that at the business semantic layer in the organization as well. Sometimes it requires that we put a policy in place, to make certain that people understand the policy, and that we have somebody who governs that policy to make sure that it's being followed. Certainly if we can cut off the changes to the tables where people go directly to the DBAs and request changes, and force them to go through a process where we're going to more clearly think out what changes we're making, that is a truly governed environment, as compared to one where people can go directly to the DBAs and say, add this column, I want to call it this, put it here.
It's a lot different than saying, okay, well, do we really need that field? Are we looking at the right place to put that field? Are we looking at the appropriate authority to change that in the database? Are we looking at the right time to get it released to our business terminology? All of those things are important when it comes to governing the metadata at the layers we're talking about here. So what are the steps that we need to follow in order to put a policy in place? Well, most organizations have data policy to start with, so start with what already exists in your organization, and ultimately it needs to be signed off at the highest level of the organization. So these are some really quick steps that you can take to develop policy around protecting things that are in our vocabulary, our dictionary, and in the data itself. In the policy, you may have statements like this: the purpose of a corporate policy is to apply a best practice change management process for additions, changes, and requests to corporate whatever, whether it's vocabulary, dictionary, or data. The policy is put in place to assure auditable processes for the management of these things: their definition, their production, and their usage within your organization. And then the policy will be enforced for every person in the organization that's looking to change one of these things at these different layers within our organization. Now I'm a Pittsburgh guy, I've probably said that before, and the coach of our famous football team, the Steelers, says the standard is the standard. I think he might mean it in a different sense than I'm using it here, but we need to have standards. We need to have policies. We need to have these things in place to improve the business understanding of our data and our metadata in our organization, and that needs to be governed itself as well.
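As a rough illustration of the cross-layer point made above (a change requested at the data layer can ripple up to the dictionary and vocabulary layers), here is a minimal Python sketch. It is not taken from the webinar or from any particular tool. The class names, the field names, the impact_of_column_change helper, and the sample provider-type entries are all hypothetical, chosen only to show how linkages between the three layers let you navigate from a physical column to the business terms it may affect.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BusinessTerm:
    """Vocabulary layer: an agreed business term (hypothetical structure)."""
    name: str
    definition: str
    group: str = ""  # vocabulary-to-vocabulary grouping, e.g. a subject area


@dataclass
class DictionaryEntry:
    """Dictionary layer: business metadata for one data element (hypothetical)."""
    name: str
    definition: str
    term: str  # name of the BusinessTerm this entry is linked to
    columns: List[str] = field(default_factory=list)  # physical "table.column" names


def impact_of_column_change(
    column: str,
    entries: List[DictionaryEntry],
    terms: List[BusinessTerm],
) -> Tuple[List[DictionaryEntry], List[BusinessTerm]]:
    """Walk up from a physical column to the dictionary entries and terms linked to it."""
    hit_entries = [e for e in entries if column in e.columns]
    linked_names = {e.term for e in hit_entries}
    hit_terms = [t for t in terms if t.name in linked_names]
    return hit_entries, hit_terms


# Sample data, invented for illustration only.
terms = [
    BusinessTerm("Provider Type", "Category of healthcare provider", group="Provider"),
    BusinessTerm("Claim Date", "Date a claim was submitted", group="Claims"),
]
entries = [
    DictionaryEntry("provider_type_code", "Code for the provider type",
                    term="Provider Type", columns=["ref_provider_type.code"]),
    DictionaryEntry("claim_date", "Claim submission date",
                    term="Claim Date", columns=["claim.claim_dt"]),
]

# A request to add values to the provider-type reference table:
hit_entries, hit_terms = impact_of_column_change("ref_provider_type.code", entries, terms)
print([e.name for e in hit_entries])  # dictionary entries to review
print([t.name for t in hit_terms])    # business terms to review
```

The design choice here is simply that each layer stores a link to the layer next to it, so an impact check is a lookup rather than a manual hunt. A real metadata tool would hold these links in its own metamodel and would likely support many-to-many relationships.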
So this diagram is a high-level schematic of how that policy might actually look, where a requester makes an add/change request and gives it to a data governance manager, who then bounces it off of the people within their circle of influence, and maybe the circle of influence that's around their circle of influence, to make sure that they truly understand what change is being requested. They compare it to the standard. They share it with the steering committee, where a decision is made on whether the change can take place, and the completed change request is handed back to the requester. If we can make certain that steps like this are being followed for any change that we make to vocabulary, dictionary, or data, that is a heck of a governed environment. And again, it's something that's relatively easy to put together if you can get people to provide vocabulary and dictionary and information about the data itself. So lastly, we're going to talk about the role of communications. We know that we need to understand what the end users say that they need. And I put need in quotes there because they may not really know what they need, so that's where it really comes to the job of the data governance team to help them articulate what they need out of a tool, out of the metadata. We need to understand what they're going to use, what's going to help them in their job now, so there's a difference between what they need and what they're going to use. We're going to talk about the importance of early metadata requirements. I talked about that earlier, where we need to meet with stakeholders to understand what they need. So that's kind of the early metadata requirements. And then there's really the importance of late hand-holding. What I mean by that is, as people start to learn to use the vocabulary, that's a change. People may be used to using the terminology that they use.
So if we can help to hold their hand and change things within request forms so that we make certain that they are actually directed to the vocabulary, then this is the kind of blessed terminology that we're using within the organization. Let's hold their hand for a little bit and make sure that they understand why it's necessary and help them to gain access to this information. So there's a bunch of different metadata and data governance communication types. We need to make them aware that the metadata is available. We want to know when they're going to use it and how they're going to use it. We want to provide training and education to them. We want to make sure that we put change management in place in order to truly govern our metadata environment. I've shared in the past this model of a communication plan, and what I'm talking about is this one aspect of it, which is about governance documentation and how that information needs to be shared with the different people in different roles across the organization, whether it's through the orientation process, the onboarding process, or ongoing processes in your organization. If you'd like to talk to me more about this communication plan, I'll be glad to answer questions you have about it. Here's kind of a quick summary, before I turn it back over to Shannon here, of the things that we talked about today. We talked about the three-tiered approach to mastering metadata and the description of the metadata in each of those layers. We talked a little bit about planning and the purchase of data governance and metadata tools, processes for managing metadata change, and the role of communications in mastering metadata within our environment. And with that, I would like to turn it back over to Shannon to see if we have any questions from the webinar. I'm not sure that she's there. Well, you know what? In the case... I know that she was having some problems in staying logged in to the sound and then on the phone.
So I'm going to take some of the questions. I'm here, Bob. Ah! There she is. Sorry. Having technical... Here we go. We've got lots of questions coming in, some great questions. We may not have time to get to them all today. However, keep the questions coming in, as one of the great things about this particular webinar series is that Bob will write up answers to the questions that we don't have time to get to, and I'll get that out in the follow-up email with links to the slides, links to the recording, and other information requested throughout the webinar. For this particular webinar, this will go out by end of day Monday. So let's get started, because there are just a ton of questions coming in. First question, Bob. Is there a concept of certification of data lineage, technical and business metadata? Well, actually, it's interesting. You use the word certification. So I look at two different things. I look at validation and I look at certification. So I'm not really sure what you mean. I'm going to assume that it's certification. So validation of data lineage would mean that you have actually validated how data gets from one place to the other. Certification would be that we get people to understand how the data moves from one place to another. So let me make that distinction. The validation is, you know, let's make sure that we put the quality checks and the data movement checks in place, and the certification is that people understand where they can get access to that information. So I'm not sure that you're certifying the data movement or the ETL itself, but you're certainly validating it, and then you're certifying people on their ability to use that data. I hope that answers your question. And it came from Dan. Dan, you can certainly let us know if you have additional questions on that. And the next question, Bob, was in regards to your relationship between three tiers diagram. So back several slides.
The question is, can you give examples of that slide, examples of that diagram? I'll be glad to. I have an organization that I'm working with that is looking to add a new business program. It's a healthcare insurance company. And they're looking to add a program that's going to deal with episodes of care and those types of things in the health environment. And the change request that came in was, we want to add these columns to these specific tables in order to make this program function effectively. Well, through further analysis, we found that, number one, these new pieces of data are going to require definitions at the dictionary level, because in this organization, anything that goes into the warehouse has to have a dictionary entry. So we know that just by asking to change some data down at the bottom level, there's a relationship to the dictionary layer. But then we also realized that this program that's being implemented, that is causing all these changes to take place, introduces some new terminology to the organization. And so not only are we changing data at the data level, we're changing data at the data dictionary level. We also need to look at the vocabulary of the organization and make certain that we've got all the appropriate terms associated with that new program that's being put in place. So that's just an example of how it's important to be able to navigate, and to make sure that when we're looking at impact analysis, we look at all three levels and do our due diligence and do our impact analysis rather than having them stand alone. All right, moving on to the next question. I'm trying to get, like I said, as many through as we can today, but keep them coming because we'll get those answers to you no matter what. What are the relationship types for the vocabulary-to-vocabulary relationships? Were they synonymous?
Not necessarily synonymous, but the biggest relationship between things at the vocabulary level is almost like vocabulary groupings. So again, I'll use a healthcare example. All of the terms that may be associated with the subject area of claim may be related to each other within the vocabulary. So if we're going to relate claim number to claim date and claim diagnosis, those are all three things that might be described at the vocabulary level, but they may be grouped together under the category of claims. So that's the type of relationship that I'm talking about in relating business vocabulary things to other business vocabulary things. All right, moving on to the next question. Regarding tools, we always get this question. It's a very popular question, but more specifically in terms of tools at this particular person's company: we have our metadata structured as you propose, business layer and data layer. Bob, have you found a tool that's good for what you're proposing? They've been struggling and have tried several different tools. You know what? It would be great to answer that question offline, but I'll give a quick answer in the webinar here, too. There are a bunch of tools that will handle all three of those layers. They'll handle the workflow. So I'm not going to suggest one tool over another. I just went through an RFP process with a client where we went from four tools that we were looking at, to seven tools, down to five, and then back up to six, and then down to three, until we finally analyzed it down to the one tool that we selected. The fact was that all of those vendors told us that they could do what we were asking for, because we were very clear in the RFP. However, as we did our due diligence and our research of those vendors, we narrowed it down to a few. So I'm not going to point at one vendor or another to say that they'll do it better than the other.
But if you want to have that conversation with me, please look me up and contact me and I'll be glad to answer those questions for you. We try to stay cool and elastic here as much as possible. Sorry, I didn't mean to interrupt you there. It's just kind of an extension and a different angle on the tool question. At this particular person's company, our enterprise architects have selected two different metadata tools, one for data and one for information. When it comes to the business terms and definitions, this person doesn't see the distinction between data and information. What are your thoughts on that? I have a very clear distinction between data and information. Okay, I say that if you take data, if you take something like one, two, three, four, five, and that's a piece of data, and you add some context to it, and you say that that context of one, two, three, four, five is a zip code or it's a dollar amount and what specific type of dollar amount it is, it only becomes information when the metadata is added to the data. So the definition of the context, everything about the data added to the data, that's when it really has meaning to the organization. So that's the relationship that I see between data and information. And I don't really necessarily understand why you would want to store that in different tools. I think people are going to want to be able to navigate through that within their own company. All right, well, I'm afraid that we're kind of out of time for any more questions, but like I said, one of the most popular, one of the best things about this webinar series is I will get those questions that we have to Bob and Bob will write out the answers to those. And that will also come out in the follow-up email, which will go out by end of day Monday with links to the slides, the recording of the session, as well as additional information from Bob. Bob, thank you so much for this great presentation today. 
And thank you so much to our attendees for being so engaged with everything we do and taking your time to attend our webinars. Bob, anything else we're interested in? Thanks very much, everybody. Look forward to seeing you next month. Thank you, Bob. Everyone have a great day. Take care now. Bye. Bye.