Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of Dataversity. Hope everyone is staying well and safe out there, and we'd like to thank you for joining the current installment of the monthly Dataversity webinar series, Real-World Data Governance with Bob Seiner. Today, Bob will be discussing data governance and three levels of metadata management. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom right-hand corner of your screen for that feature. And for questions, we will be collecting them by the Q&A in the bottom right-hand corner of your screen or via Twitter using hashtag RWDG. And as always, we will send a follow-up email within two business days, containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now let me introduce to you our speaker for the series, Bob Seiner. Bob is the President and Principal of KIK Consulting and Educational Services and the Publisher of the Data Administration Newsletter, tdan.com. Bob has been a recipient of the DAMA Professional Award for significant and demonstrable contributions to the data management industry. Bob specializes in non-invasive data governance, data stewardship and metadata management solutions. And with that, I will give the floor to Bob to get today's webinar started. Bob, hello and welcome.
Hi, Shannon. Hi, everybody. I'm going to echo what Shannon said: I hope everybody is staying safe and well where you are. And I greatly appreciate your spending the next hour with us. We've got a great topic for today. We're going to be talking about data governance and the three levels of metadata management, as Shannon has told us already.
So to get us started today, let me see, there we go. Before I get started, what I'd like to do is share a couple of things that I typically share with you before the webinar starts. First of all, thank you for attending the Real World Data Governance webinar series. It's on the third Thursday of every month. And next month, we're going to be talking about using tools to advance your data governance program. Also, I talk a lot about non-invasive data governance. And if you're interested in learning more about non-invasive data governance, you can find the book called Non-Invasive Data Governance: The Path of Least Resistance and Greatest Success. You can find it at your favorite bookseller. I'll be speaking at a couple of Dataversity events that are coming up soon. Enterprise Data World is taking place in October this year; next month, I'll be speaking at that event. And I'll also be speaking at DGIQ, the Data Governance and Information Quality Conference, which will be taking place in December. These are virtual events. I also want to let you know, and I think I mentioned in the last webinar that it was going to be coming out soon, that I now have a new learning plan that's available through the Dataversity Training Center. It is called Glossary Dictionaries and Catalogs. And it'll be particularly of interest to you because it's a big topic of the conversation in today's webinar. We'll be talking about business glossaries and dictionaries and catalogs as being those three levels of metadata management that we're going to be referring to throughout the webinar. The Data Administration Newsletter: if you're not familiar with it, please go out to tdan.com. It's free. There's lots of great information on there. I've been publishing this publication since 1997. And around KIK Consulting and Educational Services, I call that the home of non-invasive data governance. And a couple of things to say about that.
If you notice one thing different about this slide or the slides, I've got a new logo for KIK Consulting. And the fact is that next week we'll be unveiling a new version of the KIK Consulting and Educational Services website. So please take a look at kikconsulting.com next week and you'll learn a lot more about non-invasive data governance, non-invasive metadata governance, and you just might find some things out there about glossaries, dictionaries, and data catalogs as well. So in today's webinar, I'm going to share these topics with you. I'm going to talk about these things. First, I'm going to talk about the three levels of metadata and how they differ from each other and how there may be some similarities between them as well. We'll talk about the sources of the metadata, the places that you can go to get the metadata to complete or to fulfill each of those different levels of metadata. I'll be talking about the linkage between the levels. So it's important to link up your glossaries and your dictionaries and your data catalogs. And I'll share examples of that with you. I'll talk about processes to govern all levels of metadata and then talk about institutionalizing policy to make certain that you have quality metadata at all of these levels and actually all of the levels of metadata that you'll be recording in your repositories or in your data catalogs or in your tools. Another object that I will be presenting today for the first time is a new semantic framework associated with these three levels and it may help you to put your organization in a better position to focus on these three different levels of metadata. I'll be sharing that with you coming up here in a couple of minutes. So the first thing that I want to talk about today are the three levels of metadata and how they are different from each other and how they are the same. And I want to talk a little bit about the advantages of using the different levels. 
A common term that's used happens to be common business language and a lot of organizations are talking about getting the terminology and the semantics to be consistent across the organizations. I'll talk about common business language. I'll share with you that semantic framework that I was just mentioning and talk about the three levels that are associated with it. We'll talk about the importance of each of the different levels and we'll talk about associating the levels with the different tools. And the reason why you see the number three and then the number four and then the number three again on the screen is that I've gone back and forth. Do I want to add a fourth level of metadata management and I think that we can keep it to the three and I'll relate the fourth to it when I share the framework with you. So what are some of the reasons why we want to have different levels of metadata? Well, first thing that we want to do is simplify the understanding for the organization. All of us that have focused on metadata management or who need to focus on metadata management understand that there's a lot of different types of metadata that are out there. And if we kind of try to present that picture to people within the business or even in the technical community, there's a lot of confusion and they think that it's very complex. So if we can simplify the understanding for people across the organization, we certainly want to do that. And not only do we want to do that, but we want to imitate what the traditional viewpoint is of the most important metadata that we can collect to assist us in moving forward with governing data across our organization. So we certainly want to make it something that the people in the business areas can refer to and that it's relatable to them and what they do. We want to keep in mind that we're capturing this metadata for a purpose. We're capturing it to make it available to people in the organization. 
So we need to think about or keep in mind the idea of navigation. How are people going to come into the metadata? How are they going to navigate from one level to the other? How are they going to find the information about the data that they need to be helpful and successful in completing their job function, creating reports and whatever it is that they have to do? And you'll notice that when I show the framework that there's a specific column set up for standards. And then there's other columns that are set up for resources. And I'll talk about that in a little bit, but when it comes to business terminology, we want a standard. When it comes to the attributes and the elements of data that we care about in our organization, we want to have standards. But the fact is that the different resources, the reports and the systems that we have throughout our environment, they might not call it the same thing. So we need to be able to relate these standard terms and what they're called. And we're going to need to relate them to the aliases for these things across the organization. It's called this in one system. It's called that in another system. It shows up as something completely different in a report. So we want to keep that in mind when we are building out the three levels of metadata. So I want to kind of begin by talking about common business language and how it becomes so important for organizations to get on the same page and call things the same thing. Is it a customer? Is it a client? Is it a prospect? When does a prospect become a client? All of those things are inherent in the definition of these common business language that we use in our organizations to improve the value and the understanding of data across the organization. So I'm sharing with you a definition that actually came from a very old page on the data administration newsletter, a definition of what common business language is. And I'll just kind of walk through it real quickly with you. 
It's a common business language that enables companies to drive consistent, well-understood metrics across the organization, helping to reduce the time and budget required to analyze the information and actually turn it into something that is usable within the organization. So we can take that definition and break it into pieces. Driving to consistency: I think that's what we mean by the word common in common business language. We want people to call things the same thing as much as we can, so that when people ask questions, they're getting answers that are similar, based on a common understanding across the organization. So driving to consistency, that's one of the main reasons why organizations focus on having a common business language. And it also helps us when we're performing metrics, or we're looking to measure the effectiveness of governance or whatever we're measuring within the organization: we want to be talking about the same thing. So common business language is really inherent in this whole framework. We need to make certain that we have a standard and that we know what it is associated with out there in the reality of our databases and systems and on reports. And we want it to be consistent across the organization. The idea is to reduce the time and the budget that's required to perform the analysis on the data. People talk about the 80-20 rule, how people spend 80% of their time manipulating the data and getting the data the way that they need it in order to do the analysis that they are so well versed in. So we want to reduce that time, and therefore the cost, of getting that data to the point where people can analyze it and turn that information into intelligence within the organization. So what I'm going to share with you is the semantic framework and the three levels. Well, basically, the three levels start with the business terminology.
Again, going back to the common business language, the business attribute, which is kind of a breakdown of the different terminology. What information do we have about a customer? What information do we have about a product? Or what groups of information do we have? And then what specific elements of information and data do we have that pertain to that attribute group and as it pertains to the term? So you'll see that the linkage between these things and the relationships between the levels will become vital in making this information helpful to the people in the organization that we're hoping will get the most value out of it. So you ready? I was going to have a recording of a drum roll here, but here it is. Here is the new framework that I want to speak to. And you'll see it's rather simple in the way that it's set up. You've got the far left column, you've got the two columns to the right, and the first column specifically is about the standards. What are we going to call something? What is the subject area of information? What is the terminology that we're using across the organization? What are the attributes or what is an attribute of that specific business term? And then how is that represented in the systems themselves? So if you see the different colors associated with each of these different blocks, we've got the data tool. So we're going to talk about data domains here in a minute, but then we've got the tool of the glossary and the dictionary and the catalog. And we've got standards for what we want to call these things. How do we want to name data elements across the organization as an example? But then we've got this information that already resides in different resources across the organization. 
So you've got a data dictionary for system one and you've got reports that are associated with system one and they might not be called the same thing that is the standard, but hey, wouldn't it be great if business people could navigate from the subject area into the glossary, into the dictionary, and then into the catalog? And you've got multiple systems that potentially house the same data or similarly at least conceptual data across the organization. Things that are where we have silos of information that have different names and may have some different meanings or different values across the organization. So this is the simple framework that I wanted to share with you and it becomes really valuable to be able to plug in examples of what subject areas of data are we talking about? What are the business terms? What are the attributes? And what are the elements that are associated with those attributes? And that really plays well into the whole concept of the business glossary and the data dictionary and then the data catalog. The fact is the data catalog can be a lot bigger than just the specifics about data within a specific data resource. But for this example, I put catalog there as being kind of the physical representation of the data that we have as attributes and then we have as terminology and within a subject area within our organization. So if you look just at the left side of that diagram, and that's what I'm showing on the screen now, is that it's very important and it requires that we have governance in place for us to be able to create standards for these things. I have a client that I'm working with now that is struggling between market and segment and vertical. They have all these different names for what they're calling the different potential clients that they have out there in the world, basically. And when people want to know, well, what are our verticals? What are our markets? What are the segments that we are focusing on? 
It's important to, again, come up with a common business language and capture that information. And we'll just use as an example here on the screen that we'll talk about market and then the market profile. What are the things that we need to know about the market? And then what are the physical pieces of data that we have that allude to the market and the market profile for the organization? So that's just the far left-hand side of the framework. And if we just take away the top level, that's the domain. That's the subject area. We don't necessarily need a level of metadata for that. So even though I say it was three and then I thought about, well, do we want to add domain as the fourth? No, we'll keep it at the three. But I thought it was worthwhile to share that domain or that subject area within the framework that I'm sharing with you now. So when we look at the bottom part of the framework, that's where the business attributes. We would have something called market code that has a specific size. And that size perhaps could pertain to the business element. It would be up to you as to how you would represent that within the standard. But then we've got the different resources in the organization where this data is already residing on reports within databases. So I call it a resource attribute. I call it a resource element. But the fact is that that could be a report. That could be a database. That could be an information system. And oftentimes the data dictionaries really are outlining, well, what data resides within the data warehouse or within a specific application. Very rarely, at least from my experience, have I seen a single data dictionary for the entire organization. Typically a data dictionary is very resource specific. And then the catalog of information would also be specific to the database or the report that you are focusing on. And you'll see that there's linkages between these things. 
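As a rough illustration of the framework described so far, here is a minimal Python sketch of the three levels, with a standard name on the left and resource-specific names on the right. The class names, the `drill_down` helper, the "Customer" subject area, and the physical names such as `MKT_CD` are all hypothetical; only market, market profile, and market code come from the example on the slide.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """Catalog level: a standard element name plus resource-specific names."""
    standard_name: str
    # Maps a resource (system, database, report) to the name used there.
    resource_names: dict[str, str] = field(default_factory=dict)


@dataclass
class Attribute:
    """Dictionary level: a standard attribute name and its elements."""
    standard_name: str
    resource_names: dict[str, str] = field(default_factory=dict)
    elements: list[Element] = field(default_factory=list)


@dataclass
class Term:
    """Glossary level: a business term filed under a subject area (domain)."""
    name: str
    domain: str
    attributes: list[Attribute] = field(default_factory=list)


def drill_down(term: Term, resource: str) -> list[tuple[str, str, str]]:
    """Navigate term -> attribute -> element and return, for one resource,
    (standard attribute, standard element, name used in that resource)."""
    found = []
    for attribute in term.attributes:
        for element in attribute.elements:
            physical = element.resource_names.get(resource)
            if physical is not None:
                found.append(
                    (attribute.standard_name, element.standard_name, physical)
                )
    return found


# Hypothetical content: the physical names below are made up for illustration.
market = Term(
    name="Market",
    domain="Customer",
    attributes=[
        Attribute(
            standard_name="Market Profile",
            resource_names={"System 1": "MKT_PROFILE"},
            elements=[
                Element(
                    standard_name="Market Code",
                    resource_names={"System 1": "MKT_CD", "Report A": "Segment"},
                )
            ],
        )
    ],
)

print(drill_down(market, "System 1"))
```

The design choice worth noting is that the standard name and the resource-specific names live on the same object, so an alias is simply another entry in `resource_names` rather than a separate record to keep in sync.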
And we'll talk a little bit more about those in a couple of minutes here. So let's talk about the importance of each of these different levels. So I kind of grayed out the subject area. These are the buckets of data. These are the things that the subject areas, the subject matters, the domains of data that are used. Again, you can just refer to them as the different sets of data, the buckets of data around the organization. And typically that's where the business community is going to start. What information do we have about this? Where, what's the standard? What do we call it in this specific system? So when I'm running a report, I know where to get to that information. So the first level, that top piece that's not even truly a level, the subject area is important because that's where navigation starts in the organization. And then it comes to the terminology. And I spoke a little bit about common business language and terminology a little bit earlier in the webinar. So we need to know that we need to have that common business language in those business terms. And I'll talk in a minute about where are we going to get these terms from? What is a process that we might put in place to assure that this terminology is appropriate for the organization? And then there's the attributes themselves. Those are going to be specific to a business resource, like a system, like a report. If I'm going to pull up this report, which column do I look for on the report? Or if it's a graphical database, where do I look on the graphical database to find the information about this specific attribute? And then the data elements are, again, data resource specific. It could be the information of what is that column called? What is that field called within a database or a table or within a report? So I know there's four levels here. 
And the domain isn't really a level, but it's truly important if we're going to give people the ability to be able to navigate down through the three levels of metadata to get to the information that they need to help them to locate and understand the most appropriate data that's available to them to help them to perform their job function. So let's associate these levels with the tools. Again, the data domains, they're typically included in all three, the glossary, the dictionary, and the catalog. But the terminology, at least in the organizations that I've had the privilege of working with, that's the business glossary. That's the common business language in the terminology is housed in the business glossary. And in the data dictionary, that's where we're going to store that specific information about that specific resource, whether it's a system or a report or a database or whatever it is that is a set of information that's being utilized by the business. And then the data catalog, being the third of those three tools, is the data resource specific. And that's the one, and we'll talk a little bit about it in a minute, where if we can, we'd like to automate the ability for the connectors between the resources, the database catalogs, and the things that we have in our organization, and the systems and the models, and relate those and actually ingest that information into the data catalog. So those three levels actually perfectly align with the whole concept of a business glossary, common business language, the dictionary, about a specific resource, and then the catalog. What is the physical representation of that information across the organization? So now that I've outlined the three different levels for you, I want to talk a little bit about where can we go to get this information to populate our glossary and our dictionary and our catalog. 
So the truth is, there might already be resources within your organization that you can go to, to pull the metadata out of these sources and load them into your glossaries, dictionaries, and catalogs, or maybe those don't exist. So I had a recent conversation with a client that was quite interesting, where they told me they don't have any metadata, and they happened to be a very big DB2 Oracle SQL Server shop, and they have a lot of data in these different types of databases, and they don't necessarily, or they didn't recognize, that the database catalogs for these tools are metadata. So I had to argue with them, or at least disagree lightly with them, and tell them that, yeah, you do have metadata. The problem is that it's inherent in these tools, and if we want to make this information available to people so they can get the most value out of their data, we need to get that metadata out of those tools, at least the specific metadata that's going to add value for the business community. So don't say that there's no sources. I'm going to share with you a couple different sources for each of these different types of tools that you might be considering. Again, the glossary, the dictionary, and the catalog. And I'll spend a minute talking about preparing to use these tools. Well, what can we do in the meantime while we're waiting to see, or we're trying to get budget to bring in these tools, or get the opportunity to leverage some of the tools that we have within our organizations? So we want to make certain that we're thinking in advance. We're being consistent in how we're capturing this information so that we actually have it prepared so that when we bring in a tool, that it will be easy to ingest that information into the tool. So like I said, there could be industry related terms. The most important thing is that you begin a list. Begin a list of the subject areas within your organization. 
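To make the point about database catalogs concrete, here is a small sketch of pulling table and column metadata out of a database's own catalog. It uses SQLite from the Python standard library purely as a stand-in; DB2, Oracle, and SQL Server each expose their catalogs through their own system views, which are not shown here. The `list_columns` function and the `market` table are hypothetical.

```python
import sqlite3


def list_columns(conn: sqlite3.Connection) -> list[dict]:
    """Read table and column metadata from the database's built-in catalog."""
    rows = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, column, col_type, notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            rows.append(
                {
                    "table": table,
                    "column": column,
                    "type": col_type,
                    "nullable": not notnull,
                }
            )
    return rows


# Hypothetical table, created only so there is something to read back.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE market (mkt_cd VARCHAR(10) NOT NULL, mkt_nm VARCHAR(100))"
)

for row in list_columns(conn):
    print(row)
```

Even this small extract already answers questions such as what a field is physically called and how large it is, which is the kind of metadata that tends to stay locked inside the database tools.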
Begin a list of the terms and build a list of the terminology that's important. I'll talk about where you might get that list from in a second. Capturing the list in an ingestible format, that's what I was talking about: put it into some type of format that you're going to be able to bring into your repository or into your catalog, dictionary, or glossary tool. And utilize your existing tools. Spreadsheets are a very simple way to capture the metadata in a consistent manner, so you can prepare yourself to load this information into your tool, any of the three tools that I just mentioned, or into your database catalog or whatever it is that you're going to be acquiring or utilizing within your organization to help make this information available to people. So let's talk about some sources for the business glossary. And again, that's the terminology. These are what we are calling things within the organization. It is the common business language, and there are many places that you can go for this information. In fact, I had a client recently that built a team of pretty high-level people, because this was very important to them, to go out and build out the specific business terms that they use. And where were some of the places that they went to get that terminology? Well, they went to industry standards. One thing that's not shared on the screen is industry models, data models that are specific to different types of business. They have terminology in them. It may not be the terminology that you presently use, but it might be a place for you to locate terminology. There's employee reference manuals, customer reference guides, user manuals, and handbooks.
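For the earlier point about capturing the list in an ingestible format, a minimal sketch might look like the following. It writes glossary terms to CSV with one fixed set of columns and rejects rows that leave a column empty. The column names, the `write_glossary` function, and the sample row are assumptions for illustration, not a format required by any particular tool.

```python
import csv
import io

# Hypothetical column set; agree on one and use it for every list you keep.
GLOSSARY_COLUMNS = ["subject_area", "term", "definition", "source", "steward"]


def write_glossary(rows, out):
    """Write glossary rows as CSV, refusing rows with missing columns."""
    writer = csv.DictWriter(out, fieldnames=GLOSSARY_COLUMNS)
    writer.writeheader()
    for row in rows:
        missing = [c for c in GLOSSARY_COLUMNS if not row.get(c)]
        if missing:
            raise ValueError(f"term {row.get('term')!r} is missing {missing}")
        writer.writerow(row)


# Placeholder content, not a real definition.
sample = [
    {
        "subject_area": "Customer",
        "term": "Market",
        "definition": "A group of potential clients",
        "source": "brainstorming session",
        "steward": "TBD",
    }
]

buffer = io.StringIO()
write_glossary(sample, buffer)
print(buffer.getvalue())
```

Checking for missing columns at capture time is what keeps a later load script simple, since every list arrives in the same shape.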
You can go to your organization's website and see what terminology we are using that we want to share, so that not only our internal customers, the people that we work with within the organization, the data scientists and the analysts, are on the same page, but the business community and the client community are speaking the same language as well. You know, one of the things that I'm finding really important right now is self-service business intelligence that organizations extend to their customers, where they can go and find the data that they need to make a proper order. So we might want to be consistent in how we're using that terminology, consistent with what we're calling it on those things that are the front end, or the face, of our organization to people. And you can bring people together and have brainstorming sessions or brain dump sessions and start to catalog what the specific objects of information are. So it's not that difficult to start identifying what some of the key business terms are. And again, you're not necessarily going to focus on every single term within the organization, but you're going to want to focus first on the specific terminology that is important to your organization, and build repeatable processes and repeatable methods for taking that information and making it available through metadata in one of these three different levels, in one of these three different tools that I'm talking about today. So where can we go to get information for the data dictionary? You could have native documentation, things that you're already collecting within spreadsheets and standards, things that you have already recorded over the years and hopefully have kept up to date. You could also have the data documentation in different native platforms, in Access databases for example, within your organization.
Certainly the vendors and the systems: typically, when you're acquiring a system or a module from a vendor, they're going to have some documentation as to what the data is and perhaps even what the data means. Now the question is, is that going to be consistent with how you're using that information? So it might require some changes on your part, but it is a source of information that you could use to populate a data dictionary. I mentioned spreadsheets and lists of standards of terms. Data models are another place. When I got started in the field of metadata management, that's what we focused on first: the conceptual information, which was more along the lines of the terminology; the logical data models, which were more along the lines of the attributes of the information; and then the physical models, which were actually the databases and how the databases were constructed across the organization. So all of these places listed on the screen now are potential sources of information to populate data dictionaries. And it's very important to be very consistent in how you collect the information in these dictionaries, as one of my present clients is finding out as they try to ingest information into a bunch of different tools to see which tool is most appropriate for them. They want to be consistent in how they're capturing that information so they don't have to redefine the process of getting this information into the tool of choice. You know, we want to be able to set up a kind of repeatable script within the organization to take the metadata from however we're capturing that information presently and ingest it into the tool. And then there's the database catalog. Again, I talk about native data documentation that you might have about the specific physical data within your organization. The database catalogs are a great place to go. Certainly, having a connector between the tool that you're using and the database catalog
makes a lot of sense, because that's where you're going to go to find out what the physical name of the table is and what the physical name of the data element is. And spreadsheets and data models, where the physical data models are of particular interest when it comes to the data catalog sources for your organization. And again, we're talking about business and resource-specific elements of information across the organization. And I can't emphasize this enough. I know that the framework is important and it's a big part of this session, but one thing that I'd really like you to take away from this session is that if you don't have a tool right now, or if you're looking at tools, or you need to budget for them and so it's a little bit down the road as to when you're going to be able to acquire a tool, you can start right now by governing that metadata. So you want to capture it consistently. I talked about capturing it consistently within a spreadsheet. As I said, a client right now has different data dictionaries for different products that they have within their organization, and they're doing their darnedest to keep those things consistent, so that when they write a script to ingest this information into their tool, they can use that format and they don't have to rewrite the script for each source of metadata that they're going to collect and ingest into their tool. So you want to manage, or at least consider managing, and storing this information that you're collecting centrally, so you have one resource, one place where people can go to get the different data dictionaries that we have about different aspects of our data warehouse or our data lake or different products and systems that we have in the organization. The biggest takeaway is: govern the metadata early, so that you're going to be prepared when you get to that point where you bring a tool into your organization. So I talked about the business terms, I talked
about the business attributes, I talked about the business elements, but these things being connected together is extremely important as well. So we're going to spend a minute talking about the importance of the linkage. We'll talk about the different types of links that take place even just in the semantic model, and I'm going to bring the semantic model back up here in a minute to highlight the parts of the model that make the connections between the subject area and the terms, the terms and the attributes, and the attributes and the elements. We'll talk about each of those real quickly, but first I want to talk about the importance of having the linkage. It makes it possible for people in the organization to drill down from a search term that they use and navigate through the metadata to get to the information that they need to help them perform their job function. And that's one of the basic capabilities that you'll find in any tool that is claiming to be a glossary or a dictionary or a catalog: the ability to navigate through the tool. These linkages are those navigations. The relationships between different things within the products are going to be a key piece of any proof of concept or pilot of these tools. So we want to make certain that we have the ability to add objects and link them to things like, for example, data stewards and different levels of data stewards and what they are responsible for, or owners, whatever you call them. You want to be able to create those objects within the tools and connect them to the things that have meaning within the organization. So there's the importance of the linkage between the subject areas and the glossary terms, between the terms and the attributes, and between the attributes and the elements. And not only that, but as I mentioned before, we have aliases for things. We call something as a standard, this is what we're going to call it, but we actually physically
call them something different, and we need to be able to say, okay, we want to go with this standard moving forward, but we also want to see how this information is already represented within the tools. So we want links between the terms and between the attributes, and I've even had a client recently who has built relationships in their catalog between the domains and between the terms. So we want to make certain that we're focusing on those links and that we have given the end users the ability to navigate across these links.
So the first one I want to talk about is the relationship between the glossary and the dictionary. The most important thing is that it enables search, and it enables people to find the information. They can drill down from the terms, they can digest different pieces of the business and find out what information and what data is available about that specific term. And also, if you follow a framework like the one that I shared with you, you might have standard names for the terms, and you want to connect those as well to what these things are called within the different resources that I talked about earlier across the organization.
And then there are the relationships between the dictionary objects themselves: the standard attribute name, what are we calling it in this system, how is it represented on this report. Oftentimes these links between the dictionary objects are some of the most important for our organization. What is an alias? Why are we getting different results from reports when we ask the same question of multiple people across the organization? Well, maybe they're going to data that they think is the same, but it's actually different. And how are we going to know that? Well, we call it one thing in one system, we call it something else in another system, the attributes are different, the values are different. The dictionary objects become that common linking point across the organization. And at the
term level, as we have connections between the terms, those are really logical aliases. This is not down into the data yet. These are the common aliases of what we call things in different places in the organization, so logically they may be the same or may be close. There's a difference between those and the physical aliases, which I'll talk about again in a second here.
But then there's the relationship between the dictionary and the catalog itself. Once people have recognized the attributes that they care most about within the organization, where is the data about that? What pieces of data do we have about that? So it's enabling the drill-down from the attributes to the elements. Again, I've already spoken about the need to have standard attribute names and standard element names. Now, one of the common questions that we get from the business community is: what data do we have about whatever the subject is, about address, or about market, or even something as particular as the last name of a person? What is it called in the different systems? What's the size of the field, and how much room do I need to leave when I report? They want to be able to answer that question of what data, what metadata, do we have about these specific objects.
And then there's a relationship between the items at the catalog level. The standard element name: there's a standard name that we want to use for that specific piece of data moving forward, or it's the common one that we want to at least link the aliases to. That's the standard element name. And then we also recognize that we call these things different things in different systems, so we've got resource-specific element names. Again, it's a common linking point, and these are the things that I would refer to as being the physical aliases: physically, within a database, it's called something different. So if you're running a report against a new resource and you want the data about something specific,
we need to be able to link from the standard element, from the standard attribute, to what these things are called within these different systems. So these linkages, all these lines that connect the different pieces of the framework that's on the right-hand side, are extremely important. Sometimes it's easy to ingest that information in through the spreadsheets, and sometimes it takes a manual effort for you to go in and collect that information and enter it into the repository or into the catalog tool that you're using.
So let's talk about the processes that need to be in place to govern all the different levels of metadata, specifically the three that we're talking about here. The first thing I'm going to talk about is just regular processes versus governed processes. Oftentimes a process itself is a form of governance, but only if it becomes formal and we're associating and engaging a similar role at a similar time. And a process itself is metadata, if you're keeping track of what steps we're following. So we'll talk about regular processes versus governed processes and what it means to govern a process. We'll talk about some processes that are specific to the business glossary, the data dictionary and the data catalog. And then I'll spend a second talking about building your metadata toolkit: what things can you build that are repeatable, so you don't have to reinvent the wheel every time you go down this path and start collecting information to ingest into these tools?
So first we'll talk about processes versus governed processes, and we need to do that for both data processes and metadata processes, because I say all the time that the data won't govern itself and the metadata won't govern itself. We need to have people involved in it, and they need to know what to do in order to capture the information about the data in the organization. There are metadata processes that are focused on that data documentation, to make certain that that
data documentation is collected and that it is shareable across the organization. Then there are the steps: do we want to redefine the steps each time that we're collecting metadata, or do we want to collect in our toolkit a series of repeatable steps? And then there's informal accountability, "oh, these people know that they need to be involved at this point," versus formal accountability. That's probably one of the biggest items when it comes to just normal processes versus governed processes: going from the informal level of accountability to a formal level of accountability. And organizations have process silos versus building consistent processes across the organization. Again, if we have a method for doing something, we want to collect that information and store it somewhere within one of these tools, to make certain people understand there is a process for this that's been validated, that's been vetted, that's been certified, that we've used repeatedly. So rather than just continuing to redefine processes every time they need to be created, there's a consistent process that we can put in our toolkit and call out: hey, we did this for somebody else in the organization, we can do this for you. So having a repository of those consistent processes is extremely important.
And when it comes to a federated model of implementing governance, we want to make certain that we understand that there's a lot of autonomy around the organization for people doing things the way that they've always done them, and not necessarily gravitating towards a standard way of doing something. So oftentimes in a federated approach, we're going to define minimal standards and guidelines and things that we want to share with people across the organization, and leave it up to them as to how they go out and accomplish those things. But at least as long as they follow the minimal guidelines, for example for collecting the metadata that we're going to ingest into the tool or into any of these tools, that might
be the federated approach: you provide them a template and say, put your metadata in this format. We don't care how you get it into that format, but at the end of the day we need it in that format in order for us to be able to ingest that information into the product.
So let's talk about some of the processes associated with the business glossary, and you'll notice that the processes associated with the business glossary are very similar to the processes associated with the data dictionary. Number one, you might have a process to review, validate and certify terms. I spoke a few minutes ago about a client that had built a team of people to go out and build their business terminology. Well, they figured that they were going to do it for customer and provider (it was a health insurance company), but they wanted to be able to repeat this for different areas of the organization and different terminology. So they built processes to review, to validate and to certify the terms. They built processes for people to add new business terms. It wasn't just made available for anybody at any time; they wanted to govern the information that was within their glossary. So they built, defined and followed a process to add new business terms, or to remove a business term, or to change a business term, or to maintain it, to let people know when was the last time this piece of information was reviewed across the organization.
And you'll notice that the dictionary processes are very similar, or should I say the same, as the glossary processes I just talked about, but we're focusing on the elements and the attributes within the organization and not necessarily the terminology. So there should be a process, and I think most of you would agree: if you're going to go change something in a database, it's going to have an impact on reports and on other databases. We need to make certain that we put a repeatable process in place to make
certain that we can maintain these things and that there's a process for requesting change to specific data within a database. So in different organizations, they put processes in place in order to add new pieces of data or to remove pieces of data. If all of a sudden the data is not there and somebody doesn't have access to information they've utilized, we want to know who's going to be impacted by that. So the importance of creating these processes really bubbles to the top. We need to know that we can do these things again and again and again.
But when it comes to the data catalog, that's where we're hearing a lot in the industry about automated capabilities. How can these tools use their connectors and their spiders to go out into these different products and collect the metadata that's important to them? We don't want to have to physically enter that information into the catalog. I don't know of too many companies that will do that. We want to automate that capability of pulling that information into the tool, versus the manual load and the manual update, or something semi-automated. When you're looking at these types of tools, you're going to want to look at the automation. That's an important piece of ingesting information into the tool. This is not information we're housing in spreadsheets; this is information that's inherent within the metamodels that represent the metadata and the data that's collected within these tools. So think about automating, wherever you can, the ingestion of the data catalog information into the tool. You may have a specific process for that, and you might set up change management so that when there are changes to databases and changes to reports, we hear about that and we know to update the metadata accordingly, so that we're not providing the wrong information to people across the organization. And as we build these repeatable processes, they become part of your metadata toolkit. So the
other things that are part of your metadata toolkit include having a repeatable place for people to collect the metadata, for example having a similar spreadsheet that other parts of the organization use to document the data within a specific data resource, and repeatable processes for defining what metadata we're collecting, producing that metadata and using that metadata. Then there's defining accountability. Who is responsible for deciding what metadata we're going to use across the organization? Who's going to be responsible for producing that metadata? Do we build it into their job function? How can we assure that when we add, for example, new data to our data lake, it's not just out there with only a few people knowing about it, especially if we've gone through the process of making certain that that data in that tool is important and valuable? So defining the accountability is part of the toolkit, including who does what and when they do it. That's really the formal process versus the informal process. Then there are repeatable metadata ingestion processes: being able to create one script to grab this information and pull it into the repository.
I mentioned change management and validation, and those are extremely important. Once you've loaded a data dictionary into your repository, into your catalog, it's a snapshot of a point in time, and we want to make certain that we're thinking about how, if that changes, we're going to reflect that change within the tool. So metadata change management is critical. I learned the hard way when I was a repository administrator many, many years ago, when we were loading information into the repository seven days a week, 24 hours a day, but we hadn't built the change management. So let this be kind of a warning to you: you want to make certain that when you put these tools out there and make them available to people, you're putting those change management processes in place before you share this information. So when we talk
about the different processes, again going back to the framework that I shared with you earlier, all of these blocks on the screen within the framework have processes for defining, producing and using this metadata, and we need to document what those processes are and be consistent. That is what is really termed metadata governance: making certain that we have people who are responsible for defining, producing and using the metadata that's going to be utilized by the organization. But I'll take it one step further and say that for every one of these links, the link between the subject area and the term, the link between the term and the attribute, and the links between attributes that are in different parts of the organization, there needs to be a process in place to define what relationships are important, to produce those relationships, and to help people understand how they can use that information across the organization.
So the last subject, and I'm going to go through this kind of quickly, is institutionalizing policy to show that you have quality metadata at all levels. The first thing is recognizing when a policy is necessary. Then I want to share with you two ways that a policy adds value and reasons why you might want to have a policy, and the details of what goes into a policy and how they are all focused on executing authority and formalizing accountability. Those are my two favorite definitions for what governance is and what stewardship is: executing authority is governance, and formalizing accountability is stewardship. And then I'll spend a minute talking about institutionalizing these policies.
So when do we need a policy within our organization? Well, you could tell us when a policy is necessary. Are you a policy-driven organization? Are you policy-centered? That could dictate whether or not you need to have a metadata policy for your organization. The policy is necessary when you need to define the procedures, to define the accountability, and to
demonstrate that your leadership supports metadata. And you'll see on the next slide that I want to focus on that key bullet right there, the demonstration of the leadership. It's one of the reasons why organizations put policy in place, so we can help them understand what is important, because your executives aren't going to sign off on something that's not important. If they're signing off on a metadata quality policy, or even a data governance policy, that represents that they are interested in it and that they feel it's important enough to have policy about. And then a policy is necessary when you want to formalize the execution and enforcement of authority for anything across your organization, and metadata would be one of those things.
So this slide, and I don't think I've ever presented it this way before, says there are really two ways that a policy adds value for your organization. The first one is the one everybody thinks about: articulating the process and the guidelines and the accountability for doing specific things across the organization. A lot of the sample policies that I've seen articulate the process, the guidelines and the accountability. But the most important, perhaps, is the demonstration by your leadership that they are behind this: that there's going to be a policy around data governance or data management, and that you expect it to be signed off at the highest level of the organization. They're not going to sign off on it unless they feel confident that this is important enough for your organization to have policy about. So this includes metadata policy associated with all the different types of metadata that I've talked about today. Articulating the process and the guidelines, yeah, that's important. Demonstrating leadership support, sponsorship and understanding, that's just as important. And the policy is really where the buck stops. When it gets to the executives,
they're the ones that are going to tell you whether or not a policy is going to be necessary, and if they're going to support it, that really demonstrates to the organization that this is an important consideration for our organization.
So what goes into a metadata policy? I wanted to share some of those things. A purpose and a scope. Who's responsible for what. Definitions of the terms that you're using within the policy; it's always good to have a list of what those definitions are. What policies does this policy relate to? What are the procedures, the exceptions, the enforcement? When has this policy been revised? That in itself is a good straw man for what should go into a policy of any kind, and especially when it comes to a metadata policy, we need to make certain that we've defined who does what and when they do it.
I always talk about my definition of data governance as being the execution and enforcement of authority over the management of data, and of stewardship as being the formalization of accountability. This is really the governance of the metadata. You want to execute and enforce authority over the metadata in your organization, because it's not going to create itself. And for the stewardship, people need to have formal accountability in order for this metadata to be collected. So the idea is to stay as non-invasive as you can while you're doing this, but do try to move from whatever informal setup you have, if metadata is an issue in your organization, to a more formal setup. Establish the policy as the norm for your organization. Socialize the concept that metadata needs to be governed. Communicate this across the organization in orientation, onboarding and the ongoing types of media that you use to conduct your communications. Formalize your stewardship for the metadata and for the data, and measure, evaluate and enhance this information over time.
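As a rough illustration of the framework discussed in this session, the three linked levels (glossary term, dictionary attribute, catalog element) and the formal accountability attached to each could be sketched in Python along the following lines. This is only a sketch under stated assumptions, not how any particular glossary or catalog tool models its metadata: the class names, the `steward` field, the system names and the physical column names are all invented for illustration, and "market code" is borrowed from the example used in the session.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Catalog level: one physical data element in one specific resource."""
    resource: str                  # system or database (hypothetical names below)
    physical_name: str             # resource-specific name, i.e. a physical alias
    steward: Optional[str] = None  # person formally accountable, if assigned


@dataclass
class Attribute:
    """Dictionary level: a standard business attribute."""
    standard_name: str
    steward: Optional[str] = None
    elements: list = field(default_factory=list)


@dataclass
class Term:
    """Glossary level: a business term inside a subject area."""
    name: str
    subject_area: str
    steward: Optional[str] = None
    attributes: list = field(default_factory=list)


def drill_down(term):
    """Follow term -> attribute -> element links; return where the data lives."""
    return [
        (attr.standard_name, elem.resource, elem.physical_name)
        for attr in term.attributes
        for elem in attr.elements
    ]


def missing_stewards(term):
    """List every object under a term that has no formally assigned steward."""
    gaps = []
    if term.steward is None:
        gaps.append(term.name)
    for attr in term.attributes:
        if attr.steward is None:
            gaps.append(attr.standard_name)
        for elem in attr.elements:
            if elem.steward is None:
                gaps.append(f"{elem.resource}.{elem.physical_name}")
    return gaps


def build_example():
    """Invented sample data: one term, two attributes, three physical elements."""
    code = Attribute("Market Code", steward="Sales Data Steward", elements=[
        Element("CRM", "MKT_CD", steward="CRM Steward"),
        Element("Warehouse", "MARKET_CODE"),  # no steward assigned yet
    ])
    desc = Attribute("Market Code Description", elements=[
        Element("CRM", "MKT_DESC", steward="CRM Steward"),
    ])
    return Term("Market Code", "Market", steward="Sales Data Steward",
                attributes=[code, desc])


if __name__ == "__main__":
    term = build_example()
    for row in drill_down(term):
        print(row)
    print("Needs a steward:", missing_stewards(term))
```

The drill-down function mirrors the navigation from a term through its attributes to the resource-specific element names, and the steward check is one simple way to see where accountability is still informal. A real tool would add relationships between terms and between attributes (the logical aliases), which this sketch leaves out.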
So in this webinar, and I don't know, it went kind of quickly, at least it felt that way for me, I shared the three levels of metadata and how they differ, the sources of metadata at each of the levels, the importance of the linkage between the levels in this semantic model, the processes that you should be considering in order to govern all levels of the metadata, and then thinking about using a policy, again, to demonstrate and articulate executive support for the need to govern this information, because again, it's not going to govern itself. So with that, I'm going to turn it back to Shannon to see if we had any questions today.
Thank you so much for another fantastic presentation, as always. And just to answer the most commonly asked questions, a reminder: I will send a follow-up email to all registrants for this webinar by end of day Monday with links to the slides and links to the recording, as well as anything else requested. We had a lot of great questions coming in, so to dive in here: how can metadata help in the digital transformation of higher education?
That's kind of specific, the digital transformation of higher education. Well, I know from experience of working with organizations in higher ed that they haven't really traditionally done a great job of collecting the metadata. That doesn't seem to be the business that they're in. But more and more these days, the higher ed organizations are seeing that data is an important asset, and that the information that they have about the data becomes important in order to improve analytics. So I think it becomes inherent in them, when they're recognizing that data is important to higher education just like it's important to any industry, that we need to put formality in place around the metadata associated with higher ed data, or whatever type of data your organization has. So that is the movement of the future in higher ed. As higher ed becomes more and more competitive, what we know
about our students, what we know about our faculty, and where our students are coming from, now we want to be able to do analytics on that specific data. So it just increases the importance of having this type of information for people across the organization.
Have you seen any organizations crowdsource their catalog, similar to the Wikipedia format?
There are some tools that have kind of that focus to them, not specifically like Wikipedia. In Wikipedia you can jump between subjects, you can jump all over the organization, but that takes a lot of work. Those links, again, are not going to connect themselves; the hyperlinks are not going to connect themselves. So I guess the simple answer to that question is that I haven't seen too many organizations provide their catalog in a Wikipedia format, but that's what so many of us are used to now when we're looking up information, even though Wikipedia isn't necessarily the be-all and end-all source for us to go to. If we can strive toward that, if we can customize that for our organization, then I would say that is something that we should be considering, and perhaps there will be a movement of tools in the future to look more like Wikipedia than the kind of standard graphical database or just relational databases that they're presently using. So I haven't seen it, but I wouldn't be surprised if somebody out there is providing that type of a product.
What are the challenges and opportunities of using different levels of metadata, and what are the advantages of using all levels in business organizations?
Well, I think that the business people want to know what data they should be using and what data is available to them. And so the biggest advantage of doing it in the levels is that they're more like bite-sized pieces of creating this for the entire organization. You may start with your business terminology. You may already have data dictionaries for specific applications. So now the
business value that comes from this is that it makes information about the data available to people that are going to utilize it to perform their job function, as I mentioned earlier. So that is the biggest business value: we're taking the data that's always been thought of as being IT's responsibility, and we're putting some responsibility on the business community, to help them understand the data better and better leverage the data. We put a lot of money and investment into building analytical platforms and data lakes and data warehouses, and the real return on these investments comes from people using the data. One of the biggest complaints that I've heard, and maybe all of you have heard, is that people don't know what data is out there. And so if you're looking for business value, it is connecting the business to the data through the information that we have available about the data. The glossary, the dictionary and the catalog are the wave of the future. They've actually been around for a while, but they seem to be some of the most important buzzwords and most important topics being discussed in sessions like this and at Dataversity in general.
And I think we have time to slip in one or two more questions here. Keep the questions coming; I will send them over to Bob after the webinar for him to write up the answers, which will be included in the follow-up email. So Bob, early on you talked about market code. It's written as a business term, and it's written as a business attribute and element. What's the difference?
Well, that's really specific to the organization. So market code was listed as being a business term: what do we mean by the market code? And maybe that code is made up of multiple pieces of information. There's the code itself, there's the description of the code, who entered the code, who updated the code, when was the last day that this was updated, what was the source of the information for the code. So
market code might be more of a general term, and when we get down into the business elements, those are the specific pieces of information that we have about the market code. I'm glad you asked that question, because that is one of the first questions I typically get from sharing that framework: you have them listed as being the same thing, but one can be considered a term. You can have a different name for it or use the same name. But that was a good call-out, and typically there are multiple pieces of information available about any specific term.
Bob, thank you so much for another great presentation, but I'm afraid that is all the time we have for today. Again, lots of great questions we didn't get a chance to get to, so keep them coming. I will get them over to Bob to be included in the follow-up email that will go out to all registrants by end of day Monday with links to the slides and the recording. I also include links to all the matrices and everything that Bob provides, as well as a link to his new online training, available through Dataversity, on catalogs, dictionaries and glossaries. So thanks, everybody, for the great participation as always. Thanks, Bob, and I hope everybody stays safe out there.
Thanks, all. Thanks, everybody. Take care.