Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of Dataversity. We would like to thank you for joining the current installment of the monthly Dataversity webinar series, Real World Data Governance with Bob Seiner. Today, Bob will be discussing Activate Data Governance Using the Data Catalog, sponsored today by data.world. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just note that Zoom chat defaults to sending to just the panelists, but you may absolutely switch that to network with everyone. For questions, we will be collecting them via the Q&A section, or if you'd like to tweet, we encourage you to share your questions via Twitter using hashtag RWDG. To find the chat and the Q&A panels, you may click those icons in the bottom middle of your screen to activate those features. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now, let me turn it over to Stuart for a brief word from our sponsor, data.world. Stuart, hello and welcome.
Of course, I talk on mute right away. Hi there, thanks for having me. All right, let's get into it. I think this is going to be a really interesting hour today. Hopefully I can provide you guys a maybe provocative appetizer to what Bob's going to talk about today as you think about your agile data governance initiatives, or your data governance initiatives. We've had a lot of success with our data catalog platform with more agile data governance, and I'm going to talk a little bit in this first part about how we do that. So let's jump in. Hi, everyone, I'm Stuart Kerver.
I'm the director of sales engineering at data.world. I'm coming to you live from our headquarters in Austin, Texas. We are an enterprise data catalog for the modern data stack, which is one of the taglines we like to use. We are fully cloud-native SaaS, which means we deliver continuously: no release cycles, no versions. It's a very modern platform based on knowledge graph, which many of you data people out there may know. That's the technology underpinning things like Google, Facebook and many other titans of the Internet age. It's a great technology to help enable data discovery, data governance, data analysis and many other things. We work with dozens of global firms, all the way from the monsters down to a number of customers in, for example, state and local governments, so we're willing to work with all types and sizes. When we talk about data governance, our real view is that what it has been is often about risk and avoidance, right? If many of you have been in data governance for a long time, this is a lot of the places that it started: prevention of risk. It often involved top-down policies, multiple layers of councils, structured workflows, many of those parts and pieces. And it can be very hard to get people to adopt data governance because it often stands a bit in their way. They see it as an obstacle. What we feel it should be, what our platform is designed to do, and how we talk about our implementations, is really the rules of cooperation and collaboration. Of course, we need to do it responsibly. We're not crazy. But it's really the process of how we as an organization do data analytics, often together, right? So thinking about how we work together to do that. And when we say we, we usually mean data producers that have the data and data consumers that want that data.
And then how do we capture knowledge in real time? One of the prevailing things about data governance over the last set of years has just been pure metadata, which tells you a lot about how things should be and how things are, but doesn't necessarily tell you what the data work in progress looks like. And that's where agile concepts really come in. So that's just a bit of an opener. So what is agile data governance to us? If we want to get really definitional about it, it's the process of creating and improving data assets iteratively, capturing knowledge as you go, and working together for the benefit of all. There's a bunch of things that we can talk about beyond this, but really it's based on agile principles, open software development principles, for those of you who've been in the tech world and understand those. So there's a lot of interesting use there about how we work together in sprints, to work effectively, not boil the ocean, and not get bogged down in a waterfall process. And so what do we see? We see data governance as wanting to be an accelerator, not a barrier. We almost think the term data governance is getting a bit frayed, right? Thinking about something like data product managers is almost a more modern way to think about that. So on the top, you see what a lot of people we talk to in the market perceive traditional data governance to be: that there are barriers to entry, right? Because that's always been thought of as the responsible way to do things. But the majority of the data at the majority of companies is often data that people should be able to access for their day-to-day work. It just needs to happen responsibly. And so thinking about how do I accelerate?
How do I make it so? Because there are risks to companies beyond simply access to the wrong data. There are risks in being slow and plodding along while your competitors are moving very, very quickly and being data-driven organizations. So when we think about agile data governance, that's about, of course, data discovery and access, that kind of data product marketplace: what data sets are available to me to answer what kinds of questions, and how do I get data to answer different questions? We want cross-functional collaboration, and also reuse, a very undervalued word in the data governance world, in my opinion. Reuse of data products is by far the most governed way. It's the safest way for things to happen, as opposed to constantly reinventing the wheel in slightly erroneous ways over time, right? We have all seen that happen. Reuse can only be done if you can see how the data was produced, so we can understand exactly what data was used and how and why. And that's really a core piece of our platform. And so you can see cleaning, normalizing, transforming, and joining data in novel ways. We have the ability to do data virtualization and data query as well from our cloud-native platform. So lots of interesting things. Again, if you guys want a demo from us at some point, I'm happy to show you in a deeper session. So one of the things I wanted to throw out to you guys to picture is an agile data sprint, right? What is an agile data sprint? What would that look like? And how do we work together as data producers and consumers, and how do governance people really shepherd that process of collaboration rather than stand in the way of it, especially from a perception standpoint? So what you see here is thinking about a sprint as identifying questions. You can even have backlogs of requests and things. So thinking really about business value first, right? Identify the question.
I need to move our business forward by getting information about our customer behavior, but in an anonymized way so that I don't run into any particular privacy concerns. How can I do that? That's a question from consumers that the consumers and the producers can agree to terms on, and then begin to run a sprint to say, how do I produce a data product to answer that question? Those questions can be very specific or much more broad, but we think the more specifically driven to value they are, the better that gets. So what you see here is the ability to let the producers curate those data assets, maybe even virtualize a connection to some data in Snowflake or BigQuery or SQL Server, and be able to say: here's what you need to know about this data set, here are queries that I use on this data set, here are some definitions of terms that you need to be clear on as you use this data, here's what we meant by that particular metric. So you have that specific package of both data and metadata together, and then you enable the consumers to go in and really work with it. They can read it, they can trust it because they know that we've talked about this question, and they can really validate that it works for their purposes. They can do their analysis and they can answer that question, right? And you can close that sprint. And then for the people who are looking, we provide full measure and audit capabilities of every action that was taken, every question. We have full collaboration ability, so you can post questions and suggest changes and do all of those things, and assist people. So really, we just want to change the frame of mind, and many of our customers agree, to: how do we collaborate in an effective and responsible way? So food for thought. These are all things that we are happy to talk about, and there's lots on the data.world website about these concepts as well. So hopefully you guys find that interesting.
So a couple of quick takeaways. Agile data governance is really driven by data democratization. So thinking about how we move forward as a more data-driven organization with more people empowered to use data. We want to really focus hard on business value and not completeness. Not every table has to be known, not every one is valuable, right? But how do you focus on the value that people need, and that they need now or very soon after now? And establish a scalable, iterative process where you can teach other teams how to run these sprints, fill their own backlog with ideas, and parse through this. So really, just some interesting ideas. Something I think is very cool about data.world is some of the ways we approach these problems. So if you want to know more, we have this. I don't usually care too much about things like playbooks and stuff, but this is a really nice piece to take a look at if you are a data governance professional. It is available on our website and is called the Agile Data Governance Playbook. It's got a lot of good detail in there. It's not one of these flimsy one-pagers, and it talks about both practical and conceptual ideas. I think a lot of you might find it cool. It is free to download, it's nicely put together, and it has graphics and stuff. So I highly recommend you guys check that out. Again, our URL is pretty simple. It's data.world, so you guys can find it there. All right. Well, I appreciate your time. Again, my name is Stuart Kerver. I am very easy to reach if you have follow-up questions or want to chat or want to see a demo. My email is Stuart Kerver at data.world, very easy to remember if you can just remember my name. All right. I appreciate everyone giving me the little interlude. I see there's a bunch of things in chat. I don't know if they're for me or not.
But if they are, I will be around during the session and I will try to answer as many of those questions as I can. All right. Thanks for giving me the time, and I'll pass it back to Shannon.
Thank you so much for this great intro, and thanks to data.world for sponsoring today's webinar and helping make these happen. If you have questions for Stuart or data.world, feel free to submit them in the Q&A panel, as Stuart will likewise be joining us for the Q&A portion of the webinar at the end of the session today. So let me introduce our speaker for the series, Bob Seiner. Bob is the president and principal of KIK Consulting and Educational Services and the publisher of the Data Administration Newsletter, Tdan.com. Bob specializes in non-invasive data governance, data stewardship and metadata management solutions. And with that, I will give the floor to Bob to start his presentation. Hello and welcome.
Hi, Shannon. Hi, everybody. It's really great to have you with us today. Thank you, Stuart, for a great presentation. I loved it when you started to talk about a little bit of provocative content for people thinking about things in terms of agile data governance. There are a lot of things that you talked about that I'm going to reiterate in my session today, like when you talk about empowering people. That's really what we're talking about when we're activating data governance, and we're going to talk about doing that through the stewards of the organization. I'll also talk a bit about using a data catalog tool to empower people and to activate people. And that's really what makes the difference between an active and a passive data governance program. Before I get started, I just want to run through a quick list. It may not look like a quick list, but I'll go through it quickly: a list of the things that I'm presently involved in and that you may be interested in that are coming up.
Of course, there's this webinar series. Next month, I will be talking about how data management is data governance, and that's always an interesting topic. I have a special guest: a good friend of mine, Anthony Ogman, will be joining us for that webinar. I talk a lot about noninvasive data governance, so if you're interested in learning more about it, there is a book that I wrote that's been out for several years now, and there's some information about how you can find that. I'll be speaking at a couple of Dataversity events coming up shortly. The first one is now a virtual event, Enterprise Data World. I might have those dates wrong. I think it's actually April 19th and 20th instead of 20th and 21st. You can check the dataversity.net website to learn about that. I'll be at the Data Governance and Information Quality West conference in San Diego in June. There are a couple of online learning plans available through Dataversity, one on noninvasive data governance and one on noninvasive metadata governance, and the most recent one is on business glossaries, data dictionaries and data catalogs. As Shannon mentioned, I am the publisher of the Data Administration Newsletter. I just put out a new issue of the publication yesterday, and if you don't know about it, please go out and take a look at it. There's a lot of great content for people like yourselves who are interested in data and data management. Of course, there's KIK Consulting and Educational Services. That's my consulting and education business. It is the home of noninvasive data governance. And last, something that I've recently added to the list of things that I do: I'm an adjunct faculty member at Carnegie Mellon University here in my hometown of Pittsburgh, in their Heinz College Chief Data Officer program. So I'm very happy to share with you the other things that I'm actively involved in.
The items that I'm going to talk about today: again, Shannon said that the name of the session is Activate Data Governance Using the Data Catalog. So the first thing that I'm going to do is talk about the comparison between active and passive data governance. I'm going to talk about what it means to have an active data governance program in your organization. A lot of organizations are talking about trying to move from these more passive programs. I think what Stuart had said is moving from what data governance has been to what it should be. So they're looking to make their programs more active, to move from a passive to an active program. We'll talk about how a data catalog tool can be used to activate your program and, specifically, to activate the data stewards in your organization. Then we'll talk about the role a catalog plays in data governance. And the last thing I want to talk about is how the metadata that you're going to store in the data catalog is not going to govern itself. It requires that you activate stewards, even metadata stewards, who are defining, producing and using the metadata within the organization. So let's start by defining what I mean by an active data governance program. If you do a search on the Internet, you're going to find different definitions for what an active data governance program is, and I'll share one from somebody else here in a minute. But typically an active data governance program involves governing activities being built into what people do. I've always joked that if data governance is considered a game, you've won the game when it basically just comes naturally to people in the things that they do. So in an active data governance program, people are governing data as part of their daily work efforts. In an active program, people utilize a catalog, or whatever tool you use, to provide documentation about the data.
And that data catalog is there to assist people with their jobs as well. Oftentimes, I refer to data governance as being people governance, because really what we're trying to do by activating our governance program is to get people involved. So it's really not the data per se that is being governed. It's more the people's behavior and how they're going about defining, producing and using data as part of their jobs. In an active program, people aren't assigned to be data stewards. That certainly sounds a lot more invasive than the noninvasive approach that I talk about. Instead, they're recognized as stewards based on what they're already doing with the data, already defining, producing and using data. And we're holding them formally accountable for these actions that they're taking with the data. In fact, I had a webinar recently about how everybody in the organization can be a data steward. Go to Dataversity, look for the recorded webinars, and look for the one on how everybody potentially is a data steward. That is one of the key ways that you're going to turn a data governance program from being passive to being active. So let's define what we mean by being passive. Oftentimes in a passive program, if people are doing anything in terms of data governance, it feels like an add-on to their job. To apply governance activities in a passive program, people have to actually step forward and do something, so immediately it feels as though it's over and above their existing work efforts. Oftentimes people think of the data as what's being governed rather than people's behavior. And as I've often said, the data is going to do what we tell it to do. So if we define it well, we'll be providing a good definition for people to understand the data.
If we're producing it properly, per the definition, and if we're using it properly, that's really the people's behavior associated with the data. In a passive approach, people are identified as stewards, which is maybe less than being assigned. But at least, OK, we'll identify people as being data stewards. It doesn't necessarily mean that they're being held formally accountable for what they do. Oftentimes in a passive program, data stewards are assigned or they're identified: OK, we're going to tag you as a data steward. We're not necessarily going to actively tell you what it is that we're expecting from you, but you're a data steward, so start acting like one. I think we need to investigate that a little bit further. So I wanted to share this with you, and it is fairly representative of what you'll find on the Internet when you're talking about passive versus active data governance. I actually think this is less of the how and more of the what, and I want to talk in the webinar today mostly about the how. So when we talk about passive data governance, or at least in the way that this organization talked about passive data governance, it talks about organizing data in silos and having constant manual actions. It reconciles the data instead of trying to solve the problems or adjust the processes or the rules associated with the data. Again, it's very siloed in its view. This is more of a result of passive data governance rather than active data governance. And so I like the things on the right-hand side of this diagram. It talks about how active governance unifies the standards, continuously builds, updates and monitors, and fixes the processes. These are all activities. These are all action verbs that they're using to describe what's taking place. So I think this diagram shows a lot of the what, but it doesn't necessarily talk to the how. How are we going to activate data governance using the data catalog?
How are we going to activate data governance using the data steward? So let's talk a little bit about some of the advantages and disadvantages of active data governance versus passive data governance. One of the advantages of active data governance is that if we can get data governance to be built into what people do, it's going to become very natural to them. It's not going to feel like anything that we're asking them to do is over and above what they're presently doing. It's going to be a lot less invasive if we apply an active approach to governance. I talk about the stewards being recognized for what they do with the data, and the word recognized has a positive connotation. We're going to apply governance to process rather than redefine all of our processes, call everything a data governance process, or have a single data governance process. We're going to take the processes that we already have, which may need some adjustments, and apply governance to them. We're going to say who gets involved, when, and how they're going to get involved. Accountability in an active data governance program is more formalized, and governance really becomes front of mind. People are thinking in terms of: these are the things that we need to do to improve our data situation within our organization. So what are some of the disadvantages of taking an active data governance approach? Well, the first one is that if you use the term active, there's going to be a period of adjustment. People are going to need to get used to seeing themselves as data stewards and start asking, well, is this something that data governance can help us with? A lot of those things that are necessary to stand up a program require some period of adjustment. And we need to make certain that we're going to describe what we mean by active data governance.
So active needs to be well defined and communicated. I think that's a disadvantage of active data governance, because we're going to need to tell people what we mean by active, and the communications are critical. So there's going to be a lot of communications. You can kind of view that as a disadvantage, but in a passive program, as you'll see here in a minute, communications don't take place as often or as actively within the organization. So what are some of the advantages and disadvantages of passive data governance? Well, passive appears to be less threatening. But in that sense, we also need to define the term passive. What do we mean by it being a passive data governance program? To me, when something is passive, it's not really adding value. It's just kind of there. We're taking a passive approach. The change in behavior becomes a little bit less obvious when we talk about passive governance. But what are some of the disadvantages of passive data governance? Well, one of the disadvantages is that you become complacent: this is the way we always do things, and this is the way we're going to do things moving forward. If we get the people of the organization to start to become active in using the data catalog to help them with their job, and to actively participate as data stewards, it's less complacent. It's more resolute. It's more intentional in the ways that people are governing data. Another disadvantage is that oftentimes in a passive program, people are going to be assigned to be data stewards, and that immediately feels over and above what they're presently doing. Oftentimes, in a passive data governance program, everything is called a data governance process, or there's one process that we try to apply to all the different things that we are trying to achieve with data governance.
So I think it's a disadvantage in a passive governance program if you're going to have what you refer to as your data governance process. I would say that almost any defined process that you have is a form of governance in itself, so we don't need to call them governance processes. In a passive program, accountability is not as formal as it is in an active program. The communication may not be as deep, and people may not recognize themselves as being data stewards, especially if you're only assigning a certain number of people in your organization. And governance is basically an add-on. By describing it as a passive program, it basically feels like it's optional. Again, it goes back to the term complacent data governance. And so if we're putting a data governance program in place that's going to add value, and that value is what it is achieving within the organization, we are going to need to activate the stewards. We are going to need to activate some tool, be it a data catalog, to help people understand the data better and to have that marketplace of data that was just talked about prior to my part of the session here today. So there are certainly advantages and disadvantages. I would say that the disadvantages associated with passive data governance and the advantages associated with active data governance are something that you really might want to be considering within your organization. So how do you determine what the best approach is for your organization? Should we take an active approach? Should we take a passive approach? Since I have the word right in quotes there: there's not one answer. Maybe you're going to have a hybrid active and passive program, though that might be somewhat difficult to achieve. But you want to see: what is the culture like in your organization? What is the senior leadership of your organization like?
And to what level do they support, sponsor and even understand what it is that you're doing? There is a difference between being active in our approach and being more passive in our approach. When you go about selecting whether you want to make your data governance program active or passive, you really have to determine what the appetite for change is. If you're going to make it more active, and you're going to get stewards involved, and they're going to know that they're data stewards and that there's a tool available to them, it's going to require some change within them and within the things that they're doing within the organization. And so you're going to match the program to your organizational culture: the willingness people articulate, whether they're ready to accept new learned behavior, and their motivation to use new tools. You might want to investigate each of these items when you're trying to figure out whether you want to take an active or a more passive approach to data governance. So again, the benefits of an active data governance program, with the people of the organization recognizing themselves as stewards and having a tool available to them to get access to information about the data: the data and the metadata are governed more formally, and because they're governed more formally, they're governed more thoroughly. People recognize themselves as data stewards. Communications are regular and repeated. In fact, I think I mentioned in a recent webinar that I talk about communications, communications, communications, and when you're done communicating, communicate a little bit more, just to make certain that it gets built into what they do. People are active all the time in terms of being stewards and governing the data. Governance becomes part of people's job and they get used to using a catalog tool.
The use of the catalog tool becomes kind of natural to them rather than something that is being imposed upon them as more than what they are presently doing. So if you're going to take an active approach to data governance, here are some of the actions that I suggest you take to implement an active governance program. Provide an actionable tool. We'll talk about a data catalog a little bit more in a minute or so. Recognize and record who your stewards are. It's not just a matter of having a conversation: oh, yeah, I noticed I'm a data steward. Really record that and put that information to use somewhere. Well, where are you going to put that information to use? You're most likely going to record that information within a data catalog tool. You're going to communicate effectively with not only senior leadership and your middle management, but with the operational data stewards of the organization. And you're going to educate the stewards on an ongoing basis on how to leverage the tools that are being made available to them and how to participate in active governance using the tool. So you're going to engage the stewards. I talk about there being really three actions people can take with data, and those are definition, production and usage of data, and everything falls under one of those actions. Well, the same thing holds true with metadata. As I say a lot, and I'll wrap up this session by talking about it, the metadata does not govern itself. You need to have people that are actively defining what metadata is necessary for the organization, putting definition to that metadata, and producing that metadata. Oftentimes the metadata doesn't produce itself.
Certainly we can automate things and pull metadata from our native tools into our data catalog, but production of metadata has to be another way that we're engaging the stewards and then getting them used to using the data catalog or the tool as part of what they do. And so when we talk about the roles of the data stewards in an active program, we're going to get people in the organization to define data in a formal way, produce data in a formal way, and use data in a formal way, all the while holding them accountable for the way that they're defining, producing, and using data. And if you consider anybody in the organization who defines, produces or uses data as part of their job, if they're being held formally accountable for how they're defining, producing, and using data, they're data stewards. Those people in the organization that are expected to protect data that is classified as highly confidential, or anything that's considered to be sensitive data, are stewards, because they are being held formally accountable for how they're using the data. Oftentimes a lot of communication goes into helping them understand how the data they use is classified and how they have active responsibility for making certain that the data is being protected. Another role of the stewards in an active program is to apply governance to process rather than, like I said before, redefining everything as a governance process. So we know that the stewards need to be actively involved in the program in order to really make it an active data governance program. But now let's talk about the role of the data catalog in making a program an active program. Well, we need a place to be able to document the definition of the data. In fact, you may have the same data, called the same thing, defined multiple ways in your organization.
Certainly, you want to let people know that, well, if you're going to ask me the question of what our total revenue is, which definition of total revenue do you want me to use? So we need to make certain that we're documenting the data definition in a formal way. And the place that we're going to do that, and the way we're going to make it available, is by storing this information inside a data catalog. The data catalog helps to make the program active when it is a place where you can document how the data is being produced and how the data should be used. So by documenting these things, we're raising the confidence that people have in the data and we're raising confidence in how they are handling the data. Now we need to be able to document how the data is being produced, and what the most effective way or what the real legal way is to use data. So we're going to need to document that somewhere, and that's the data catalog. We're going to need to make it available to people somehow, and that's going to be through the data catalog as well. And if you have a data catalog, that really becomes the place where you're going to hold data stewards formally accountable because they're going to be reported. They may be involved in specific workflows, but we're going to start to get those stewards to do the things that they're being held formally accountable for, and we're going to help them to understand what they're being held formally accountable for and why, and how it really plays into the things that they're already doing as part of their job. So the role that the data catalog plays in an active program is that it is the resource that improves confidence in the data. It gives the stewards a place to document how the data is being defined, produced and used.
So think about not having a data catalog, and I'll share this with you in a minute: all the different places where there's metadata within the organization, and what issues it might cause if we need to give people access to that metadata in those native tools. A data catalog is a much better solution. It's kind of like building a data warehouse of metadata. It gives people a one-stop shop to get the information they need. The data catalog becomes really active when it becomes a tool of your governed processes. And by a tool of the governed processes, what I mean is people will understand the data and the actions that are coming as input into a specific process. They'll be given a place to be able to work with that data, to define that data, as the throughput of the process. And then it helps with the output from one step to another. So we can use a data catalog to provide those types of artifacts, that type of information that's going to help the process to become better governed. The tool becomes a rallying point for communities of interest. I've seen organizations that build up certain pages within their catalog tool for the customer work group or the product work group or the locations work group, so that they have a place where they can rally and come together with other people who are like-minded. It becomes a place where the value of the data, the discernible value, is being recorded. It is a place where you're going to engage stewards in the different processes. And many of the governance tools and the catalog tools that are out there are very heavy on workflow management, so getting those stewards to do the things that they need to do when they need to do them. So the workflow is important. So I really am talking about not only activating your program using a data catalog, but I talked initially about how to activate a program by using your data stewards. So get them involved in data definition activities.
And that includes the glossary, the dictionary, and business rules. Get them involved in the modeling of the data, from conceptual to logical to physical models. Get them involved in the validation of data. How does the data compare to the standards that we've defined, or to certified data that is leading into your analytical platform or your data warehouse or your data lake, or wherever the data needs to be certified for people to have confidence in it? And get the stewards actively involved in rationalizing the metadata, connecting your business glossary to your data dictionary. A lot of those are activities for stewards. A lot of those activities focus around the definition of the data. And we could say the same thing: we want to activate the stewards in terms of metadata production and data production as well. And what are some of the actions that they can take to become activated? They can get involved in the inventory of data. I can't tell you how many times I've spoken to people in organizations who say, we don't even know what data we have. The data catalog becomes a perfect place for housing that information. We want to get people involved, activating the stewards through the entry of data, the acquisition of data, certainly the quality of data, and the mapping, lineage, and transformation. These are all production activities. And we want to get the stewards actively involved to make certain that we're addressing these things so that we can demonstrate value to the organization, and we can kind of get off our bums and really start to activate the stewards and use a tool like a data catalog to record the outcomes of these types of activities with the stewards. And then when it comes to data usage, we want to activate the stewards by helping them to understand how the data is classified, or helping them to help us to classify the data, and what it means when data is classified a certain way and how that data needs to be handled.
Activate the stewards through data usage: through analytics and visualizations, through report production and the distribution of that information across the organization. Get the stewards active in standardization and in literacy when it comes to how people understand the data. And I love when Stuart talked about how data democratization is putting the data into the hands of the people. That's really what data usage is all about. So we want to activate our stewards to do all of these things. And we can do that through the data catalog, and use that data catalog as the means to get the stewards to become less passive, to get them active in what we are doing as an organization. So think of it in terms of this: without a data catalog, where are we going to store this type of information? So I say here, data definition metadata can be made available in its native source. And I'll share with you some of those native sources in a second, but think about the issues of doing that. Think about maintaining the data definition metadata in spreadsheets and data modeling tools, in all these different places. And let me share with you some examples. Having business glossaries and data dictionaries that are all independent and not connected in any way. In your data modeling tools, in your reporting tools, in unstructured data documentation. And that could be read either as unstructured data documentation, so the documentation about your unstructured data, or as what's typically happening within most organizations right now, which is that we don't have structure around our data documentation. We can give people access to where the metadata or where the data is being produced. And think about, again, the issues that are associated with getting people to learn how to access the metadata in these different spreadsheets and in these different tools.
And what are some of the places where data production metadata is being kept presently? In your ETL tools, in your data movement tools, in the queries and the procedures and the SQL statements that you're writing to move data from one place to another, and in your data quality tools. And again, in the unstructured data documentation, which is not being created so that it can effectively be used by people throughout the organization. So what I'm saying here is just think about it: if you're not going to have a data catalog as part of activating your data governance program, how are the people going to get their hands on all of this information? They're going to need to learn to read SQL to see, oh, well, what steps did we take to move certain data from one place to another? The same thing goes for data usage. We can give people access to data usage tools and to metadata through the data usage tools. But again, think of the burden that's going to be on somebody. It's not going to happen on its own. Think of the burden of keeping that documentation up to date, of maintaining that data usage metadata in its native source, and then think about giving people access to the business rules tool, the data standards tool, the classification, the reporting. And again, it's all unstructured data documentation. It's not being created with the intention of getting this information about the data into the hands of the people that need to be active in our data governance program. So you've really got two alternatives. You can store it in its native places, or you can utilize a tool like a data catalog to create that centralized data warehouse of metadata and make that information available to people. So they've got that one-stop shop, that marketplace, for learning about the data that they have within their organization. So think about activating a data governance program without having a data catalog.
And I love that picture, and I can just hear the Wicked Witch of the West saying, "I'll get you, my pretty." The options are not pretty if you're going to try to activate your program without having a centralized tool. You might want to start that way, but eventually you may get to a point where it's really not a pretty situation and people don't have access to the information they need. Think about the fact that governing data without metadata is virtually impossible. We need to have information about something in order to govern it. Can you govern your finances without information about your finances? Can your organization govern the people of the organization if it doesn't have information about the people? Well, the same thing holds true for data. We can't govern data effectively if we don't have information about that data and we don't have a place where people can go to get that information. And it's not going to be pretty if we require people to access metadata in its native tools. And so the point you may want to raise is, yeah, we could give them access to the metadata in the native tools, if that metadata is even being collected somewhere. Having a data catalog gives you that place to be able to manage the information about your data. It gives you the opportunity to activate a tool, to activate your stewards, and to provide an active data governance program. So think about the issues that are going to be caused if you try to give people access to metadata in these unstructured sources and manage the metadata in these unstructured sources. Typically, the options are not pretty when we try to activate a data governance program without a data catalog. So I often say that the metadata will not define itself. And this includes the definition of what metadata we're going to collect, because there are lots of different types of metadata that we can collect. But we really need to identify the metadata that's going to add the most value to the organization.
Who's going to define the metadata itself? Who's going to define the value of the metadata that people are going to get access to in the catalog? Who's going to define where people need to go to get to the metadata, or how to deliver the metadata with the data? I was just talking to a client earlier today about being able to hover over certain fields and have definitions come up for those fields from their active business glossary and active data dictionary. So steward activation is a must when it comes to defining the metadata of the organization. And when I say that the metadata will not govern itself, it certainly won't produce itself either. I mean, who's going to put in the definitions about the data? You can get the physical structure from the database catalog, but you won't get the business definition. You won't get the valid values or the meaning of the values. So the metadata is not going to produce itself. The production of the metadata collected in the catalog requires activation of stewards: the production of the metadata itself, the production of the value, being able to demonstrate the value from what's being stored in the catalog. All of these things are going to require the activation of the stewards because, again, there's no magic solution to governing your metadata. It's going to require effort. It's going to require that the stewards get actively involved. And the last thing is that the metadata is not going to use itself. So for usage of the metadata, we need to make certain that people know what metadata exists in the catalog, so when they go to the marketplace they know what information about the data is being made available to them. Somebody needs to help people to understand the usage value of the metadata, and the usage of the metadata that you're going to deliver hand-in-hand with the data.
The activation of the data stewards is truly a must when it comes to defining, producing and using metadata and the data itself. So as I said before, there is no magical pixie dust that you can sprinkle over your organization that will cause metadata to appear in the catalog. It's going to take effort. It's going to take resources to do that. Somebody is going to need to define, produce and use that metadata in order to have it add value. So the metadata will not magically appear in the catalog without it being defined, without it being produced, and without it being used. And the truth is that the metadata and the data catalog tool need to be stewarded, and they need to be governed. And not only are we going to activate the data stewards of the organization, we're going to activate the metadata stewards as well. And that's truly the way of using a catalog to help to activate your data governance program. So what did I talk about today? I know I went through a lot of things relatively quickly, but the things that I really wanted to focus on were that comparison of what an active program looks like and what a passive program looks like: how do they compare, and what are the advantages and disadvantages of each? What does it mean to have an active program? How can we use a data catalog tool? The name of the session was Activate Data Governance using the data catalog. I talked about how we can use a data catalog tool to activate the data stewards, to get them readily involved and not be passive around the management of data, and I talked about the role a catalog plays in data governance. And I wanted to end by giving you that point again: if we don't activate people around the data and around the metadata, it's not going to govern itself. It needs to be governed through a resolute effort, and you need to have a place to be able to house that information.
And so with that, I've come to the end of my portion and I wanted to toss it back to Shannon to see if we have any questions today. Bob, thank you so much as always. And again, thanks to you and Stuart both for the great presentations. Just to answer the most commonly asked questions, a reminder: I will send a follow-up email to all registrants by end of day Monday for this webinar with links to the slides and links to the recording, along with anything else. So diving in here, can a data governance program go from passive to active? We're just starting ours up and it feels more passive at this point. Well, I'll take that first, and then maybe Stuart, if you wanted to chime in on that as well. Certainly you can move from a passive program to an active program. I would not suggest trying to go the other direction. Certainly what you can do, if you have a passive program where people have been identified or they've been assigned to be data stewards, is get it to the point where they recognize that, through the actions that they're taking with data, they're being held formally accountable, that they are stewards, and that it's kind of a recognition. It's a positive thing for them to be engaged and to become active in the activities of defining, producing and using data. You can also move from passive to active by implementing a tool and getting people active in the definition, the production and the usage of the metadata. And that's the way we're going to make certain that that metadata, that information, is made available. And certainly that would be another way to activate those passive data stewards. I think that's a great answer, Bob. What I would add is that what we see also is getting the data governance closer to the data access itself. That's a big focus of our platform, not just to shill for that, but what you want is for it not to be a sideshow.
You want to move it as far into the day-to-day activities of the people that care about data as you can. So you don't want to have this big divorce or big gap between your data governance efforts and your data efforts, because then it feels like this extra thing, which can keep people pretty passive. I agree with that answer completely. And what kind of organizational cultural traits have you seen where an active data governance approach works better versus a passive one? Well, I would say one of the traits is that senior leadership within the organization really understands how you're approaching data governance as a discipline. If you can get them to understand that, they're going to be able to support you and sponsor you better. But it's also about getting them to understand that the better approach is to move these people from being passive to being active. So get them to understand that, for the stewards themselves, it's not something that you can opt into or opt out of. You're basically a steward by the relationships that you have with the data. Most of your senior leadership will understand that we need to hold these people formally accountable, especially for how they use the data. But I think one of the biggest traits for moving towards an active program would be to get senior leadership to really understand how the activity is going to add value to the program. Yeah, I think you nailed it. I spend my life talking to people about how they're trying to sell their initiatives and invest money in platforms and tools to enable them. And it's often about this: you've got to align to something that senior leadership cares about. Get to the business initiatives and say we can't accomplish those without data. And we can't do that confidently without having programs in place to support that data. Right. So being heavily aligned there. And then on the ground level, I talk about having carrots for the audience that you want to take up those behaviors.
Whatever that is for your company culture, can you gamify it? Can you have a competition? Can you offer a bonus? What can you do to get people to engage and take that first sip and realize that there are lots of benefits as well as some effort, right? And I think those are critical. I call it carrots and sticks a lot. You need a stick at the top to help you, and you need carrots to get the people going. And I've talked a lot about gamification, about making it interesting to people, even having friendly competition between different departments within an organization. Something to get people engaged, something to get people active. Data governance isn't necessarily the most fascinating. Oh, I like to think it's a great subject. I love to talk about the subject. I love to get engaged with clients and help them to activate their programs. But still, it can be boring to people that are not passionate about the data. So you've got to make it interesting, and gamification is certainly one way to activate people. Yeah, I love it. I always say nobody does data governance for fun. Like maybe a few of us on this call, but that's a small part of the population. I do. I love it. So there's a lot of questions around this too. You know, this question says, I'm struggling with the fact that governing data is pushed back by governing behavior in active data governance. I understand the role of behavior, but governing data is also key in active data governance. Can you comment on that? Well, I've talked a lot in this webinar about activating your program through the stewards. And the stewards are the ones that need to be held formally accountable for what they're doing. Certainly when it comes to usage of the data, and definitely when it comes to definition and production of the data. So I think that, you know, we need to continue to engage people that way. And I kind of lost my train of thought.
So go back to the question again real quickly. Sure. Yeah. And so the comment was, I'm struggling with the fact that governing data is pushed back by governing behavior. Behavior. Well, that is what we're governing. I mean, the data is going to be defined the way that we define it. And if we have a good structure for how we're defining that data, if we have a standard for what a business definition looks like, you know, you're changing people's behavior. Certainly, when it comes to security and the protection of sensitive information and privacy, that's people's behavior. So I don't know if you want to use the word behavior, because it may itself have some negative connotations, but you're really formalizing what people do. There's really no reason to cover up the word behavior unless it's something that's really viewed as being kind of dictatorial or parental, where it feels like your parents are telling you that you better behave. Well, it is the definition, the production and the usage of data. And those are certainly behaviors associated with data. I don't know, Stuart, what do you think? I mean, you're right. I always say, work backwards from the place people touch the data. Too many times people start in the bowels, get lost in the bowels, and never get out to... And so make sure you're always answering that business question. Make sure you're always starting from there and then working your way backwards to govern those questions. I totally agree. Okay, great. All right, I'm going to slip in one more question here, but we will get all the questions out and answered for you in the follow-up emails. So keep them coming. What is your recommendation for how to work with hundreds or thousands of data stewards when you only have a limited number of licenses for your data catalog tool? Oh, I don't want Stuart to answer that question. Don't even tee me up. I was about to start typing.
Well, first I would say get a catalog like data.world. You guys let me infomercial that one way too hard, because we don't do author-based pricing, because that actually makes it really... Like, you're paying for stewards instead of adoption across your enterprise. We just don't believe in it. And it's also hard as heck to audit. So I can't totally defer to you. There are some creative ideas in here. Using files and APIs can sometimes help get around some of those scenarios. But yeah, you really want a catalog for adoption by a broad audience more than you want to just pay for those stewards, because then you get a platform used only by stewards, and that's not the whole goal. So anyway, you teed me up for that one. So I appreciate it. And the fact is that you have thousands of people, and those thousands of people aren't in the same place. At least I hope they're not all in the same place. They can be scattered around the country, scattered around the world. And I just wanted to point out kind of the obvious thing, which is that they need to be educated in the fact that there even is a data catalog tool, and be educated, or at least taught, how to get to the information within the tool. I love the answer that you gave, Stuart. I think what's necessary is to get this into the hands of as many people who can get benefit from it as possible. And it's not just the stewards, not just the people that are defining, producing and using. Well, maybe not just the people that are defining and producing the metadata, but the people that are using the data and therefore have to have information about it through the metadata. So I agree, just get the data and get it. You know, you need to communicate with people that you've got that information stored somewhere and then let them go to town. I mean, this is a resource for the masses.
I mean, if there's something that's going to address data democratization, it's going to be a data catalog tool. I love it. Well, that is all the time we have for today's session. Again, thank you both so much for another great presentation and great information. And thanks to all of our attendees for being so engaged in everything we do. Just love it, love all the chat and love all the communication going on. Just again, a reminder: I will send a follow-up email by end of day Monday with links to the slides and links to the recording for everybody. So I hope y'all have a great day, and thanks to data.world for sponsoring and helping make these webinars happen. Thanks, y'all. Thanks, everybody. Thanks, guys.