Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Officer of DataVersity. We'd like to thank you for joining this DataVersity webinar, "Stop the Madness: A Practical Guide to Making Your Data Catalog Strategy Work," sponsored today by Metric Insights. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. We'll be collecting questions via the Q&A, or if you'd like to tweet, we encourage you to share your questions via Twitter using the hashtag #DataVersity. If you'd like to chat with us or with each other, we certainly encourage you to do so; just note that the Zoom chat defaults to sending only to the panelists, but you may absolutely change it to network with everyone. To find the Q&A or chat panels, click the icons in the bottom middle of your screen. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Now let me introduce our speakers for today, Mike Smithman and Marius Moscovici. Mike is the VP of Sales and Marketing at Metric Insights and has over 15 years of product and marketing experience in the business intelligence industry. He helped bring analytic products to market in senior roles at Seagate Software, AIM Technology, T-Leaf, Xsera, and GoodData. Marius has over 20 years of experience in analytics and data warehousing. He is the CEO of Metric Insights, the leading provider of a BI portal that helps organizations organize their BI environments and ensure users are getting the actionable data they need. And with that, I will give the floor to Marius and Mike to get today's webinar started. Hello and welcome. Thank you so much for having us; we're excited to be here.
We wanted to talk today about governance. The title is about stopping the madness because we see the same kind of error repeating itself over and over. I can't tell you how many customers and partners we've talked to who describe the challenges they're having in deploying governance tools effectively. Either they're just getting started and having a hard time figuring out how to take advantage of the data governance tools they have, or they purchased a tool, have had it in place for years, and it's not getting a lot of traction within the organization, and they're wondering what's going on. So we're going to talk about that today and hopefully give you some very specific, tangible, practical advice on how to make sure you're getting the most out of your data governance tools, so that you're not part of the 80% of data governance initiatives that fail, according to Gartner, because that's not where you want to be. We're going to talk about why people do governance: what the drivers are, what usually initiates these projects, and how people typically try to solve governance problems. Then we'll look at why much of today's approach to data governance doesn't really work, where it fails, and fails repeatedly. And then we'll talk about how you can leverage a BI portal as one of the missing pieces that fits the whole puzzle together and makes it work. So first, how does this all happen? Governance initiatives typically get initiated because people feel pain, and the pain comes from lack of governance, of course.
So what are the kinds of things that drive governance initiatives? First, of course, compliance: I've got to make sure GDPR regulations are adhered to, and I may have other regulatory requirements around knowing where my PII data is and handling all the data in a responsible way. Those are probably the primary drivers for starting governance initiatives, but there's a lot more beyond compliance. People want to enable data discoverability. What use is all this amazing data, all these tables and sources, if a new analyst comes on board and isn't aware that the data sitting in a particular data warehouse can answer their question? And if you can only see the things you've been given access to, and you can't discover and request access to anything else, you're very limited in how you can work. Consistency is very important: you want your data to be clean, with the same definitions applied across the board, so that if you're measuring sales, it's measured the same way irrespective of which set of tables you connect to. Lineage: people want to understand where the data comes from. Is this sales number originally coming from our Salesforce system? Is it coming from the data warehouse? What hops has it gone through to get here? All those things color the way you interact with the information, so as an analyst, and often even as a business user, the lineage of the data you're working with is critical to making sure you use it correctly. And then quality: you want to have a grasp on data quality and be able to know when it's not there.
When there's a problem with the data, you want to be able to identify it, and also to know the areas where there have been known data quality issues that people are resolving and working on. Identifying that and generating transparency around data quality is a key goal for folks who initiate data governance projects. So what happens? Typically somebody looks around and asks, what technologies out there can help me with this suite of problems we just talked about? And the answer comes back: aha, it's the data catalog. There are all these technologies, whether it's Purview or Alation or Collibra or Google's data catalog, and you choose a piece of technology that has features supporting those requirements and say, I'm going to use this technology to solve the problem. And what happens after that? Well, you deploy the catalog in your environment. Next, you bring in all your metadata: you crawl all the different data sources you might have, whether they're out in your cloud, your data warehouse, files, your Hadoop cluster, your BI tools, whatever you have. It goes out, crawls everything, and sucks in all the metadata associated with everything you have. Then you configure features to auto-tag sensitive data, so now you know where the PII is, where you have names and social security numbers and emails in the data, and you have a good inventory of that and control over it. These tools will also generate lineage for you. By crawling your ETL scripts, or integrating with something like Informatica, they may be able to tell you where the data is coming from.
What are the transformation rules being used to populate the target from the source? It might even trace all the way to the BI tool, so you can look at a Power BI dashboard and see that its data is coming from a particular data set sourced from my SQL Server environment, and how it all gets rolled up into the solution I'm looking at. Once you've got all this automatically collected data, there's often an initiative to create a glossary within the system: let's define key terms like sales and customer, all the things I'm interested in measuring, and define those enterprise KPIs in a consistent way. Now I can get buy-in from the business: this is how we're going to define these things, this is how we're going to measure them everywhere we use them. That's often part of the initiative as well. Finally, ownership. You want to be able to say, here's the business user responsible for this glossary term, and here are the technical users responsible for these particular Tableau dashboards or data sets that have been onboarded into the data catalog. They're the ones who build the rules and maintain them, and they're the ones you go to if you want to request access you don't have, or if you have a question about how to use the information. All of that gets loaded, either automatically or manually, into your data catalog. Once you're done, you maybe flag some data quality issues so people are aware of known problems. And then you launch. You announce to the whole organization: here's this cool new data cataloging tool.
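To make the auto-tagging step concrete, here is a minimal sketch of the kind of rule-based classification a catalog might apply to crawled column names. The patterns, column names, and function are illustrative assumptions, not the behavior of any particular product; real catalogs combine rules like these with data sampling and ML classifiers.

```python
import re

# Hypothetical name-based PII rules; real tools also sample the data itself.
PII_PATTERNS = {
    "email": re.compile(r"(^|_)e?mail"),
    "ssn": re.compile(r"(^|_)(ssn|social_security)"),
    "name": re.compile(r"(^|_)(first|last|full)_name"),
}

def auto_tag_columns(columns):
    """Return {column: [tags]} for columns whose names match a PII rule."""
    tags = {}
    for col in columns:
        hits = [tag for tag, pat in PII_PATTERNS.items()
                if pat.search(col.lower())]
        if hits:
            tags[col] = hits
    return tags

crawled = ["customer_id", "email_address", "full_name", "order_total", "ssn"]
print(auto_tag_columns(crawled))
# {'email_address': ['email'], 'full_name': ['name'], 'ssn': ['ssn']}
```

The point is simply that auto-tagging turns a raw crawl into an inventory of sensitive fields you can report on and control.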
You can get all this information about business terms and lineage and data, and if you discover data, you can request access to it. It's all there for you: go at it. What you'll typically find is that initially there's quite a bit of interest. Everyone's curious: what is this thing, what can I find in it? You get a spike in adoption, with lots of folks logging on. But as surely as the sun rises, it sets as well. Unfortunately, very quickly, and we see this over and over again, usage of these technologies drops off precipitously. Three months, six months after the launch, if you look back and ask who's using this, no business users are using it, and out of the entire analyst community, you're lucky if maybe 10% are actually engaged with it. The rest are just not going back; they're not getting the value out. This is where the big disappointment comes, where people scratch their heads and aren't sure what to do next. To address this, the first step is to really understand what's wrong with that approach. Why didn't it work? Why can't we just bring in all this valuable metadata, add these valuable glossary terms, assign ownership, organize everything, launch it, and have it be something people engage with in a sustained way that generates ongoing value? Looking at that, I would maintain there are three primary reasons why this is doomed to failure, why it's madness to assume this kind of approach would work. First, these tools, and it's true of all of them, are complex and fairly difficult to use. They're really powerful technologies.
You can mine all your tables, identify PII data, make associations, bring in information, and link things to BI tools. They're incredibly powerful, but they're also very, very complex. They have to support different kinds of workflows and all kinds of user requirements, and every organization is a bit different. So they're not something you can log into casually, say once a month, without any understanding of what the technology is, and come back to easily. There's a learning curve involved, and that learning curve is a very significant barrier. The second big obstacle is that there's inherently a huge amount of clutter in these environments. Think about it: the data catalog tool's whole purpose is to go out, crawl, and bring in the metadata for everything in your environment. The good news is you have everything in your environment, and the bad news is you have everything in your environment. You've got not only the really high-value fact tables in your data warehouse that have been meticulously created and carefully maintained through data quality scripts and validation, but also the table created three years ago by an analyst who's no longer there, a duplicate of another table, no longer maintained. That gets brought in as well. In organizations of any size, there are so many of those objects that are not high-value assets that the clutter is insurmountable. If you do a search inside your data catalog, you get something useful, but you also get a ton of junk. That's a huge barrier to adoption. And finally, in many ways most importantly, the data catalog tool is really not part of anyone's user flow.
Apart from the governance team or the folks maintaining the technology, it's certainly not part of the business user's flow. What are your business users doing? They're going into a dashboard to find information, or maybe they're looking at spreadsheets on SharePoint; they're going to the places where their information resides. They're not thinking, let me go to the catalog to see what the glossary definition for sales is. That's disconnected from their experience. Your analysts, too, have a flow that's completely divorced from these tools, because they're inside the BI tools. They're in Power BI, in Tableau, in Qlik, or maybe in Excel building out a model; they're building things for consumption by business users. Why would they think to go to a data catalog unless they have a very specific, compelling reason? They just continue doing what they're doing. So it's a natural progression that as you roll these tools out, unless there's a way to reach users with the information, the information will not get to them; they just won't go there. So what do you do about that? Clearly these are fundamental behavioral challenges. You can't make the data catalog tool really simple, because it's not, and nor can you change the workflows in which people work. And all of that information still has to be crawled and presented. So the practical approach we maintain is to take another step: to recognize that implementing the data catalog is a first and necessary step to getting your governance in order, but it's not sufficient to solve the entire problem. To solve the entire problem, you have to take the metadata you've collected in the catalog, not all of it, but the good stuff.
And you want to make sure you deliver it to the point of consumption, the place where analysts and business users interact with your data assets every day. When we talk about metadata here, I should clarify, I'm talking about a more nuanced view of things. Some of the governance-specific metadata is in the data catalog, but not all of it, because there may be BI tools containing metadata that hasn't been integrated into the catalog. Typically a data catalog might work with Power BI or with Tableau but not both, or maybe it works with both but not with Qlik. So there's going to be some metadata present in the BI tools that's not in the catalog, and you need to present that to users. You also have metadata that often lives in Excel. There's a whole class of reporting assets, user-generated reports, that are not created in a BI tool and are probably not in your data catalog at all, but these also have metadata that has to be captured. Often it's metadata the business itself is maintaining: tagging, glossary term definitions, things of that nature. Maybe they're not using the data catalog every day, they don't want to, but it's easy enough for them to maintain it in Excel. So you've got to have a solution that brings that metadata in, knits it together with the data catalog and the BI tools, and delivers it at the point of consumption.
And finally, when you onboard your content, the natural process of saying, I'm going to have a governed space and bring all my different BI assets into this governed environment, that process often presents an opportunity to collect metadata as well: adding a description where one was not clearly defined, making associations to glossary terms, or adding custom information, things that help with data literacy and help round out the metadata that's already there. There has to be a process by which you collect and absorb that into the whole ecosystem you're presenting to users at the point of consumption. And a key principle, an absolutely pivotal principle to keep in mind as you think about surfacing metadata, and this applies to content itself as well, is that there's an iceberg phenomenon with our information assets, whether they're data sets inside a data warehouse or BI dashboards and reports. In the classic iceberg, 90% is below the water and you see the top 10%. You want the same thing here. You want to take the small percentage of content that's really useful and make sure that's what sits above the waterline, the stuff that's discoverable to users: the things that are vetted, that have quality, that aren't duplicative, and so on, whether it's data or BI assets; it applies in both cases. And the rest, the vast mass of stuff that's there but not very useful, keep below the water, out of view, so users can have an appreciation of what's there and really understand how to engage with the information.
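As a concrete illustration of the iceberg principle, here is a small sketch of a curation rule that decides what surfaces above the waterline. The field names and thresholds are made up for illustration; any real portal would use its own certification flags and usage statistics.

```python
# Sketch of an "above the waterline" rule: surface only vetted,
# non-duplicate, actively used content; keep the rest crawled but hidden.
def above_waterline(asset, min_views_90d=25):
    return (asset["certified"]
            and asset["duplicate_of"] is None
            and asset["views_90d"] >= min_views_90d)

catalog = [
    {"name": "sales_fact", "certified": True, "duplicate_of": None, "views_90d": 480},
    {"name": "sales_fact_copy_2019", "certified": False, "duplicate_of": "sales_fact", "views_90d": 2},
    {"name": "churn_scores", "certified": True, "duplicate_of": None, "views_90d": 3},
]

visible = [a["name"] for a in catalog if above_waterline(a)]
print(visible)  # ['sales_fact']
```

Note that the abandoned duplicate and the rarely used table stay in the catalog; they are simply kept below the waterline rather than deleted.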
Otherwise you're overwhelming them with clutter as they try to find that piece of useful information, the data set I'm going to use to build my analysis or the dashboard report I'm going to use to understand what's going on in my business. I want to make sure all the clutter, the stuff that's redundant, the things that are obsolete, is hidden from view. So this is a key principle to keep in mind. Also very important: you want to think about this whole journey, and the way you're going to support people from a governance perspective, in a way that's not a one-size-fits-all approach. The governance and consumption needs of a business user are completely different from those of the content publisher, the person creating Tableau and Power BI dashboards or pushing spreadsheets out for broad use within the organization. And those experiences and needs are very different again from the data governance professional, the person charged with keeping us in compliance and making sure these governance initiatives generate the value we expect. So the solution you come up with needs to be carefully targeted to each persona it will serve. Let's do a deeper dive into what we mean by that, and then we'll give you some actual examples. We don't want this presentation to be just theoretical; we want to show you how this plays out in reality. If you create a governed environment that brings all of your governance metadata to the user at the point of consumption, what do they get from it? From a business user perspective, what do they want, if you think about that persona? Well, as a business user, I want to find everything in a single place.
The definition of good governance is that there is a single pane of glass through which I consume all my content, with a filter applied so that the junk is not there. That's what I care about as a business user. I want to know I can trust whatever I find, that I'm not going to stumble into something where, after I've done my analysis, someone tells me, oh, that's obsolete, there's a completely new dashboard you're supposed to be using. That's not going to happen. I also want to be able to find content from anywhere: from my desktop on the portal, from Slack or Microsoft Teams, from mobile. Having access to the information I consume from any surface is going to be very important to me. And then it's very important to be able to identify, find, and request access to discoverable content. Security is obviously of paramount importance; you're going to have an access model in place that controls who has access to what content. But unless it's an HR report or some very sensitive piece of information whose very existence you want to hide from everyone, more often than not you want people to know that a report exists, with its title, description, and general metadata, even if they don't have access to it. Lots of assets fall into this category, for two reasons. One: as a business user, you want people to say, okay, this exists, let me go request access to it, rather than, I'm going to ask my analyst to create this, and now I have a duplicate piece of content, and somebody's wasted their time rebuilding something that already existed.
And I just didn't know it existed, because I looked for it and didn't find it, because it was hidden from me. So having discoverable content and being able to request access is a key aspect of good governance. Then, from a data literacy perspective, as a business user I need to be able to understand what I'm looking at. If I'm seeing a report that measures sales, how are sales defined here? Where does the number come from? What are the key rules? Is this gross sales or net sales, and so forth? I want to understand the definition, the extended metadata, the lineage, everything that tells me what I'm looking at, helps me gain a deeper understanding of the information, and makes sure I'm using it correctly rather than misusing it because I misinterpreted it. I also want to understand, as a business user, what the usage constraints around the data are. What do I mean by that? Well, if I'm looking at a report that contains restricted data, what does that mean? Am I allowed to share it only with certain parts of my organization, but not everyone else? If it's flagged as confidential, does that mean it can go to anyone in my organization, or only certain groups? What about partners? And which data sets and reports contain public information I can share broadly? So I need to not only see that something is restricted, but understand how it's supposed to be used, in a way that keeps me in adherence with the key governance practices in place in the organization. So with that, let's show you some examples. Yeah, thanks, Marius. I'm going to take over for a minute here and talk about what this might look like in practice. We'll obviously be looking at our platform, but the concepts are general to what Marius has been speaking about.
So from a business user perspective, as Marius said, a lot of business users are not really leveraging the data catalog on a daily basis. For this example we'll be showing Alation, but the same applies whatever your catalog might be. The first step is bubbling up the tip of the iceberg Marius was talking about, from a content perspective, into a centralized space that curates just the right assets and content, from whatever source they might be coming from, into a centralized catalog or portal that business users will come to and engage with daily to look at the content they're interested in. Here's an example, where each of these tiles is content we've curated, either directly or through the data catalog, into the centralized space: content coming, as you can see, from BI tools, Excel files on file stores, documents on SharePoint, and individual metrics and data sets that might be interesting to a particular set of users. So step one is doing that curation of content for the business user and getting it into a space that is more user-friendly, where they can engage with it. During that process, as we said before and as Marius alluded to, through curating and publishing that content we're pulling in metadata from a number of different sources, whether it's the data catalog, as we'll see, the BI tool itself, manual entry, or spreadsheets. Regardless of where that metadata is captured, we want to deliver it alongside the asset, so the user knows exactly the context behind it and the data literacy around it. So in this scenario, clicking on this Tableau dashboard, we get a preview of it coming from the underlying BI tool, and we probably pick up a name and description from the underlying BI tool as well.
But that's augmented in this case with some fields coming from Alation: defining that the data is confidential, that it's been through an approval process, that it's relevant to a particular line of business. We're also linked up to Collibra in our demo environment; you're probably not going to have two data catalogs, one is hard enough. But again, wherever the source is, we pull that in and make sure it's delivered automatically alongside the asset itself. Then, as Marius said, through the publishing process you may add additional information. We may have a glossary that's been defined in the data catalog, or that you're defining as part of the publishing process, where we define key metrics or indicators and how they're calculated. We talked about ownership: who is responsible for these definitions and these assets we're looking at, so there's accountability behind them. As a user, whether I'm looking at a Tableau dashboard or a spreadsheet, which is what we have here, the result is the same: I get the context behind it, I get the metadata behind it. And obviously, if it's something I'm interested in, I can click into that particular asset and interact with it as I usually would, but with the context of all that metadata we spoke about before. For some users, it might make sense to be able to get back to the catalog. Rather than having to search through the catalog to find this asset, which as Marius said sits outside of my workflow, if I do want to go into the catalog and look at it, it should be linked. I should be able to click through, and now we're in Alation, looking at this asset, with potentially other metadata here that we haven't deemed necessary for everyone from a business perspective. I can drill into that in more detail, but it becomes seamless; I can see it as part of my workflow.
So getting everything into this single pane of glass is critical, and linking it back, collecting the necessary metadata automatically, and associating it with the relevant assets is the backbone of getting the catalog information out to users. Then we use that within a search paradigm. As a user who has access to a whole lot of categories of content across different tools and technologies, I can come in and search across that catalog, making use of all the metadata we've ingested to understand what a particular user is looking for, and filter searches by that metadata so we can really narrow in on what a user might be looking for out of the body of content they have access to. So leveraging all the work you've done in the catalog, from a business user perspective, search typically goes through only that curated content, and that's critical. Here you can see I've got a ranked set of results; here's all the metadata we've associated with each, and that's effectively what we're searching through to bring back this set of results. Again, though, some users may want to go beyond the curated content. Maybe we do want to search back into, in our case, the Alation catalog. We can do that seamlessly, as part of my workflow: I don't have to open another application, run the same search, and figure out what was in there versus not. We pass that search seamlessly into your data catalog tool, so you can leverage its search engine, look beyond the curated set, and see, for example, that there are multiple versions of this report out there, and this is the one we've actually curated into the catalog. Maybe I'm looking into that sort of thing because I want to see what other versions are available.
So again, incorporating that into the workflow is critical to driving usage of the underlying catalog tool as well. From a business user perspective, making sure you have this single pane of glass, and making sure users are looking through only the content that's relevant to them, helps drive engagement and gets usage out of the work you're putting in on the back end. The other piece Marius has touched on a few times, particularly from a business user perspective but really for everyone: typically when you're curating something into a centralized catalog, you're going to apply and synchronize the necessary permissions and security around it. When I come in as a user, I'm logged in as Sam here, responsible for sales and marketing, my permissions are going to dictate what content I get access to and what I search through, based on my role in the organization. However, most content, again, as Marius said, unless it's super sensitive and you don't want people to know it exists, you want to make what we would call discoverable. So even though I don't have access to something, if I'm interested in searching for reports around procurement data, I should be able to run those searches, look through the metadata, and have any discoverable content related to procurement highlighted. In this case, I can see this procurement dashboard and the metadata associated with it: what it contains, the classification of the data, who owns it, and the tags and glossary terms associated with it. I can see all that even though I don't have access to it, which is why the image is blurred out and there's a padlock on it. But I should be able to go and request access to that particular asset, rather than reaching out to my BI team and saying, hey, can you create me a procurement report, because I didn't know one existed already.
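The behavior described above, where a result shows its metadata with a padlock rather than disappearing entirely, can be sketched in a few lines. The asset records, field names, and the three-way open/request/hidden split are illustrative assumptions, not the actual access model of any product; a real portal would enforce this in its security layer.

```python
# Made-up asset records: "viewers" holds users with access,
# "discoverable" controls whether non-viewers may see the metadata.
ASSETS = [
    {"title": "Procurement Dashboard", "owner": "Dana",
     "classification": "Confidential", "tags": ["procurement"],
     "viewers": {"pat"}, "discoverable": True},
    {"title": "HR Compensation Report", "owner": "Lee",
     "classification": "Restricted", "tags": ["hr"],
     "viewers": {"lee"}, "discoverable": False},
]

def search(user, term):
    results = []
    for a in ASSETS:
        if term not in a["tags"]:
            continue
        if user in a["viewers"]:
            results.append({**a, "access": "open"})
        elif a["discoverable"]:
            # Metadata only: the user sees it exists and can request access.
            results.append({"title": a["title"], "owner": a["owner"],
                            "classification": a["classification"],
                            "access": "request"})
        # Non-discoverable assets are hidden entirely.
    return results

print(search("sam", "procurement"))
```

Sam's search surfaces the procurement dashboard with `access: "request"`, while the sensitive HR report never appears in his results at all.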
So making content discoverable beyond people's security and permissions, again, is critical to driving increased engagement with everything you have out there that you want people to find. Thank you, Mike. So we've looked a little bit at the first persona, the business user. Shifting gears for a moment, let's think about the content publisher. This is the person responsible for creating the content that the business user is going to be looking at every day. They have needs from a governance perspective that you want to make sure you're addressing, filling in the gaps beyond just "here's the data catalog," right? They want to be able to search for content, and they want to be able to figure out which certified data sets they should use to build a new visualization. This is particularly important for a new analyst, or somebody coming into the team, moving from one group to another into a new domain. They don't already know in detail which tables they should rely on. All too often you need to talk to people directly to get those insights, and if the right person isn't available, you're out of luck, right? So being able to figure out, okay, here are the trusted, certified data sets I can build visualizations from: very important. Understanding lineage matters too. If I'm trying to do an analysis, as a good analyst I'm almost always first checking what already exists out there. So I might see a dashboard and want to know its lineage, the published data sets behind it, and whether it contains information I already need. Oh, it's coming from this particular source that I know is trusted, and therefore I can go ahead and use it as the basis of my analysis.
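Lineage like this (source table feeding a certified data set, feeding workbooks and distributions) is naturally a directed graph, and "what depends on this?" is a graph walk. A small sketch under assumed names; real catalogs ingest these edges automatically rather than hard-coding them:

```python
from collections import deque

# Hypothetical lineage graph: each key lists the assets it feeds,
# e.g. source table -> certified data set -> workbook -> email bursts.
LINEAGE = {
    "warehouse.orders":        ["Certified Sales Dataset"],
    "Certified Sales Dataset": ["Sales Workbook"],
    "Sales Workbook":          ["Weekly Sales Email", "Exec Digest"],
}

def downstream(node):
    """Breadth-first walk of everything derived from `node`, in the
    order a user would see it in a lineage panel."""
    seen, queue = [], deque(LINEAGE.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.append(current)
            queue.extend(LINEAGE.get(current, []))
    return seen
```

Running the walk upward instead (an inverted graph) answers the analyst's question in the text: which trusted source does this dashboard ultimately come from?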
Tracking usage of your content, I think, is very important. If I'm publishing content, it ought not to be a fire-and-forget model, right? All too often analysts will build something, give it to their users, and then move on to the next thing in their queue, and they just don't pay any more attention to the thing they created. That's fine in the moment, but what you really need is to be able to tune what you're going to work on based on how you've done in the past. Are the things you're creating able to sustain engagement with the users? Of the six things you created in the last month, were three of them incredibly successful, continuing to get engagement, while three of them were not? Well, you want to know which ones are which, because you can then identify patterns that help you make sure you're spending your time as an analyst in a way that generates the most value going forward. So getting those usage trends across all the content in your system, whether it's Tableau or Power BI or spreadsheets or any content you publish, and understanding how it's used, is very important from a governance perspective, because then you can change your behavior going forward to maximize value and learn from what's working and what's not. And from a feedback-cycle perspective, it's important to look not only at the numeric counts, not only at how many people have visited a dashboard, but at what people actually think about it. In an ideal world you'd have somebody running surveys all the time and gathering that information, but in reality that's just not going to happen. Business analysts don't have time to continuously survey their user community to find out what people really think of the content that's been created; they're too busy creating new content.
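The "launch spike, then drop-off" pattern discussed later in the demo can be detected mechanically from weekly view counts. A rough sketch; the thresholds (10 views, a 25% retention ratio, a 4-week recent window) are illustrative assumptions, not product defaults:

```python
# Classify an asset's engagement from its weekly view counts, oldest
# first: did it sustain usage, collapse after the launch spike, or
# never get meaningful traffic at all?

def engagement_status(weekly_views, drop_ratio=0.25):
    """Return 'sustained' if recent weeks hold at least drop_ratio of
    launch-week views, 'dropped_off' if usage collapsed after launch,
    'unused' if there were never meaningful views."""
    if not weekly_views or max(weekly_views) < 10:
        return "unused"
    launch = weekly_views[0]
    recent = sum(weekly_views[-4:]) / min(len(weekly_views), 4)
    return "sustained" if recent >= drop_ratio * launch else "dropped_off"
```

This is the numeric half; as the speakers note, the "dropped_off" bucket is exactly where you then go collect qualitative feedback to learn why.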
So you need a governance platform that allows that feedback to be automatically collected and shared with the content publisher, so they can say, oh, I can tell from what people are saying that this report is missing some things, right? Or that people really like how this is done, or that there's some confusing use of colors or confusing design elements, whatever it might be. Whether it's positive or negative, having that feedback captured and circled back to the content publisher, so they can improve that asset and also improve how they build future assets, is vitally important. And then, with all of this together, the idea is that as a content publisher I need a mechanism to promote my content, right? If I've published a dashboard and I look at my usage trends and find that, geez, I was expecting everyone in this 300-user community to look at it and in fact only three are using it, well, maybe some of them just don't know about it, right? So how do I promote that content? How do I make sure there's visibility and awareness of what's out there? Mike will show you some examples of how that works. Yeah, so if we jump back into our catalog here, from a data set perspective we're really talking about the same thing, right? There are a lot of tables in our underlying sources; we want to curate certain tables into our catalog so the analysts can find them, and then we want to tie those to the lineage of the assets we're looking at. So alongside our BI tools here, I've also got certified data sets down the bottom that I can obviously find within the search as well. If I'm responsible for data sets and need to find them, I can look through them.
The metadata is again ingested either from the catalog or from the underlying tool, wherever you're managing that metadata, and it's searchable down to the column level. So as a content publisher I can see what has been certified and what has the necessary quality checks against it, so that if I'm going to build a particular asset, I know this is a set of data I can trust. If I drill into those, I may go through to the data catalog and look at the details, or I may just look at them here within the portal: what columns it contains, how it's tagged, who owns it, the things we spoke about before. But I'm also tying into the lineage, again maybe coming from the underlying catalog or inferred through the tool here, where I can see that this particular data set is being used in a certain workbook, in certain worksheets within that workbook, in Tableau in this case, and, beyond that, in our case, in a number of email distributions that are going out. So I can tie this workbook, the Tableau one we were looking at before, back to the particular table or tables that are sourcing the data set behind it. Lineage is useful to the content creator for sure, but also potentially useful to the user looking at a particular dashboard. Maybe I'm thinking about creating another dashboard around sales; I don't want to recreate things that are already there. Is this one using the data I was planning on using? Well, if it is, then this probably has the info I need. So lineage and metadata around data sets are critical for the content publisher. The second thing Marius spoke about was this idea of tracking usage and collecting feedback, and I think you really need both to understand the true picture of whether a particular asset is working within the organization. So again, through the tools and through the portal, you want to be tracking what content is being used over time, what's sort of reaching its shelf life,
what's going unused, and get a view into the number of users and the number of views a particular report is getting. In this case we're analyzing over the last 60 days: we've got our most popular reports, reports (or data sets) that are increasing in popularity, those that are decreasing, and what's going unused and has maybe become obsolete. Usage gives us one view into the usability and usefulness of certain content, and that should go down to a pretty detailed level. Oftentimes, similar to the charts Marius showed in the slides, when you launch a particular dashboard you'll see a lot of usage if you promote it, but if people don't actually find it useful, that usage drops off over time: they jumped in to look at it, so we saw a lot of views, but then we need to do something about it. Just the fact that this report had 924 views over the last 60 days does not necessarily tell us it's a useful dashboard, because we might see things like this, where the animation is showing us, for the report in the middle, a whole bunch of people who came in when we promoted it and never came in again. They're drifting to the outside of the circle, and there's really only a very small percentage of users coming back in every day, looking at this particular content and finding it useful. So maybe we need to revisit it and understand why those other users aren't utilizing it. Getting a comprehensive view of usage is important, but then you need to understand things like why these people aren't looking at it, and the only way to really do that is to gather feedback from your community. So at the point of consumption, on any particular report or spreadsheet or whatever it might be, enable users to give feedback, to rate it, and to leave comments around the particular asset or data set, so that if I'm the owner of this particular report I can understand
both what the engagement with this asset was, where I can see it dropped off over time, and also what feedback I was getting, where it looks like we might have replaced this report with something else. Maybe it's obsolete, and I would assume we should retire it and take out the clutter we have out there. So usage and feedback are important, and using them to promote content is critical to driving engagement with the set of content you want to see more engagement with, because you know it's useful. Thank you, Mike. So the final persona we want to talk about is data governance: the group of folks responsible for making sure the organization is in compliance with all those governance policies and that you're making the most use of the data. They want to make sure they can certify the reports and data sets that contain trusted data. Oftentimes that might mean more nuance than just a blanket certification; perhaps they designate gold, silver, and bronze certification levels, with different rules associated with each, and designate which assets qualify for which level. Basically, they want to make sure that the rules are both maintained and applied in a consistent way across all the assets that are there, and that it's clearly visible to everyone in the organization when content is certified and when it's not. They also want to make sure sensitive data is classified properly, right? Obviously, as we talked about at the beginning, compliance is one of the key drivers. You want to make sure PII data isn't shared in a way that's inappropriate. You want to make sure any kind of restricted content, if you're a publicly held company, isn't communicated outside of the folks who are privileged and have the ability to consume it, because that could create all kinds of SEC violations and issues that
you don't want to find yourself in. So even just sensitive and confidential information: you want to make sure it doesn't get outside the circle it's supposed to be consumed by. Making sure all that classification takes place, and that there are mechanisms to do it, is very important. Equally important, we need to make sure that the people consuming that content, the business users, understand which dashboards and reports have sensitive data and how they should use that data, right? Because what's the point of doing all this work from a governance perspective, tagging the tables that have PII data, if in fact the people consuming that data through a dashboard have no way of knowing, of being flagged, hey, this has sensitive data, or this has confidential, restricted information? So making sure that flows all the way through to the point of consumption, and that users understand how they should consume that information and what the limitations around that consumption are, is very important. And if you're going to do that, you obviously need to be able to review compliance around it: to what extent do we have users who have gone to sensitive content and acknowledged that they understand how to use it? Do I have a way to audit that, so I can guarantee my practices are working and being communicated out? So let's spend a couple of minutes looking at this last persona, and then, since I see some questions coming in, we'll leave some time at the end for those. Yeah, as Marius said, certification of content is critical, and based on the data catalog you're using, certification may be happening as a workflow within the underlying catalog, if it supports it, potentially for certain content. Obviously, if you're doing that, maybe you have an approval process or workflow, or maybe the tool supports an actual
certification flag. If that happens, we obviously want to curate that automatically into the catalog for users. So in our catalog here, anything that has a stamp of approval on it, at a certain level, is there because it has been certified by a particular user on a particular date. Again, this may be synchronized up from your underlying catalog, or it may be something that happens when the curation process takes place directly within the catalog. Often what we'll see is that we want to curate a set of content that exists in the organization, but we want to push it through a certain workflow process that ensures the stakeholders responsible for a particular asset have done the checks and balances, have maybe added in any additional manual metadata, and have done the data quality checks. We want to push a particular asset that has been deemed worth curating through stages of responsibility before it ultimately gets published and certified, with the final result being that stamp of approval and, again, the accountability of who has done it. So whether it's a data set, a dashboard, or a report, curating the tip of the iceberg and then certifying that content so users know they can trust it is critical, so that when they come in and they're browsing and searching through it, they've got the necessary context around that certification. We also spoke about the terms of service, if you like, around a particular asset: making sure business users understand the sensitivity and the classification of information. In this case, we're simply publishing a spreadsheet to our end users and making it available within the catalog, a cash flow statement in this case. I've got the necessary metadata we spoke about before, but in this case we've also made an announcement around it, where we're informing the user that it
does contain sensitive information, that we've recognized that, and that there are actually usage policies around it, which users are responsible for going in and accepting, confirming that they've read those policies and that they understand them. Delivering those policies alongside the asset, directly to the user, is critical to getting that information out there, but so is tracking it. Understanding who has accepted the policies and who hasn't ensures we can keep track, and we can alert around it to say, okay, out of all the people who have access to that particular asset, who has yet to accept the policies, so that we can reach out to them and make sure they understand them. So again, if you're going to go through the effort of tagging content, flagging sensitive information, and classifying it, then making sure the end users understand how they should be using it, and knowing that they know how they should be using it, is critical. So hopefully that's given you some ideas around how you might get better traction with some of the governance and catalog efforts you're putting in place. With that, Shannon, maybe I'll hand it back to you to start fielding some of these questions that are coming in.
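The acceptance-tracking step just described is, at its core, a set difference between the users entitled to an asset and the users who have clicked "accept." A tiny sketch with made-up user names (the real platform would pull both lists from its permission and audit stores):

```python
# Audit who still needs to accept the usage policy on a sensitive
# asset: everyone with access, minus everyone who has accepted.

def pending_acceptances(entitled_users, accepted_users):
    """Users with access who have not yet accepted the usage policy,
    sorted so the follow-up list is stable."""
    return sorted(set(entitled_users) - set(accepted_users))

# Hypothetical example: four users can open the cash flow statement,
# two have accepted its usage policy.
entitled = ["sam", "ana", "raj", "li"]
accepted = ["ana", "li"]
```

Feeding `pending_acceptances(entitled, accepted)` into an alert or reminder workflow gives exactly the "reach out to the people who haven't accepted yet" loop the speakers describe.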
Thank you both so much, another great presentation with great content. A lot of questions are coming in, so just to answer the most commonly asked one first: a reminder that I will be sending a follow-up email to all registrants by end of day Thursday, with links to the slides, links to the recording, and anything else requested. So, diving in: how do you recommend we get the data owners or stewards to classify all the content if we did the cloud migration before having a data catalog? Ah, I see, so how do you get the owners or stewards to classify content after the fact. Well, it's obviously a challenge to get people to do classification and do the work of capturing the information. The key aspect is this: if you're going to create this kind of governed environment, where you're promoting the things that are going to be searchable into the space, then the classification needs to be part of the act of promoting content to be governed, right? You're saying, okay, you've done all this work to create your content; now, to get the maximum usage out of it, promote it in the environment, and to do that, you need to add the classification information. And the idea is that you want to be really smart and make that process as lightweight as possible. To the extent possible, your data governance tools should auto-classify certain content, and for the things you do want to manually classify, be very mindful about what those are and pick the minimum, highest-value items, like the PII flag and the data classification, the things that are vitally important. Make it easy for them; the kind of workflow Mike showed is a great way to do that, where with a click or two in publishing the asset they can simply do it, right? And then, by virtue of ownership, you can use the feedback loop to notify people: hey, there are all
these assets you have for publishing that have not been acted upon; go in there, and in a few minutes you can apply those classifications. So, should we pick another question? Oh yeah, sorry: can you clarify whether Metric Insights is a standalone data catalog, or whether it works in tandem with a data catalog tool to support the typical end-user workflow? So, we are not a classic standalone data catalog; we are a BI portal solution that acts as a governance platform. That's the way we think of ourselves. Think about the fact that you want to bring all of your assets together in one governed environment, but the classic data catalog functions, like auto-classification of information and discovery of PII data, all those things you're better suited doing in a classic data catalog solution. And we integrate with all of the existing data catalog solutions out there, so that you can leverage that and bring that metadata into Metric Insights together with other metadata that maybe isn't in the catalogs, like things people maintain in spreadsheets, et cetera, and then present it all to the user when they're consuming that dashboard, that report, or that spreadsheet. Perfect. So, how do you change the culture around keeping even metadata hidden, especially from a federal government background?
Yeah, I mean, obviously when you've got regulatory issues in place, or if you're in a federal government situation, you're bound by those; metadata may have to be confidential for some reason, and I know with government work there are oftentimes classifications and strict rules about even knowing assets exist. But even within that, if you were to say, okay, maybe I can't let everyone know, oftentimes there's a larger population within a particular knowledge-work or classification group that could be allowed to discover a set of content that they don't currently know about, because they haven't been given direct access. I think the way to do that is more of a scalpel approach than a hammer. You're not going to change the culture of a government organization, obviously, but you can look and say: of all the people who, within our current governance constraints, are able to access content, what percentage are actually assigned to it? And can we be smart about making the content discoverable to the rest of them by implementing group-level permissions, discoverability at a group level? So, how do you define a governance space, and who's responsible for maintaining it?
So the idea of a governance space is that everything we've shown you here, discoverability, the ability to see metadata, the ability to consume only the good stuff, getting that top 10% of the content rather than the entire iceberg, that's what a governance space is about. It's about separating the wheat from the chaff and making sure you're creating an experience for end users that should be much more like visiting a museum than wandering around an overgrown forest. If you think about a museum, you're walking around and there's a little placard next to every painting that explains what's there, and the paintings are arranged together thematically in a way that's coherent and guides you on your journey. A governance space creates that kind of experience while at the same time ensuring you're in compliance with all of your key regulatory constraints around the classification of data, and so on and so forth. As far as who's responsible for maintaining it, there are different models. In some cases you can go with a centralized model; typically that's more of what happens in smaller organizations, where you might have one central team that manages the entire space. Obviously, for large enterprises that's not possible; there are just too many assets and too many groups. In that situation you have a hub-and-spoke model, where you have a center of excellence that's responsible for establishing the standards, the guardrails if you like, for the governance space. They say: here's the category structure, here's where content is going to live, here are the basic rules, here's what certification means, here are the high-level workflows. Then the responsibility for populating the content goes out to the spokes, the individual business units that are the content creators building that content. That's what we typically see in large-scale enterprises where
you're talking about thousands or tens of thousands of users. I think that maybe ties into the next question, Shannon: what do we mean by a single pane of glass? When we talk about this governance space, this portal: oftentimes business users in an organization have access to reporting in many different tools, everything from a spreadsheet on SharePoint, to a dashboard in a BI tool, to a report in Salesforce, to a data science model sitting out there somewhere. We're just talking about having a single place to go to search for and find that content, not having to go to many different tools to access it and look at it. So, probably mixing metaphors, but a single governance space, a single pane of glass, is the one place to manage everything. No, I mean, I think the key distinction there is that you could put a whole bunch of links together in one place, and it's all one place, right? But if it's governed, it really implies this much more curated experience, right? That's the key. Oh, Marius and Mike, thank you so much. There are so many great questions coming in, but we'll make sure to get those over to you, and thanks to all our attendees for being so engaged in everything we do. That does bring us to the top of the hour. Again, just a reminder, I will send a follow-up email to all registrants by end of day Thursday with links to the slides, the recording, and additional information. Mike and Marius, thank you so much; as always, another great presentation, and thanks to Metric Insights for bringing us today's webinar. Thanks, y'all.