Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of DATAVERSITY. We want to thank you for joining the latest in the monthly webinar series Data Architecture Strategies with Donna Burbank. Today, Donna will be discussing best practices in metadata management, sponsored today by erwin. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A panel, or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DAStrategies. And if you'd like to chat with us or with each other, we certainly encourage you to do so. To open the chat and the Q&A panels, you will find those icons in the bottom middle of your screen. And as always, we will send a follow-up email within two business days containing links to the slides and the recording of the session and any additional information requested throughout the webinar. Now let me turn it over to Monique for a brief word from our sponsor, erwin. Monique, hello and welcome.
Hi Shannon. Thank you, Shannon and DATAVERSITY, for allowing us to be here today, and welcome everyone from Portland, Oregon. I am also personally looking forward to hearing from Donna on this topic, but I'd like to share with you just a few thoughts on metadata management from erwin's perspective before we get started. For those of you that may not know erwin, erwin is a two-time Gartner Magic Quadrant Leader in the metadata management space, and we help hundreds of organizations every day to capture and fully leverage the metadata within their organizations, in order to capitalize on the value of the data that they have across the enterprise as well as protect against its risk.
Donna is going to get into much more detail around metadata management and what it provides, but in a nutshell, according to Gartner, metadata really is any data within your organization that's used to enhance the usability, comprehension, utility or functionality of the data that you have. And metadata management solutions, in essence, provide that data in context for you. So some of the things that metadata management helps you to understand about your data are: the data that you have, where it comes from, how it's used, and how it flows and changes throughout your organization. Considerations that you need to be aware of, such as the accuracy of the data that you have and the sensitivity that you need to protect. Who is accountable for the data that you own. And, when empowering users, what rules and restrictions you should have in place in order to ensure that they appropriately leverage the data. And what data knowledge you can capture within your organization, knowledge that others may have and can share, to help further increase data literacy. Effective metadata management solutions really enable enterprise data visibility, automation, governance and collaboration across your entire organization. And they do this by bringing the technical metadata that's associated with your technical data elements into a central metadata repository. Effective metadata management solutions give you the tools to automatically discover and harvest this information, bring it into the metadata repository, and then curate that information with additional details regarding technical characteristics, business characteristics and data fitness characteristics, to provide a fuller context of the data that you have at hand.
If it's integrated with governance capabilities, a metadata management solution can also help you govern the curated metadata that you have within your organization: helping to manage the data stewardship process, attaching ownership, attaching data classification, and managing the rules and policies that you put in place to ensure that, as business users are using the data, they're using it appropriately and with an understanding of what is available to them. Metadata management solutions these days don't only store metadata within a central repository. They also activate the metadata to get rid of a lot of the previously manual processes and to generate different data discovery artifacts, such as data lineage, impact analysis, or the one that you see on the screen, which make it easy for business users and others throughout the organization to clearly understand the data assets that are available to them and how they relate both to the technical data within the organization and to other relationships that there may be within the business asset storage that you have. Additionally, in best-in-class solutions, metadata within the metadata repository can also be activated to drive other development processes and shorten the delivery time in order to empower business users. And lastly, effective metadata management solutions should also socialize the content within the solution with your business user community, and promote and encourage crowdsourced data knowledge from your community back into the data governance team and into the IT side of the house as well.
So why has metadata management traditionally been difficult? Really, at the core of it, a lot of these processes have been very manual in terms of data discovery and ingestion into a central repository. Additionally, a lot of the artifacts that we discussed, such as data lineage, impact analysis and knowledge graphs, were manual as well, and the effort needed to share that information and keep the metadata current limited the reach and the currency of what you were sharing at large. Most metadata management solutions on the market today have some level of standard data connector offering to help automate the harvesting of metadata into the metadata repository. So most do have some ability to harvest data-at-rest metadata from common industry data sources at this point. They may differ in their approach to how they do that. Some providers may choose to work with external parties that specialize in connectors and outsource the library of connectors that they offer, whereas others may develop their own, using their internal professional services expertise in order to provide tighter integration and maybe easier release compatibility as well. But most do have some level of standard data connector to help you automatically ingest the data-at-rest metadata from the technical data assets across your organization. What is new, I think, or evolving really, is this class of smart data connectors. Those are connectors that leverage a much more advanced framework for automation and allow you to capture not only the more common data-at-rest metadata, but additionally data at rest from sources that may be more complex in nature, as well as data in motion across your organization that is being transformed via ETL or other processes.
And by being able to capture this more complex data, you're able to provide your organization with a much more complete view of your data landscape in its entirety, as well as more easily maintain it and keep it current. Additionally, smart data connectors not only help you get the metadata into the repository, but also activate that metadata to drive development processes: to streamline the development of new data pipelines, or to help you manage cloud migration and get from a current state to a future state much more quickly through code generation and such. This is really what's changed with metadata management. Gartner recently released a research report that says, in essence, that metadata management solutions are rapidly becoming more focused on what you can do actively with metadata, rather than solely on how much you can store within a metadata repository. Examples like those on the screen give you an idea of the types of things that you can leverage metadata for in order to really drive far greater insight and new efficiency across your organization relative to the data that you have. As for what to look for in a best-in-class metadata management solution, we've suggested a few different items here that I'll let you take a look at, but at the highest level, I would advise that you concentrate on three primary areas. One, the depth and breadth of the metadata harvesting automation available to you, from data at rest, from data in motion, and potentially from metadata that you're collecting already within the data models that you are working with. And how that automation is provided, not only to help you ingest the metadata but to actually use it to map your data landscape and deliver more data discovery tools to help you understand that landscape.
Two, what types of drill-down visualizations or discovery aids are you able to provide across your organization, such as data lineage, impact analysis, or knowledge graphs, that help people to really understand and quickly discover what is available, and then understand how to use it or how to protect it as they use it? We are all being asked for increased proof of ROI for the investments that we ask for within a business context. Ensuring that the metadata management solution you're choosing makes it very easy for users, whether they're in IT, in data governance, in the business community at large, or in risk and compliance, to quickly drill down, see and understand what's available to them is going to be very important to help you increase your data literacy quickly, to provide the ROI for different projects that you may be looking at, and to be able to see that impact in decision making. And then lastly, metadata management solutions on the market today come from a variety of perspectives. Some come from a deep IT perspective. Some come from more of a business perspective and then add IT capabilities later on. It's important, as you take a look at the solutions available, that you really evaluate across the board. What is the best fit for both IT and the business community together? Is it speaking to IT needs, data governance needs and business needs in one offering? The last thing that any of us needs is to provide yet another siloed view of our data landscape, or a siloed repository that only one side of the organization is using, especially when we're in the midst of trying to pursue that vision of a single-source enterprise view of the data and the truths that are available to your organization. So I suggest that you consider carefully how it represents and addresses the needs of each of the communities, regardless of who initiates the search to begin with.
So with that, thank you for allowing me to share just a few thoughts from erwin's perspective. If you'd like to learn more about erwin Data Intelligence by Quest, which is our metadata management solution, or to download a copy of the latest Gartner Magic Quadrant for Metadata Management Solutions to review all solutions, please visit us at Erwin.com. And again, I appreciate the opportunity. Thank you, Shannon and Donna, and I look forward to the rest of the presentation.
Monique, thank you so much for this great presentation. If you have questions for Monique about erwin, she will be hanging out and answering questions in the Q&A portion at the end of today's presentation. And thanks to erwin for helping to make these webinars happen. And now let me introduce the speaker of the monthly series, Donna Burbank. Donna is a recognized industry expert in information management with over 20 years of experience helping organizations enrich their communities through data and information. She is currently a Managing Director of Global Data Strategy Limited, where she assists organizations around the globe in driving value from their data. And with that, let me give the floor to Donna to begin her presentation. Hello, and welcome.
Hello Shannon and the DATAVERSITY crew. It's always nice to see some of the regular faces and names on the chat. We have a rich, robust schedule ahead of us, so I want to jump right in. As you know, if you've been on my sessions before, metadata management is near and dear to my heart. I sort of grew up in metadata before it was hip, so I'm glad it's finally coming to the forefront. I think Monique mentioned there's just a lot of interest in metadata now. As I often say, if this is the first webinar in the series you have attended, be it known that there's an entire series. We do this every month, and everything is on demand at DATAVERSITY.
So if any of those in the past were of interest to you, you can catch the replay, not only on the DATAVERSITY site; we also post them on our global data community, and we have a link as well. Next month, just to call it out, is data quality, which is sort of the cousin to metadata, and we'll have our guest Nigel Turner, who is a popular speaker and joins us a couple of times a year, so please join again if that's of interest to you. Right into what we're going to cover today. As Monique mentioned, metadata is hotter than ever. DATAVERSITY has done some surveys, and this is always a popular topic when we have a discussion on it. And I think it's multifaceted, from the business value and the business stakeholders as well as the technical side, and Monique touched on that as well. There are a lot of reasons for that. Some is just that metadata has business value; some are things like industry regulation. I think either of those really needs that better transparency and understanding of information. Even when we get into things like AI and big data and all of that, you still need the metadata behind it. And metadata has been around for a very long time, but there are new ways of looking at it, some new strategies and approaches that we can talk about. And then, how do you knit your metadata strategy into a wider data strategy? And then the ultimate goal is how you provide business value, which is always a theme in these webinars. So we'll jump right into that. Many of you are familiar with our framework at Global Data Strategy, which really provides this kind of menu, I guess, of items you really need to make data sing and be successful in your organization. Everything should align with the business strategy, and the data strategy can both inform and support the business strategy. Metadata, as you can see in the lower right, is key to all of that.
What's sort of fun but also frustrating about this framework is that, no matter what lens you look through, it's sort of the Rubik's cube of data. You can start at metadata and then broaden out, because metadata is part of a data architecture, it supports quality, it supports analytics and BI, and it's a key part of governance, right? Or you can say, OK, we're starting with governance, and metadata is key to that. So they really are all joined together, and each one is a full discipline in and of itself. We try in this webinar series to do a deep dive on each of these as much as possible. Today's topic is metadata, and we'll be diving into that, but also in the context of how it supports these other areas. So, metadata. Metadata, as I like to define it, and I'm glad everyone kind of picked up on that as well, is data in context. One of my many pet peeves in life (I often use this soapbox to rant about data issues) is that metadata is such an intuitive thing, it's such a helpful thing, and then we pick this really nerdy, obfuscating word called metadata. I think that does ourselves a disservice. And I don't have this in the presentation, but I will say it: when people ask what metadata is, we'll say something like "metadata is data about data," as if that doesn't make it even more complex. So feel free to use that if you wish. I try to avoid it. I just feel like we're sort of playing games with people when we use complex terms for really obvious concepts. I just say metadata is data in context, that makes sense, and then I jump right into examples, because generally metadata is fairly intuitive. So, many of you have seen folks kind of leverage this matrix I've put together. It's just so simple, sort of the Zachman framework of metadata, I guess: who, what, where, why, when and how. And these are some examples. Often in metadata we focus sort of on the what.
We'll talk about the business what, which is the business definitions, business context and business rules, but it can also be the technical what: data types, et cetera. Who is super important, and we'll talk about that a little bit as it relates to governance. Who's the data steward, who's the owner, who's regulating it, who's auditing it, who are the business users, who's interested? We'll touch on each one of these. Where, in terms of where the data is stored: here we're looking at lineage, and we can get into compliance. Is it in Europe or the US, and is there a certain "who" that that matters to? Why: gosh, I wish we all started with that one. Why are we storing this data? More and more companies are realizing, as one of my customers said (I'm stealing it because it was brilliant), that some of this data is just toxic waste. Do we need to keep it? Do we need to keep all of our customers' sensitive information, or is that too much of a risk? What's the value compared to the storage risk, not only in terms of storage as bits and bytes and volume, but in terms of risk and auditability? What are the business drivers? Let's focus. One can focus on anything, and Monique touched on that: with these tools (actually, some of those tools have been around for decades now) you can scan the data in. I see that some of these tools are so slick that it's really easy to get carried away and scan in everything, or try to define every single business data element on the planet. Please don't do that. It's probably a whole lot better to focus on those data elements that are offering the most business value, get those right, get those in context, make them easy for people to find, and then move on. And then, related to governance, which we'll touch on a little bit, the when is super important: when was it created, how long should it be stored, when should it be purged, what are the retention rules?
And then the how, I think, is kind of like the what, the core: how is it formatted, how many data stores, et cetera. So hopefully this is a helpful picture of data in context that you can use, and it is multifaceted. Some people, when they're thinking of metadata, think of the business definitions, and they're right. Some people think of it as the technical definitions, and they're right as well. So rather than argue, we're sort of inclusive and embracing. So, what is metadata? Just a couple more examples: data versus metadata. This could be a fun party trick. One person's data is another person's metadata, and again you can go crazy, but just think of a database almost as a spreadsheet. This is totally simplified. Say you had a simple spreadsheet of names and addresses, and your company is a retail store. The fact that Joe Smith bought a computer at Computers R Us in New York and purchased it in 1970, that's the data. The metadata is that Joe represents the first name and Smith is the last name. It seems boring, but take something like year. Is that the year it was purchased? Did they have computers back then? Or is that the year that Joe was born, perhaps? Is that a metadata issue? Is it a data issue? That's the type of stuff that really can make or break a business intelligence campaign. I will use the soapbox to share some of my stories of metadata issues. We could have a whole week-long conference on metadata horror stories right now. I think we all have them, and what's so frustrating about them is that it's such a simple thing: instead of just "year," you put year born, year purchased, year became a loyalty customer. Just write it down. Someone once said that metadata is sort of a gift to yourself, or a gift to somebody else, 10 years later. And that really is important: so easy to do, but so valuable when you do it.
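To make the data-versus-metadata distinction concrete, here is a minimal sketch of that spreadsheet in Python. It is only an illustration: the `ColumnMetadata` class, the `undocumented_columns` helper, and every dictionary value (steward, source system, retention rule) are hypothetical and not from the talk. The fields follow the who, what, where, why, when and how matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Context for one column, organized by who/what/where/why/when/how."""
    what: str       # business definition
    data_type: str  # technical "what"
    who: str        # data steward or owner
    where: str      # system the value comes from
    why: str        # business reason for keeping it
    when: str       # retention rule
    how: str        # format


# The data: one row of the simplified retail spreadsheet.
DATA = [
    {"first_name": "Joe", "last_name": "Smith", "product": "computer",
     "store": "Computers R Us", "city": "New York", "year": 1970},
]

# The metadata: every value below is a hypothetical illustration.
DICTIONARY = {
    "year": ColumnMetadata(
        what="Calendar year in which the purchase was made (not year of birth)",
        data_type="integer",
        who="Retail sales data steward",
        where="Point-of-sale system",
        why="Needed for year-over-year sales reporting",
        when="Retain for 7 years, then purge",
        how="Four-digit year, e.g. 1970",
    ),
    "city": ColumnMetadata(
        what="City where the store is located (not where the customer lives)",
        data_type="string",
        who="Retail sales data steward",
        where="Store master list",
        why="Needed for sales-by-location reporting",
        when="Retain while the store record exists",
        how="City name in English",
    ),
}


def undocumented_columns(rows, dictionary):
    """Return, in alphabetical order, the columns that have no dictionary entry."""
    columns = {name for row in rows for name in row}
    return sorted(columns - set(dictionary))


print(undocumented_columns(DATA, DICTIONARY))
```

Running it lists the columns nobody has written down yet, which is one simple way to decide where to spend definition effort first.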
So there's often even more to this context. Take something as simple as last name. What is that? Is that the surname or family name? In some markets it's actually the other way around, and what we call the last name is what in the West we call the first name, et cetera. Something like city: is that the city where Joe Smith lives, or the city where the store is located? We can go crazy, but just getting that simple context can be so valuable, and that is metadata. I will focus a lot on the business value of metadata and the business metadata. Sometimes that is the biggest value. One story: when I was young, one of my first big metadata implementations was at one of the big banks on Wall Street. I had one of the biggest, slickest tools of the time and I wanted to get all the cool technical metadata in, and the project manager at the time said, "Just build the glossary. Define what a credit default swap is." And seriously, I was really angry. I remember storming out of the building once and stomping my way home, because I wanted to do cool stuff. But in the long run he was right, because we got the buy-in from the business. All the traders went to the glossary, because that's where they needed to go to understand all these different tools they had. Then later we linked it to everything and we had full lineage, and it was actually one of the biggest metadata implementations I've ever done. We got to the tech, but we got to the tech by starting with the business, and that was a good lesson for me to remember. The business, of course, gets it more than IT. I often feel I have to argue with IT to do it, and the business can't understand how you can work without it. This was a survey from DATAVERSITY a while back: we did a survey on metadata, and 80% of the users were from the business, and they really saw that. How can you function without it?
I remember once I was presenting the business case to the executive sponsor of the project, who was from finance. We went through the whole long presentation of why you need the context of data, why you need the lineage, and how this report was calculated. And she looked at us funny and said, "You mean you're not doing that? That's really scary." She said, "We couldn't get away with that in finance. 'I don't know where our money comes from, we just have some in the bank.'" So I think data is becoming more of an asset. It's always been an asset, but people are realizing we have to have more rigor around how we're managing it, which I think is a good thing. So again, take something as simple as, if you think of the example earlier, how the total sales figures were calculated. And that's why sometimes you do sound like a crazy person when you talk about metadata. I went through a little tweet storm a while back about what I did during the day. I spent all day defining what a country was, or what a flavor was, or what a product was. Then you come home at night: "So what did you do?" "I defined what a region is." And people think you're strange. How complicated can that be? Something like total sales seems so simple on the surface, but there's a lot of nuance to it. "Show me customers by region." Now, if you're a good data architect or metadata architect or business analyst, if you're like me and you've been around the block, gosh, some of those words can trigger you right away. What's a customer? What's a region? Does that include lapsed customers? How do we define what's lapsed? If they haven't paid their maintenance for six months, is that lapsed? How do you define a region? Gosh, that can be really political. How do we define a country? Et cetera, et cetera.
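As a small illustration of how much the answer to "show me customers by region" depends on those definitions, here is a hedged Python sketch. The customer records, the region names, the 180-day threshold standing in for "six months," and the function names are all assumptions made up for this example.

```python
from collections import Counter
from datetime import date

# Assumption for illustration: "lapsed" means no maintenance payment
# in roughly six months (180 days). A real definition is a business decision.
LAPSED_AFTER_DAYS = 180

# Hypothetical customer records.
CUSTOMERS = [
    {"name": "A", "region": "EMEA", "last_maintenance_payment": date(2024, 5, 1)},
    {"name": "B", "region": "EMEA", "last_maintenance_payment": date(2023, 1, 15)},
    {"name": "C", "region": "Americas", "last_maintenance_payment": date(2024, 3, 20)},
]


def is_lapsed(customer, as_of):
    """Apply the written-down definition of a lapsed customer."""
    days_since_payment = (as_of - customer["last_maintenance_payment"]).days
    return days_since_payment > LAPSED_AFTER_DAYS


def customers_by_region(customers, as_of, include_lapsed):
    """Count customers per region, with the lapsed rule made explicit."""
    counts = Counter()
    for customer in customers:
        if include_lapsed or not is_lapsed(customer, as_of):
            counts[customer["region"]] += 1
    return dict(counts)


report_date = date(2024, 6, 30)
print(customers_by_region(CUSTOMERS, report_date, include_lapsed=True))
print(customers_by_region(CUSTOMERS, report_date, include_lapsed=False))
```

The same three records give two different EMEA counts depending on one definition, which is why the definition belongs in the glossary rather than in someone's head.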
So that's all the context around such a simple report, and the business person can get frustrated ("gosh, how hard can that be?") until you start explaining some of those nuances, and then generally people jump right in and understand. My request to everyone on this call, or anyone on the planet: please, please avoid what I call the "I just know." Again, to some of us it seems so obvious. Think back to that spreadsheet where it said name, date and location. "Gosh, I just know. How hard is that? It's the location of the store." Well, when you retire 20 years later, does someone know that that was the location of the store? Or is that the location of the customer? That's a very big difference. And gosh, we stay in business at our company partly for that reason. Those were easy examples, but how many times have you seen this: you do data profiling and there's a funny field in there, and they say, "Oh, that XYZ, that means that's a prime customer, and we just knew that because that was the easiest way to show it." Am I supposed to know that two years from now? Or this gentleman: "Part number is what used to be called component number before the acquisition. Well, I just know that. Doesn't everyone just know that?" Just write it down. Put it in a business glossary, metadata repository, metadata catalog, data model. There are a lot of ways to store metadata, but just document it, because that's the type of thing that causes multiple embarrassing issues in organizations. Here is a little cartoon from my book that I like to show. It isn't really funny at all, and it's particularly not funny when this has happened to you. "OK, we're almost done with the acceptance testing, we're rolling out the application. Just one question: what's a customer?" And again, someone can say, "Well, how hard is that?" But as anyone who's been on a project knows, customer is one of the hardest things to define.
And there are so many flavors of that. I worked for a major corporation that actually had one of these embarrassing incidents around the definition of customer. For someone from marketing, when you say "go talk to a customer," do you really mean a prospect, or someone who already owns the product? They sent out services messages to people who weren't customers, and sent out marketing campaigns to people who were, and got those two mixed up, and it was very embarrassing, just because of "what is a customer," right? There was a database called the customer database, and it was really the marketing database. Big difference, right? Here's a little story I like to tell that puts it in context: if you can't get these definitions down, you're going to have a problem. Just imagine you're all going to go on a family vacation. We all have a definition of what a vacation is. For a while, when one used to go to an airport, there was an ad campaign from one of the big banks, and the only text in the ad was "heaven or hell." It had all these different things, like a cruise ship: one side was heaven, one side was hell. Or camping: one side heaven, one side hell. Depending on who you are, you could think either one of those is the most horrible thing. It's similar in a family, right? The father wants to take time and see every state park and learn everything across the country. Mom just wants to read a book because she's been busy. Jane wants to go outside to all the state parks and exercise because she's been studying. Bobby doesn't want to be there at all because he wanted to go out and party with his friends. And the one who's in from the UK is like, what's this vacation talk, we call it holiday.
I just want to go to the pub, why am I even here? And Donna, she doesn't care as long as she has her laptop and can talk metadata, right? But just think of the conflict in that car on the right. We couldn't even agree on what we think a vacation is, in all that context. Especially for some of the Americans on the call, a common thing to do is to decide to drive across the country and see the country, and you've probably had some of those arguments in the back seat or front seat of the car. But just think of that on a project. We're doing customer relationship management. Well, what's a customer? What is a relationship? So really getting to these core concepts early can save a lot of headache down the road, not only in life, but in projects: metadata as relationship management. It's not a counseling side business, right, but communication is always helpful. I'll go through a few examples quickly. Again, we could have a whole week-long seminar. This one was sort of high-profile, and you've all heard about it, perhaps. I won't say popular; it was a well-known one. NASA: we've all heard of NASA. In 1999, they actually lost a $125 million Mars orbiter because of a metadata issue. When they sent the figures, the data was in English pound-seconds instead of metric newton-seconds. Again, huge difference. Whether we're talking five of something or a million of something, if you don't know what the unit of that something is, you can get very much off course. So we know the cost of that one at a minimum was $125 million, right? That's a big hole to start with. But just think of the brand and reputation damage right there. I guess I've been at probably six or seven conferences where I've heard this example. That's not a great way to have your brand name, NASA, shown across the globe.
But more importantly, think of all the lost opportunities you could have had for research from that Mars Climate Orbiter, right? So just think of the things that could happen in your organization. To give NASA some credit (I don't like to drag names through the mud), they've actually gotten a lot better. In this new world there's this idea of open data. There are so many open data sets out there where you can get great information, and theirs generally all come with really good metadata. So to give NASA credit, they've learned their lesson. If you look through, there are dozens and dozens of data sets out there, or more, and you'll see the units are documented. They've learned their lesson and it's actually fairly good. So kudos to NASA for having great metadata. So, one more story; you'll probably be bored with my stories. But this is a real-world example. I was doing this for a webinar on self-service data analytics, and the point of the webinar was supposed to be that there are so many great open data sets, you can use these new slick tools and really be a citizen data scientist and be a hero really quickly. So I picked a data set from a UK open data site. I thought it would be fun: it was vehicles by make and model and accidents, road safety, and, you know, whatever the car is, is it those Porsche drivers that get in a lot of accidents? But this is the data I got back. What? I don't know. "50 F13" is big, and "2015 BS" something was big. It was almost comical. There's no way you can make sense of this, because there's no metadata. Not only is there no business metadata about what F13 is, but look at even the technical metadata. I can guess at the numbers: 2015, there are a lot of 2015s, so that's probably a date; here it looks sort of like an amount, right? So simple things made this data set completely unusable. Somebody spent a lot of time doing this.
And now it's completely unusable. Metadata is almost the marketing for your data, right? If you don't have metadata, no one knows what it is. Long story short, this has come full circle: this was about five years ago, and we're actually working with this organization now to fix this very problem. I find that really fun. They're sort of one of our clients now, so they stay unnamed until they fix it, and then we'll show them as a success story. Anyway, I could go on and on. This one is one of those things that make you sound crazy, like: what is a year? There was an international retail chain comparing fourth-quarter sales. Generally in the fourth quarter you see a spike, since in many areas that's the holiday season, but Latin America didn't see that. So: what can we do, can we increase marketing, what do we do? After all of these discussions, when it came down to it, they were talking fiscal year instead of calendar year. Again, such a simple thing: just define what a year is. And you can hear it: doesn't everybody know that we go by fiscal year and not calendar year? Just write it down. Even if everybody knows, write it down anyway. That's the beauty of metadata. To finish this story with a little success story: this fictitious company actually had great metadata, and Monique talked a lot about lineage, which can really help with this. We can look through and see how sales amount was calculated, which database it was calculated in, and what business rules were used, and really get to that answer fairly quickly. That's the beauty of some of these tools: getting that context. So you may say, oh, that's great, Donna, but that's data warehousing, that's so old school, no one needs annual reports anymore, we all use big data analytics. Well, even with big data you still need good metadata, probably even more so. I would just say wherever you need good data, you need good metadata, and I will argue that.
I don't think anyone would disagree on that one. It just makes so much sense you almost don't need to explain it. What becomes complicated is more that your technical environments become complicated. So, yet another story. All of these are sort of semi-fictional; they come from clients that I just don't want to name. This one is big data analytics: a company that uses smart meters in homes did some really nice analysis to see how using a smart meter can decrease energy use over time. They saw that for each increase in temperature, customers with smart meters had a 5% increase in usage, compared with 20% for people with old-fashioned thermostats, like the one on the bottom left. So how did they reach that? How do you know what that even means until you get more metadata? If you're a data person, you probably have a lot of questions. How did you get that weather data? Was that a monthly reading or a daily reading? Was that in Celsius or Fahrenheit, those percentage point increases? How did you get those meter readings: were they the billing meter readings or the actual meter readings? Data people should have a lot of questions, right? And was usage by household or by individual? How did you define households: by the people living there, or by the address? There are lots of different contexts around even those numbers. So maybe those numbers are right, maybe they're wrong; in your viewpoint, right or wrong depends on this context. That's metadata: data in context. Even a simple thing, back to the NASA example: is that Celsius or Fahrenheit? Those are very different scales. So what do I mean by a percentage point of temperature, right? So even if we're doing big data analytics, and this was classic IoT data analytics, you still need metadata, or even more so.
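To make the unit question concrete, here is a minimal Python sketch; it is an illustration added for this transcript, not something shown in the webinar. It assumes a hypothetical `Measurement` record that carries its unit as metadata, converts impulse values between pound-force seconds and newton seconds, and rescales a "change per degree" figure between Fahrenheit and Celsius. The unit labels (`"lbf*s"`, `"N*s"`) and function names are made up for the example, and the rate conversion assumes the relationship is linear.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton seconds (1 lbf = 4.4482216152605 N).
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Measurement:
    """A number plus the metadata needed to interpret it."""
    value: float
    unit: str  # hypothetical unit labels: "N*s" or "lbf*s"


def to_newton_seconds(m: Measurement) -> float:
    """Convert an impulse measurement to newton seconds, refusing to guess."""
    if m.unit == "N*s":
        return m.value
    if m.unit == "lbf*s":
        return m.value * LBF_S_TO_N_S
    # Without usable unit metadata the value cannot be interpreted safely.
    raise ValueError(f"unknown or missing unit: {m.unit!r}")


def per_celsius_from_per_fahrenheit(rate_per_f: float) -> float:
    """Rescale a linear 'change per degree F' into 'change per degree C'.

    One Celsius degree spans 9/5 Fahrenheit degrees, so the same
    relationship looks 1.8 times steeper when stated per degree Celsius.
    """
    return rate_per_f * 9 / 5


if __name__ == "__main__":
    print(to_newton_seconds(Measurement(100.0, "lbf*s")))
    print(per_celsius_from_per_fahrenheit(5.0))
```

The design choice is simply that the conversion function raises an error when the unit is missing or unrecognized, rather than assuming a default; that is the code-level equivalent of "write it down anyway".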
We could go on with all of those stories, but who uses metadata? We talked a bit about business users using metadata, but technical people need it as well, and auditors need it. Almost anybody, I would say. This is a very democratic thing: once it's there, it will be used. It could be a developer: if I change this field, what else might be affected down the road? I call that impact analysis. Again, you could ask whether that's really needed, and I'm amazed that even to this day, major companies I work with still make these very basic mistakes. Two or three years ago, at a major retail chain in the US, some well-meaning but not-thinking database administrator decided it would be a good idea to shorten the length of the product number to save some space. You can just imagine: that broke everything downstream. The system was down for a full day as they fixed it, and they could calculate how much revenue was lost, because people couldn't order product, all because of a data type length. I told you I wouldn't bore you too much with all these stories, but each one of these roles has a real-world horror story around it, or a success story from people who do it well. We talked about the business person: how do we define regional sales? It could be a data architect or a citizen data scientist: there's a lot of data out there, so what's the approved data set, how should I store it, how do I map that across? Or I'll just jump to the one on the right, which we often forget: you're just a regular person who joined a company. Who doesn't get confused by the acronyms or the company terminology? Back to that Wall Street example I had, just having that kind of common lingua franca of a company can be part of your metadata dictionary. So, lots of good usages. Governance is a key one: metadata supports data governance.
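The impact analysis idea, asking what else is affected before changing a field, can be sketched as a walk over lineage edges. The following Python sketch is an illustration under stated assumptions: the `DOWNSTREAM` dictionary and every object name in it are hypothetical, standing in for whatever a metadata repository would have harvested, and a real tool would do considerably more.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the objects listed as its value.
DOWNSTREAM = {
    "products.product_number": ["staging.order_lines", "warehouse.dim_product"],
    "staging.order_lines": ["warehouse.fact_sales"],
    "warehouse.dim_product": ["report.regional_sales"],
    "warehouse.fact_sales": ["report.regional_sales", "app.online_ordering"],
}


def impacted(start, edges):
    """Return everything downstream of `start`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:  # visit each object once, even if reached twice
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    for name in impacted("products.product_number", DOWNSTREAM):
        print(name)
```

Running the same traversal over reversed edges gives the upstream view, which is the lineage question of where a number on a report came from.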
Again, often a data governance task is defining key KPIs and measures, defining core terms, defining lineage and business rules. So many of your typical users or stakeholders in governance use metadata and generate metadata, and it certainly takes governance to manage your metadata well. So it's again one of these nice virtuous circles: it's used by everybody, and you really need all of these people to make it successful, because you need to manage metadata just like you manage data. Not to do one of those 'metadata is data about data' circles, but metadata really is data, and you need to manage it in a similar way. So how do you manage that? A lot of things have evolved. Monique had some helpful ways to think about it: when I'm looking at a tool, what are the types of features that I need? Because there are a lot of tools, and even within tools there are different features that you may turn on and off. One thing I like to think about, and I'm actually facing this at a client now where they're having some conflict over it, is the type of metadata management you need. Is it more of the encyclopedia approach, where I have a definition of total sales, that's what's going in the annual report, it is vetted, we got it through governance, and it shall be published? It's not that nobody can give feedback on it; we want to hear if there's a difference. But it has to go through a fairly governed process, because when you change the definition of total sales, there's a lot of impact on the organization, your annual reports, bonuses, et cetera. Versus something like: I am doing some data exploration, we're doing some citizen data analytics, we want to be rapid, and we just want to share, hey, I did a query that's really cool.
What was your definition of this, and what data sets did you use? I consider that more of the Wikipedia approach. They may all be right by the end; I kind of call that eventual consistency, right, where over time we get a lot of voices. Both are really valuable, and you may need both use cases within your organization. Encyclopedia might be for your core data, your corporate reports, or your master data. Not everybody gets a voice in the corporate reports or in what your master data is, and if you do have a voice, it has to be vetted. Wikipedia might be, again, for doing some self-service analytics, or for a certain department. You don't want to lock that down so much that someone has to log an issue request just to share a query with someone else they're working with. So you don't want to overdo it or underdo it. In organizations, as I said, there's a place for both; you just want to use each in the right spot. Don't over-govern stuff that's supposed to be more agile, and don't under-govern stuff that really should be very closely standardized and governed. So, getting a little more into the technical and the how: what makes metadata challenging is that there are so many data sources. Monique mentioned that. Some companies stay in business just by writing scanners or interfaces or collectors, or whatever they're called, for these systems. We are still getting metadata from things like COBOL on the mainframe, but then there's also IoT streaming, or now media files and social media; you can get Twitter metadata. And all of these may have customer information. So there's a business aspect, but there are some genuine challenges in just getting that technical metadata out. This is from a DATAVERSITY metadata survey that we do each year, and you can see what's currently used in organizations and what's planned down the road.
And yes, there's a good portion of relational databases, which are fairly simple at this point to scan in; you can just get an ODBC-type scanner. But there are a lot of other types of files now, media files and different packaged applications, that may make their metadata difficult to get. So when you're doing your metadata strategy, you really need to think about the scope and breadth of the types of sources you have, and do some vetting of the metadata tool you're choosing. It may be that you really do need a broad approach, maybe because you're trying to do an API analysis or really doing that broad inventory of everything in the organization. Then you need a tool that can cover that breadth, so vet that tool when you're doing a POC. Or it might be that that's overkill: you really just have relational databases, and you might even use your data modeling tool for your metadata, or SharePoint. Really, just don't overdo it, but again don't underdo it either, and make that decision based on the data sources you have. On that note, there's no one size fits all with anything in life, but particularly with metadata management or the tools, so these are just some things to think about. There is a use case for the one on the left, which is sort of, whatever car you like, your Mercedes or Tesla, your full-service vehicle that really does everything. That's generally some sort of data catalog or metadata repository, whatever word is used these days, and it has some sort of what I call a metamodel. It says, for all of these systems, how do I store metadata for relational databases versus graph databases versus XML and JSON? There are different storage mechanisms for each, and these tools do that hard work for you with their scanners. I don't belittle that; it's a big thing. So they can scan that in, and they have generally given some thought to how they do that matching and reuse logic.
I know back in the day that was really something you had to customize and give a lot of thought to. Like anything in life, things have gotten simpler, but in some ways that's a bad thing. I sometimes ask vendors: I've scanned the database in, then I changed the table, and I want to scan it back in. Do you match on the name of that column? What if I've changed that name? Or do you have an ID? And they say, oh, don't worry about that, we've got it handled. Well, you should be worrying about that, because that's going to happen, right? So kick the tires a little bit. In some cases you don't want to have to do all of these things manually, but in some cases you absolutely do, so think about that. Most of these tools can publish out to some sort of web-based portal or report and share that, so to me that's almost the one-stop shop for everything. Then give some thought to that piece in the middle. Sometimes the tool you already have has some metadata. I remember, again back in the day, when I worked for a vendor and was sort of the metadata repository person who went around the globe doing metadata repositories, I worked with companies that literally spent millions on a repository when all they were doing was scanning in some relational database columns and putting a business definition on them. They could have done that with a data modeling tool. In fact, they were scanning in the data modeling tool that had all that information and publishing it out. Or could your ETL tool have just enough data lineage in it? Or if you really just want to start with a business glossary, maybe you start with SharePoint for now, just to get the buy-in, and then look at a tool that's more expensive. I love to hate spreadsheets, but even a spreadsheet is better than nothing. Maybe the data dictionary in your database is good enough for now. You don't have to buy the big fancy thing, and you can often scale up.
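The re-scan question raised above, whether a scanner matches columns by name or by a stable ID, can be illustrated with a small Python sketch. Everything here is hypothetical: the `Column` record, its `object_id` field, and both diff functions are made up for illustration and do not describe how any particular vendor's scanner works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    object_id: int  # hypothetical stable identifier taken from the source catalog
    name: str


def diff_by_name(old, new):
    """Name-based matching: a rename looks like one drop plus one add."""
    old_names = {c.name for c in old}
    new_names = {c.name for c in new}
    return {
        "dropped": sorted(old_names - new_names),
        "added": sorted(new_names - old_names),
    }


def diff_by_id(old, new):
    """ID-based matching: a rename is recognized as the same column."""
    old_by_id = {c.object_id: c for c in old}
    new_by_id = {c.object_id: c for c in new}
    shared = old_by_id.keys() & new_by_id.keys()
    return {
        "renamed": sorted(
            (old_by_id[i].name, new_by_id[i].name)
            for i in shared
            if old_by_id[i].name != new_by_id[i].name
        ),
        "dropped": sorted(old_by_id[i].name for i in old_by_id.keys() - new_by_id.keys()),
        "added": sorted(new_by_id[i].name for i in new_by_id.keys() - old_by_id.keys()),
    }


if __name__ == "__main__":
    first_scan = [Column(1, "cust_nm"), Column(2, "region")]
    second_scan = [Column(1, "customer_name"), Column(2, "region")]
    print(diff_by_name(first_scan, second_scan))
    print(diff_by_id(first_scan, second_scan))
```

With name-based matching, a business definition attached to the old column would be orphaned by the rename; with ID-based matching it can follow the column. That difference is the kind of thing worth asking a vendor to demonstrate.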
And a lot of these vendors do realize that metadata is important and have some pretty good interfaces now. I also want to talk about the one on the right, which I think often gets forgotten: there are metadata exchanges, registries, or standards across industries. Hey, we're all doing medical data, can we have a common metadata exchange? Or metadata registries, where you can share not only within organizations but across organizations, which I think is a really great thing. So think of that as you plan. Now, some of the benefits of doing all that work. Again, you may not need all of these, so think about that when you're looking at a tool. The almost classic one, which I called out in the beginning, is data lineage, which often, not always, comes from a reporting scenario. For my sales report on the right, I want to know how sales was calculated. On the top, and it's a busy slide, I have all of the different things that would need to be scanned in or understood if you wanted that true lineage. What is the measure in the BI tool? What's the business definition of sales in the glossary? What kind of databases or cubes were in the warehouse or semantic layer? What was the staging area, and what do those tables look like? What were the source systems? Do they have a physical model, and what were the naming standards on that? You could go nuts, and a lot of companies do. It's super valuable, because when a business user or a technical user says that number is wrong on that report, you click the button and these tools can show why it was, quote, wrong. In some cases it is wrong: there was an error in the ETL. In some cases it's not wrong: people had a different definition in their head of what sales or total sales means, or what a region is, back to that beginning topic.
There's a lot of what I kind of call relational metadata, the relations between things. You can have data about a table, you can have data about a glossary, and then how do you fit these things together? One way is through data lineage; one is impact analysis. I showed that example before of the person in the retail company changing the product field and bringing down the systems. This could be the same thing: I'm going to change the name of brand, so what else would be affected? You can look at that before you make the change. Or: I have a measure on a BI report, so what other reports use this measure? A lot of that kind of downstream or upstream analysis. Another, which should be near and dear to Monique's heart, is semantic layers. This is common in something like a data modeling suite, but not only there; it could be between your glossary and the physical definition. I have a concept called client, but maybe in the logical model that's called customer, and then on the different databases it's CUST, or customer, or C table 1962, or whatever. It's about getting that lineage from the business semantics down to the technical implementation. So you have a lens through which to look at your metadata. Again, some of the tools have all of these and some have none of them, so think of your use case as you're looking at a tool. I think eventually you do need a tool for metadata. I could draw it on sticky notes, but that would be kind of limited, so I am talking a bit about tools. Then graph relationships, right? In some ways the graph is the metadata. Metadata was in the news for a while. One of my favorite quotes came from a friend of mine in Australia who sent me a headline, I think from a Sydney daily newspaper. It said: Prime Minister storms out of meeting for not being invited to metadata talks.
I was like, man, and I didn't save the headline either, which is just killing me. What they were talking about, if you remember back, was cell phone metadata, and people still use that, right? How do I look at patterns of cell phone usage, or fraud detection, and things like that? In a lot of ways that is metadata. It's not always the metadata we're thinking of in databases, but it truly is metadata, and there's a lot of value in it. So there are a lot of different ways you can use those pattern relationships. Monique talked about a lot of the change in the industry, and I agree. AI is used in a lot of things. I think AI, or machine learning, we misuse those terms a lot, I know, is great for automating some of the things that back in the day, or still, people have to do manually. For example, I'm in the US and I want to see where my social security number appears across all the different fields. Do I literally have to do a manual mapping that this field, SSN, links to that field? Can I do it by name? Maybe. Or a lot of these machine learning approaches can look at the data and say, if I see three numbers, two numbers, four numbers, that's probably an SSN. I can do some pattern matching or fuzzy matching, and that can be really great; it saves you a lot of time on manual mapping. But again, like everything, there's a place for both methods. There are things AI can't know. You might just say, we call our HR department the XYZ department, and no, there is no pattern for that; that's something you defined, right? So in some cases you want to be able to have your own mapping rules as well as the pattern matching approach. Again, either one used in the wrong scenario can be bad: you don't want to overmatch on things that may not make sense, and you want to have some control over that. I've heard vendors say that to me as well.
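The combination described above, pattern matching on the data plus explicit mapping rules you define yourself, can be sketched in a few lines of Python. This is a simple regular-expression illustration, not machine learning and not any vendor's method. The dashed three-two-four digit format, the 0.9 threshold, the `MANUAL_RULES` table, and the column names are all assumptions made for the example.

```python
import re

# Three digits, two digits, four digits, separated by dashes (assumed format).
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

# Hypothetical explicit rules: things no pattern could discover from the data.
MANUAL_RULES = {"xyz_dept": "HR department"}


def classify_column(name, sample_values, threshold=0.9):
    """Return a classification for a column, or None if nothing matches.

    Explicit rules win over pattern matching; the threshold limits
    overmatching when only a few sampled values happen to fit the pattern.
    """
    rule = MANUAL_RULES.get(name.lower())
    if rule is not None:
        return rule
    values = [v for v in sample_values if v]  # ignore empty samples
    if values:
        hits = sum(1 for v in values if SSN_PATTERN.match(v))
        if hits / len(values) >= threshold:
            return "US social security number (candidate)"
    return None


if __name__ == "__main__":
    print(classify_column("fld_17", ["123-45-6789", "987-65-4321"]))
    print(classify_column("XYZ_DEPT", ["Payroll", "Benefits"]))
    print(classify_column("phone", ["555-1234", "555-9876"]))
```

The result is labeled a candidate on purpose: the pattern only suggests a classification, and a data steward would still confirm or reject it.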
"Oh, you don't need to know, we'll just do it with the magic," and that should always make you nervous. Sometimes magic is good; sometimes you do want to be more prescriptive. So again, I won't go into all of these in detail, since I do want to give us some time for questions. I can see in the chat and Q&A that there are a bunch. But you've probably heard me talk about this; it's in the name of my company, Global Data Strategy. We do a lot of data strategies, but think of a metadata strategy when you're doing your implementation as well. It's almost the cousin of a data strategy. What are the business goals, and how does the metadata strategy align with them? I've seen companies go wrong here; it's so tempting with these tools. I'll walk in with a client and they'll have scanned everything in, and the metric they're using is, we've got 4,000 systems. Great, what are you using it for, and who's using it? I don't know, but we have 4,000 systems. So I would say, and you've probably heard me say it before: what are those quick wins with metadata? Is it a glossary, or is it lineage, and what systems do you do lineage on? Then the second column: what are you storing? What systems do I need to scan in, or manually put in through a glossary? How am I going to store it, and then how am I going to publish it out, and who's the audience? You can just imagine, some of these lineage diagrams can be really complicated, probably not what you want to show the CEO. If they ask how you calculate total sales, that might be more of a glossary term. So think of the different audiences and what display suits them. And don't forget how you are going to govern that over time: who are the roles, defining governance, what needs to be vetted and monitored, what's more of that Wikipedia approach, and how do you manage quality and progress over time.
So hopefully that was helpful. I know we went a mile a minute, but there's lots to cover with metadata. So again, just in summary: metadata is data in context, the who, what, where, and why of your data. You need some sort of orchestration or data governance, and metadata not only supports data governance, but you need to govern metadata, here I go in that circle, to make metadata successful. There are lots of different options out there today, and as with everything, the more options, the more confusion. I didn't cover everything in depth, but hopefully I gave you enough things to think about, and questions to ask, if you're starting a metadata project or looking at a tool. Because you do need metadata, big or small, if you're going to have a successful wider data strategy. As we open it up for questions, just a reminder: do join us next month, when we have our guest Nigel Turner and we talk about data quality. I quoted a few times from this white paper, which is available on DATAVERSITY as well as the Global Data Strategy website as a free download. And we do this for a living if you need help; that's my blatant marketing plug. Now, without further ado, I will open it up to Shannon for questions and comments. Shannon, over to you.
Donna, thank you so much for another fantastic presentation. Just to answer the most commonly asked question, a reminder: I will send a follow-up email to all registrants by end of day Monday with links to the slides and the recording, along with anything else requested. So, diving in here. A question coming in: will metadata automation enable identification of enterprise business activity context as the overall enterprise business architecture?
Business activity is interesting. I think usage patterns can be a bit of metadata. I do think there is an overlap between.
And again, what one person calls metadata someone else calls something else. There are some of the security tools, or tools that can look at data passing across the network, to answer something like: where is credit card information being used? You can actually see that, and Monique mentioned that as well; data in motion can be a really interesting usage. So you might have the static metadata that says which fields are interesting to track, and then see the usage. One way might be these network scanners that can say people are sending emails about... and I'm not sure exactly the context of the question. Some people have basic metadata metrics: this is a BI report, this is how many people are using the report, and these are the fields that are used the most. Those usage patterns of data are another kind of metadata, because that's the context around it. So those are a couple of examples that might be helpful.
Okay, and do you push for class word inclusions in business titles?
Yeah, I think so, especially with some of these data modeling tools or naming patterns: have the class word, and at least have a pattern to it. How do you name things? Is it sales revenue or revenue sales? Do you put that class word or keyword ahead, or do you put it after? And gosh, some of those can be metadata. Just think, when you're organizing something, how are we patterning the combinations of words? If we could all agree on that, even that makes a lot of things easier. I would say that's more in data standards, but yeah, the rules around how you build your naming components could be metadata as well. It's a little bit of a stretch, but I think it's very valuable; having those naming patterns goes a long way.
And I think we have time for a couple more questions, maybe.
What are some of the unique considerations that medical research organizations should know about their metadata?
Oh, that's a good one. We actually seem to be getting a lot more interest from those kinds of companies lately. Privacy is a big one, right: knowing how these fields are tagged and knowing the context. If you're doing a clinical trial, it almost reminds me of some of the open data sets: data in one context has a very different meaning in another context. Some context might be more static, you know, this is a patient, 42 years old, male. And then privacy is huge, even if you don't use the person's name, when there's only one 42-year-old male in this town or something. That's not a great example, but you know where I'm headed. Then there's also the cross-organization exchange, and I'm seeing some good progress in that; sharing data across medical research is another big area. So those are a couple of things to keep in mind, hopefully.
I would add to that too. From a metadata management solution perspective, that's just a huge area of value. Donna talked a little bit about AI earlier, and one need we definitely are seeing from the medical community is to quickly and efficiently tag data classifications that are sensitive, such as HIPAA or just PII in general, for a bunch of different protections. The metadata management solutions on the market that help data stewards and data governance teams do this should be working down the path of using AI and other means to at least discover, within the data landscape, where the fields of technical metadata are that would apply, and then to show, via knowledge graphs, data lineage, and other discovery aids, where these apply.
They should then help the data stewards use AI and other tools to efficiently and quickly tag those fields, based on their judgment of whether or not each is a good fit, and apply all of the policies and rules that should guide the usage of the data.
I love it. Well, that brings us right to the top of the hour. Thank you both so much for this great presentation, and thanks to erwin for sponsoring today's webinar in the series and helping to make these webinars happen. Thanks, everybody, for all the great engagement. Again, I will send a follow-up email by end of day Monday for this webinar with links to the slides and the recording. Thanks, everybody, I hope you all have a great day. Thanks, all. Thanks, Donna. Thank you.