Hello and welcome. My name is Mark Horseman and I am a data evangelist with Dataversity. We would like to thank you for joining today's Dataversity webinar, The Importance of Metadata: Three Leveraging Strategies. It is the latest installment in the monthly series called Data Ed Online with Dr. Peter Aiken. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A section. If you would like to chat with us or each other, we certainly encourage you to do so. And just to note, the Zoom chat defaults to send to just the panelists, but you may absolutely switch that to network with everyone. To open the Q&A or the chat panel, you may find those icons in the bottom middle of your screen. To answer the most commonly asked question, as always, we will send a follow-up email to all registrants within two business days containing links to the slides. And yes, we are recording and will likewise send a link to the recording of this session, as well as any additional information requested throughout the webinar. Now let me introduce to you our speaker for today, Dr. Peter Aiken. Peter Aiken, PhD, is an acknowledged data management authority and associate professor at Virginia Commonwealth University, president of DAMA International, and associate director of the MIT International Society of Chief Data Officers. For more than 35 years, Peter has learned from working with hundreds of data management practices in 30 countries, including some of the world's most important. Among his 12 books are many firsts. Starting before Google, before data was big, and before data science, Peter founded several organizations that have helped more than 200 organizations leverage data, with specific savings that have been measured at more than 1.5 billion US dollars. 
His latest is Anything Awesome. And with that, let me turn everything over to Peter to get today's webinar started. Hello and welcome. Hi, Mark. It's such a pleasure to talk with you and everybody else out there as well. As you know well, metadata is one of my favorite subjects to talk on. And I just wanted to show you the breadth of the community that we had here. These were the folks that were on the last session if I took out the two big pieces. So we had quite a slice of the planet here that's listening and participating, and this is great. Then of course, we've got the two big giants that pop in there as well. So anyway, thanks so much for listening everybody and supporting and telling other folks that this is a useful place to go. And hopefully not everybody has to learn the same lessons over and over and over again. So, the importance of metadata: let's start out by talking about metadata in the context of data management. Then instead of three leveraging strategies, I'm going to give you four. The first one is that metadata is a gerund, not to be treated as a noun. The second one is to enforce metadata as the language of data governance, for precision. The third one is to treat glossaries and repositories as capabilities, as opposed to technologies. And the fourth one is to build your metadata not from scratch, but from a series of building blocks that already exist and can help you in immeasurable ways. We'll finish up around here before we bring Mark back in to converse with you all about what sort of questions and answers you want to get into on this. So we'll talk about specific benefits and applications around this as well, and a couple of quick takeaways. So let's dive in. And if you don't recognize this as a card catalog, this was a particularly important one called the Mundaneum. It was an institution whose founders thought they were going to gather all of the world's knowledge and put it in there. 
So if you will, these were the first two people who were trying to become Google, perhaps, or something along those lines. And it's really where a lot of this thinking process started. Notice the date on here is 1913. So not a terribly long time on here. Wonderful set of stories I could tell you about all of this, but let's go a little further forward and talk about the word itself metadata. When we've had language evolution in the past, we tended to start out by saying, I guess we're going to put the words metadata together and we'll put a hyphen in between them. But over the passage of time, the hyphen is lost and we definitely need to lose it now. So the word is now metadata, all one word. And that's just interesting on the side. Somebody had copyrighted that term many, many moons ago, but it has not been enforced and unlikely to ever be enforced around that. So we're just going to talk about one word metadata on that. Data management is not also, I think, well understood outside of our immediate circle. And that is some people have a perception of us that we're librarians and we're just going to keep working our way through things. Or if we label everything correctly, we'll get it all right. And of course, neither of those are accurate. This was an actual Microsoft commercial that I'll just show you here. I'm not sure what Microsoft was trying to tell us, but I don't think they clearly understood what data management was either. And there's a reason for that and a very good reason for that. It's the story of the blind persons encountering the elephant from five different perspectives. Again, something that looks like one aspect and another, you can see all the various different parts. But in data, it's very much like that. If you come in through a different aspect of it, it's also going to look different to you. And if you don't know the totality of it, it's very difficult to understand some of the other aspects. 
Mainly from a, there's not a lot of, there are a lot of tools that have been developed that you can make use of. You don't have to keep reinventing the wheel over and over and over again. Let's talk about a definition of data management. We used to call it everything that happened between when data was acquired from the source until data is used. The problem is that doesn't give us anything visual or anything that we can really understand. And also it leaves off one of the more important aspects of this, which is that data management is about use, but preparing it for reuse. And that's really the key to it. If we don't get the reuse out of it, it's probably not exactly where we want to be in terms of optimization and things like that. So I've been working for a number of years with the folks and trying to come up with something that's basically two parts of data management. And that's about as much as most people can handle. There's a preparation and there's the exploitation of it. And you can see I've divided the left-hand side up into some specialized skills. But most importantly, I'm leaving lots and lots of space in this model for our formal reuse of data, because we have to actually focus on that. It doesn't happen just by chance. Of course, data is 80% preparation and exploitation and analysis 20%. And so people need to realize that. One of the things that I asked them is that when they show me a million dollars worth of expenditure of exploitation, I say, and where are we going to get the four million dollars that it's going to take us to actually come up with a proper one million dollars worth of value around all of this. All of this that we're talking about here has to be governed within an ethical framework of use and other types of things here. Again, moving towards the definition here, the prefix meta is rather useful here. And the American English dictionary gives us beyond transcending, more comprehensive. Good. I like all of that. 
Now, I'm going to explain a couple things out there to some of you are perhaps younger. One, a library was a place that you physically traveled to in order to access reference materials and do other things that you now do on your phones. But that's the way it was. And in order to organize that, this was the next step after that first card catalog that I showed you. You went to these card catalogs and looked up a subject. And it would show you what references were available to it on those three by five cards. And there were some wonderfully talented people who kept all of this organized for us. And then those cards would point us to a physical address of the library where it was. So we could use this function by saying, what books are in the library and where are they located by subject area, by author, by title, by again, different ways of looking at each of these things. And this would allow us to find the books on the shelf. Again, probably seems incomprehensible to those of you that were born with the internet. But that was the way we had to do it. And these practices were good and well refined. The whole concept of library science is critically important. And from that, you're probably okay if I say, webinars over, we're just going to call it data about data end of story. And of course, you know me that wouldn't let you off that easily. And try to say no. This is a good way to think about it. It's probably a definition that's a little bit fraught with some some challenges around there. And we'll get to those. So first thing is look in your organization for a place where metadata is managed already and show that this is not something new, that this is something that is already done. Somebody in your network department is tracing users and access points and trying to find out what devices are allowed to permit, where are they allowed to permit, what sort of responsibility somebody has for managing all of this in a 30 person networking group. 
Three people are doing this, generally more or less full time, or at least designated to it. That's the rule of thumb that I found around this. So here we have an example of the organization already doing this stuff. In order to do a network and to run it securely, they need to know, are you allowed on or are you allowed off? Are you employing whitelisting or blacklisting? And where are you allowed to get on? By the way, BYOD has been so much fun for everybody doing that. Let's take another concept around this, which is the concept of leverage. Now, leverage is an engineering concept, as you can see on the screen. And the idea is that we have the ability, with a smaller amount of force, to move something larger than us. So the 10 kilogram piece moves the 100 kilogram piece if it's got the right type of technology in order to do this. You add more weight to that end and keep everything else the same. You now can start to predict what's actually going to happen around that and determine how much better it could get. Or if you're silly, the 100 kilogram ball could roll down and crush the 10 kilogram ball. That would be the end of it. In order to obtain data leverage, you have to look at the same types of concept. You have your organizational data on one end of this. And we have some technology that we're employing in this. The first piece of technology is the lever itself. I could take just the lever itself and try to move the organizational data. And that would be better than trying to put my shoulder to it. But it would not be as efficient as if I employ the second half of the technology correctly, and employ the fulcrum and the lever together. Now I can start to move things in order to do this. What's happening at the moment in our organizations is that all of our good knowledge workers are doing all of this lifting and they're all learning it individually. It's such a shame, such a waste of human capital. It is just mind boggling. 
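To put rough numbers on the lever picture, here is a minimal sketch. It assumes an idealized, frictionless lever that balances when mass times distance from the fulcrum is equal on both sides. The function name `effort_arm_needed` and the 1 metre load arm are illustrative assumptions and do not come from the slide.

```python
def effort_arm_needed(effort_kg: float, load_kg: float, load_arm_m: float) -> float:
    """Distance from the fulcrum at which effort_kg balances load_kg.

    Ideal lever: effort_kg * effort_arm == load_kg * load_arm_m.
    """
    if effort_kg <= 0 or load_kg <= 0 or load_arm_m <= 0:
        raise ValueError("masses and arm length must be positive")
    return load_kg * load_arm_m / effort_kg


# The 10 kg piece balancing the 100 kg piece that sits 1 m from the fulcrum.
heavy_load_arm = effort_arm_needed(effort_kg=10, load_kg=100, load_arm_m=1.0)

# Same effort against a load one fifth the size: a much shorter arm will do.
light_load_arm = effort_arm_needed(effort_kg=10, load_kg=20, load_arm_m=1.0)

print(heavy_load_arm, light_load_arm)
```

In the analogy, the load is the organizational data and the arm is the technology and skill you have to bring. Shrinking the load is one way to make the same effort go further.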
Similarly, we don't have good processes around this, understanding that all of these things take people, process, and technology in order to do this. So we have a little bit of each of these. We now need a strategy that says, while we can't fix all of it at once, we can start to address it in bits and pieces. And by doing so, we can reduce the size of the data ROT. ROT is data that is redundant, obsolete, or trivial. And if we can reduce that size, the leverage becomes even greater. Metadata is the primary means of applying leverage to data. I already gave away the next slide on here, which is the idea of separating, in all systems, the wheat from the chaff. We would like the good stuff and not so much of the bad stuff, and it tends to fall into the 80-20 category there as well. So one question people will ask you is, is well organized data worth more? Because that's what you're attempting to do here with metadata. And to point them to the correct answer, you can show them a pre-information age example that comes to us courtesy of the woman who I'm quoting here. Imagine if somebody just handed you a bunch of pages of a book and had not page numbered them or put the table of contents in or made the index in alphabetical order. Abby Covert has done a wonderful job with this book. Give her a plug on this. Again, just imagine if the pages were there, but somebody went to the trouble of removing the spine from those pages: how ephemeral that knowledge is, how much we spend in our organizations relearning that knowledge over and over and over again. Well, it's even worse than that. 80% of data is ROT. That is data that is redundant, obsolete or trivial. The question of course is which data do we eliminate? 
And the truth is that most enterprise data is never analyzed. But you do need metadata to do valid identification of data assets, to focus organizational attention on repairing common errors between different parts of the organization, and to permit value to be ascribed at the necessarily correct granular level. Who is of course most qualified to do this? Well, it is the people in your organization that know the data best. Data leverage is a multi-use concept. It permits organizations to manage their data within the organization and with their exchange partners in support of the organizational mission. This leverage is enabled by metadata and by a focus on the non-ROT components of it, but it also requires technology and human skill sets. The bigger the organization, the bigger the leverage potential that exists. And treating these data assets as more asset-like will simultaneously reduce IT costs (20 to 40% of all IT costs are tied up in unnecessary data movement) and increase knowledge worker productivity. We'll get into that in just a little bit. So, metadata gives us some very specific answers about our data assets, and we'll do this little summary a couple more times in this session. Do we have these or not? We can answer definitively yes or no. What was the quality? Well, perhaps we have them, but they're not suitable. What will it cost to fix them? 35 cents apiece, and we have how many billion of them? Again, whatever the questions are. And can these easily be provided with more granularity? In this case, we know the answer is not easily at this point. Metadata has always been a part of the Data Management Body of Knowledge. Quick little advert for it if you haven't seen it. You will see, first of all, we spelled it incorrectly, but more importantly, we've codified these around a series of practice areas, and clearly metadata is one of them. Another way of envisioning the same thing is what we call the Q-card for the DMBOK2. 
So this was done by Chris Bradley in order to pull this together. Wonderful ways of articulating it. We're not going to go through each of them. They're great frameworks for looking at it. Instead, I'm going to give you a more practical application here. The app used to be called iTunes and now it's called Music on your iOS devices. There's an equivalent for Android. Somebody may want to put that in the chat just so that people can see it. And I'm going to, again, take you a little bit back. In the old days, when you could put a CD, a compact disc, into a computer, it could do a couple of things for you. And this is what it would show you originally. It would show you that there were 25 tracks. And it would also show you the length of each track. This was the metadata that was stored on the CD. It wasn't terribly useful. So the music app is trying to read the metadata, 25 tracks, and determine that each track is of a certain length. And that is the totality of it. Now, what was interesting about this was that when you inserted the disc, it would go in and say, I'm going to go off and connect you to something called Gracenote, which is a database that takes a fingerprint of each of the CDs and sends back the metadata for each of them. What a wonderful service it provided us for years and years. Information such as the CD name, the name of the artist, the track names, the genre, and the artwork all came back in a couple of seconds, which seemed miraculous in those days. But I hope you're seeing the use of metadata here: by finding a specific ID on this CD, it was able to come back, identify the CD, and say, this is the information that you were likely to look for. And it sure would have been a pain to type all this in. I'm pretty sure nobody would have done it in those days. So let's take this little example just a bit further. Suppose I want to make a playlist now containing Miles Davis. 
And so I could look at this and say, I'm going to organize my music library, creating a smart playlist. So anything with the word Miles Davis in it will fall into this playlist. Probably a couple of you are ahead of us already to say that, wow, I was surprised at the results. I didn't get the ones I thought. There are many more than 25. There are 32 songs, in fact, in this playlist. What was it? Oh, I forgot. I had another Miles Davis recording in there. So, you know, kind of neat to discover something that's on there, probably unexpected. But now I've got to go back and make this a little bit more of a specific playlist. So the artist contains Miles Davis, and the album is the complete birth of the cool, if I'm trying to zero in on that. Now I can get a specific Miles Davis, and I can organize these, literally in my case, thousands of songs, however you would like them to have. Similarly, though, if we look at this interface here, turns out that what Apple did and many other organizations, again, I'm just showing you the Apple example, because it's easy for me here, it takes the same information that you already know, the knowledge of the interface, how the data works with the program, the processing of that, the data structures that you're using, and applies it to podcasts, to movies, to books, these things go on PDF files. It's just a large way of gaining this leverage in your environment. So one core set of metadata puts out a lot of different examples around that. Again, do I have these specific Miles Davis recordings? I can answer yes or no with the metadata on this. What is my most, excuse me, my most frequently played song that gives me a very specific answer? Are there costs to acquire more of this class of data assets? Apple in this case charges you $1.29 each. Can I listen to the entire album before dinner? Probably not, because again, I can look at data to see what's actually happening in these areas. 
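The smart playlist walkthrough above can be sketched in a few lines of Python. This is only an illustration of filtering a library by its metadata, not how the Music app is implemented. The `Track` fields, the `smart_playlist` helper, and every duration and play count below are invented placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Track:
    """Metadata about one recording; the audio itself is not modeled."""
    title: str
    artist: str
    album: str
    seconds: int
    play_count: int = 0


def smart_playlist(library: List[Track], artist_contains: str,
                   album: Optional[str] = None) -> List[Track]:
    """Return tracks whose artist contains the text and, optionally, match the album."""
    needle = artist_contains.lower()
    return [
        t for t in library
        if needle in t.artist.lower() and (album is None or t.album == album)
    ]


# Placeholder library: durations and play counts are made up for illustration.
library = [
    Track("Move", "Miles Davis", "The Complete Birth of the Cool", 153, play_count=12),
    Track("Jeru", "Miles Davis", "The Complete Birth of the Cool", 193, play_count=7),
    Track("So What", "Miles Davis", "Kind of Blue", 562, play_count=30),
    Track("Take Five", "The Dave Brubeck Quartet", "Time Out", 324, play_count=21),
]

# The broad rule picks up the other Miles Davis album too.
broad = smart_playlist(library, "miles davis")

# Adding the album rule narrows it to the intended recording.
narrow = smart_playlist(library, "miles davis", album="The Complete Birth of the Cool")

# "What is my most frequently played song?"
most_played = max(library, key=lambda t: t.play_count)

# "Can I listen to the entire album before dinner?"
album_minutes = sum(t.seconds for t in narrow) / 60

print(len(broad), len(narrow), most_played.title, round(album_minutes, 1))
```

Every question here is answered from the metadata alone. Nothing has to open or play a track, which is the point of the example.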
Let's get to our strategies now that I've talked about metadata in context. Again, we had to define data management first, then use metadata as this opportunity to leverage our data. And I gave you a very specific example that you can now use to go train anybody on your team, or better still, have your team train others in this. The first strategy is that metadata is a gerund, not to be treated as a noun. When people learn what metadata is, they tend to look around and start pointing to things and saying, is this metadata? That is the wrong way to think about it. When others do not understand what you do, you are perceived as a cost. This was some good advice I received at one point. But if others do understand what you do, then you can be perceived as a value. And this is even more difficult because metadata exists on this esoteric plane that is just not comfortable for most people. They're not facile with the language here. So let's start off with architectures. All organizations have architectures in order to become organizations. The question is not whether they have them, but whether what they have is useful and whether it's documented. Those are two important criteria. And if we want to get these ideas across, we need to share them between the business concepts, the actual technical pieces of the system, and the systems themselves. So the business, the technical users, and the systems: if we don't have all three of these in synchronization, we have a lot of confusion that occurs. And this creation of a common vocabulary, the metadata that we're talking about, allows us to take those assets and move them in support of the system. There are all sorts of names for these things. Glossaries, and I've put a bunch of them on the left there. The key is you're starting to use a controlled vocabulary that others may share. And that's really important. We could do an entire lecture on just that particular piece. 
But what you're going for is a trusted catalog as you're moving your way towards this. Interestingly, again, at the top, you heard that I worked a little bit of work for Walmart at one point, and they had a wonderful way of articulating this. They thought of metadata as surrounding their spark that they have on their logo. And hey, to you, Brad, if you're still out there at Walmart, had some wonderful years that they were doing, but looking at the combination of who, what, why, when, where, and how, as making up that, and that metadata for them was any combination of any circle and the data in the center that unlocks that value. And this was a very useful definition for them. And since it spun on their corporate logo, it made it very accessible to them, ingestible as well. Think of your inbox from a metadata perspective. You use metadata to manage your inbox. You've got subjects, priorities, if you see things that are, you know, important that are not usually yell at the people for doing it, for freaking you out. You know, where's it coming from? When is it received? And you use this metadata to filter out the unimportant stuff to get rid of the junk and make sure that you return the messages to your boss. Many of you maintain outlook rules that you'd like to bring with you from organization to organization in order to do this. Very, very good example of using metadata in use. And imagine how much more difficult email would be without that type of context. There's a lot of good definitions here. My colleague David Hay has one in his book on here, but I want to bring you down to a Gartner one in particular. Metadata unlocks the value of data and therefore requires management attention. That's, they listed that in 2011, and it hasn't really gone away since there. So the metadata management is the process of making sure that management is paying attention and using the leverage value of it. And they're in the metadata. 
But the key is that metadata isn't really a noun, as I mentioned at the top. One person's data could be another's metadata out there. And we'll dive into that in a little bit. It's really more of a verb because it represents the use of an existing fact rather than a data type itself. And this is really key because if you are constantly looking around and trying to decide whether something is metadata or not, you spend a lot more time on it. Whereas the question should actually be, from an operative perspective: is this metadata worth including in our metadata practices? And that gives us a much better attack on the use of this. A word derived from a verb but functioning as a noun is called a gerund. And again, this is not a lecture on English as far as that goes. But it describes the use of the data, not the type of the data. And that's the important differentiator. You're managing that data from a somewhat higher level of abstraction. In this case, I'll give you a real quick example of it. This was a paper that we put out many moons ago that showed that even reverse engineering a new system was useful from a metadata perspective. And the story was very, very easy. Early on, we were getting three new PeopleSoft modules, modules of an ERP. The components were called develop workforce, administer workforce, and compensate employees. And you can see they comprise the vast majority of the next series of steps. And each of those decomposed into one or more series of steps. In this example, I've decomposed the one called administer workforce into its rudimentary pieces. And you can see that recruit workforce is larger than manage competencies, which is larger than manage successions, et cetera, et cetera. All I'm measuring here is the number of individual instances under each, which was one measure of complexity, but it could be done very quickly. 
And more importantly, gave people a sense of confidence that when they looked at the administer workforce module, they would be spending most of their time looking at recruiting workforce. And if they got through recruiting workforce, the rest of the learning was downhill from there in order to understand that. So again, a very good use of metadata in there just to quickly highlight some of the things. There's again a one hour talk that will go on about that drone on and on. Here's a piece we used to do back in the old days before we knew it was illegal to post our students names out there without their information on it. And what I'm showing you here is a university that I was at and a system that provided back for this system. And there's the database from which this information was drawn in here. And gave you a very nice picture in this case. Here was the database. Just again, coincidentally, it was reverse engineered by one of our existing classes to obtain the metadata of this because the university was going through a replacement in this case. And you can see here, if I just give you a little bit of information, that the main difference between that entity in the upper left hand corner that I've circled and labeled parent and all the rest of it is that this parent had lots of children. And I've now given you the structure. There's a one to many relationship between S, D, D, M, the student database master and each of the children that are out there. And please don't try to read these. You're not supposed to. You can, but it's just very, very difficult to hurt your eyes. And my only reason again for teaching you this is because here is a really nicely laid out articulate data structure. Very easy to understand. Every entry in a student database master file may have multiples attached to it, although some of them have just exactly one to one or zero or one in order to give a little bit more refinement to the process. 
But the relationships are quite well known, quite well articulated. And this is a true story. We had a vendor propose a replacement system for us. And we asked them, would they send us a model of the replacement system? And they sent us literally this piece of paper here. And the fascinating piece was they never could tell us was it a object model? Was it a data model? Was it a process model? Nobody could read anything on it here. And this was describing the replacement system. Well, you can see this is an insane proposition. There's not a chance that anything built at this level of complexity could be compared to the original piece and does not give a good value in terms of what you're trying to accomplish from that perspective here. So just describing the replacement system from a metadata perspective allowed us to discount it entirely to say that we're not going to use this. Thank you. We walked away from this particular one. Turns out there was a good reason for that. Catch me at a conference and I'll tell you the proper story on that. IBM even went at one point to something called the application development cycle information model. And it's a wonderful set of all the metadata that would exist if you had nothing better to do than do metadata around the world for the rest of your life. These are good places, by the way, to start. Many of this, much of this exists online. But again, the question here is not, is this metadata? It's, would we obtain value if we included this in the scope of our metadata practices? And that's the key question for you. Let's go on to the next strategy then. Enforce metadata to be the language of data governance. Because so much occurs in data governance where people are misinterpreting and take so much time to get there. It's one of the frustrating aspects, the most frustrating aspects of all of this. Hold tight, guys. We're going to define three key terms, three important concepts here. 
First of all, I'm going to give you the number 42. If you recall the Hitchhiker's Guide to the Galaxy, you might remember that 42 is the meaning of life, the universe, and everything. And for those of you that haven't read the Hitchhiker's Guide to the Galaxy, you might be saying, I think Peter's lost his marbles. Let's take another instance of 42. It was Jackie Robinson's jersey number and the name of the biopic about him. So I've given you a different meaning for the number 42. And I'm giving you a third meaning for the number 42. The question: is Peter allowed to drink adult beverages at the proper time and place? The answer would be, yes, in the state of Virginia. His current age is the number 42, plus in this case, 22 years. So yes, Peter is allowed to drink alcoholic beverages in the state. What I've done there is given you three different facts that all happen to be labeled 42. Each of those facts has been paired with a different meaning, giving you, respectively, the meaning of life, Jackie Robinson's jersey number, or whether Peter is allowed to consume adult beverages. This is what makes data: the combination of a fact and a meaning. And I've given you three different pieces of data over there on the right that are all called 42. That's a challenge for us. Now, we also don't want to just look at data. We want to look at useful data. We have to be able to distinguish between data that is super useful to us and data that is less useful to us, the ROT that I described before. Moving to the next level up in the hierarchy, information is when you take useful data and pair it with a request for information. That is how we move from data to information. Unfortunately, too many people overthink this problem, but it is really quite well summed up by saying that you can have data without information, but you can't have information without data. 
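The fact-plus-meaning idea can be written down as a tiny data structure. This is a sketch under stated assumptions, not a standard model. The class names `Datum` and `Information`, the `answer` helper, and its keyword matching are hypothetical stand-ins for whatever a real glossary or catalog would do.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Datum:
    """Data = a fact paired with a meaning."""
    fact: object
    meaning: str


@dataclass(frozen=True)
class Information:
    """Information = useful data paired with a request."""
    request: str
    data: Tuple[Datum, ...]

    def __post_init__(self) -> None:
        # You can have data without information, but not the reverse.
        if not self.data:
            raise ValueError("information requires at least one datum")


# Three facts that all happen to be labeled 42, each with its own meaning.
facts_called_42 = [
    Datum(42, "meaning of life, the universe, and everything"),
    Datum(42, "Jackie Robinson's jersey number"),
    Datum(42, "Peter's age in years, before adding 22"),
]


def answer(request: str, data: Sequence[Datum], keyword: str) -> Optional[Information]:
    """Pair a request with the data whose meaning mentions the keyword, if any."""
    matches = tuple(d for d in data if keyword in d.meaning)
    return Information(request, matches) if matches else None


info = answer("What number did Jackie Robinson wear?", facts_called_42, "jersey")
nothing = answer("What is Peter's shoe size?", facts_called_42, "shoe size")

print(info.data[0].fact if info else None, nothing)
```

The bare value 42 cannot answer either request. Only the recorded meaning, which is the metadata, says which of the three facts applies.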
And if you just remember it that way, you'll realize that the cost of managing these two concepts separately is much greater than the cost of managing them together. The next final level that we get to is intelligence. It's also being used as the wisdom or knowledge level. Believe it or not, I've been doing this for 35 years and I've changed that level out because that's what happens with the times people use different pieces, but they mean what do I do with information to actually be of use to the organization and how do I apply data to strategy? And that is where we look at strategic use. We look at what information we have and which of it has been made strategic and we find this is how we define intelligence out there. So again, I've given you a data structure here of three different concepts, data, information and intelligence. A very objective way to define, to describe them apart and to identify them as well. This is built on definitions we've used in this community since 1983, around us. Too many people spent too much time doing this. This model is a most basic form of metadata. If you adopt this, you can build the rest of your organization on top of this. This is a great place to start off by educating those of your organization that are working in the data governance area. Because the only purpose of data governance is to support data strategy. And the only purpose of data strategy is to support your organization strategy. So from a perspective, what do we do perspective? Data is used to support the organizational strategy. The data strategy is how data is better used to support organizational strategy. Data governance then is saying, what can the data assets do to better support strategy? We'd like to make those changes so that we know how well the data strategy is working. And we have at our disposal some data stewards that can allow us to play in this. These are mainly our resources that we have. 
So I always say that the data strategy should be expressed in terms of business goals, but also in terms of metadata. And both of those concepts need to go into our communication to the data stewards so that they understand what's happening here, what we're trying to do. We want this data to be used to support this key objective, this key variable, whatever it is we're trying to get to. But remember, this is a talk on metadata. So the key here is to make sure that all of the participants in this operation are speaking the same common language, the same metadata. If they aren't, you're going to have confusion as a result.
So just to sum up this quick section: valuable information about your data governance assets and processes comes in the form of metadata and must be maintained as metadata. First of all, do we have a shared understanding of our goals? That's probably a good place to start. Are we and IT focused on similar goals? You'd be amazed how many times I've walked into organizations and found misalignment there. How effective are we being now? Well, if it's something that's costing us two cents apiece, it doesn't sound like it's going to cost us a lot. That's a much better answer, a much more precise answer, than "not a lot." So again, keep that in mind as well. And what kind of metadata do we find most valuable? Right now, well, supply chain metadata happens to be at the top of the list. So again, you can see here that metadata has to be the language of data governance. Otherwise, things get off track right away. It's the single variable that, when you change it, can focus the efforts in data governance in a way that they've had trouble with before.
The last, sorry, this isn't the last one. It's now the third one. Repositories and glossaries should be treated as technology, excuse me, treated as capabilities, not technologies. Let me give you sort of the clinical definition.
You ask a technology, give me a definition of a bed, and it hands you back something labeled "definition." Well, that is, in fact, a definition: a bed is a very simple noun, a piece of furniture used as a place to sleep or relax or recover, whatever it is that we need to do. So it's a definition. But I want to urge you to go beyond these sorts of clinical definitions. And remember, this is about data descriptions, which are really what data models are. All data models are incomplete without definitions. I've been shown many, many data models where people have a beautiful picture. But we need to see those definitions as well in order to understand it. Otherwise, it's not possible to do. Second, purpose statements are generally better than definitions. And thirdly, all of what we just talked about is metadata.
So let's dive in. Here's a purpose statement example. Now, this came from a Veterans Administration hospital system that they were working on back in the 90s, but it's still a valid example. They had an entity called a bed. It was the principal data entity of the structure of a room within the substructure of a facility location, blah, blah, blah. The purpose statement, instead of defining it, asks, why is the organization maintaining information about this business concept? And the answer that comes up: it turns out we were also finding out that this hospital bed was going to get a tracker on it of some sort. And this tracker was going to be used, in addition to the existing attributes, to find patients, because patients have a tendency to become lost in hospitals. I'm sure about half of you say, yes, we know. And the other half of you are waiting to find out about this. Think about this for just a quick second, though. This sounded like a pretty great idea.
First of all, we get a partial list of these attributes: the description of the bed, the status, whether it had a gender that was going to be associated with it or not, the reason for the reservation. There are probably a lot of other attributes that you may need to have in order to do that. But if we're going to be using the bed to track the patient, we have to have questions that can be answered, particularly, what room would the hallway outside of my room be? And that apparently hadn't occurred to the architects of this plan. The other question that got them stumped was, what room is the elevator? And believe it or not, hallways and elevators are the two main places that patients get lost in hospitals. That's not, by the way, very good metadata about hospital structures. I'm sure it'll scare you away from wanting to go into a hospital.
Let's get back to our example of metadata here. All right. So I've given you a bunch of metadata at the top here. Now we've got some associations, a very crude drawing, but it says a room can have zero or many beds within it. And the status of all data models, from a metadata perspective, should be draft until they are affirmatively validated. You should write "draft" on your data models, because if you don't, it will be problematic. Again, with tracking patients by beds, it turned out, thanks to that purpose statement, that it was actually a really good idea not to do this particular piece. They'd also kind of forgotten the bed ID that needed to go into that particular process.
Let me tell you another story, about a wonderful company, Nokia. There's a wonderful book, Transforming Nokia, written by one of their previous chairmen, that talks about their transformation from a tire and rubber company into consumer electronics and mobile phones, which is the time I met them. And just to get an idea of Finnish culture, which is just wonderful, 2% of the population of Finland speaks Swedish.
So all Finns are bilingual; in case they run into that 2% of the population, they can say hey to them. They wanted to play on the international stage. They moved their headquarters to New York, all business was conducted in English, and there were a lot of words that were not well known. And it was just generally culturally bad to ask questions. So they had to overcome that, and they made it good instead to build a common vocabulary. They used good Nokia culture to do this. When an unfamiliar word was used in the middle of a meeting, they all accessed the NTB to see if there existed a golden definition. It was really wonderful. You could literally see the muscle memory: they would reach for their laptop, go straight to the NTB, and say, hey, what does this word mean? If there was no definition, there was a quick vote in the work group to see whether they should submit it for the next version of the NTB. And there was a group that met weekly that would review submissions. And literally all they did with this was push it back out onto the web as a single web page with definitions. It was such a wonderful set of technologies. It worked great for them as a common vocabulary, and it was published weekly. Again, it was the Nokia Term Bank, a really wonderful set of metadata activities.
A final real challenge that you have in trying to figure out how to use and how to interact with metadata is that you start out by going to buy a repository from somebody, or buy a glossary from somebody, or make sure that your next product X includes a glossary. The problem is that your level of perspective on this is relatively low, whereas the vendor's is very high, and this gap that you have, this technology gap, is very, very great. So I urge organizations to do it themselves, particularly in the first couple of years. Wait on that seven-figure investment. This stuff is not rocket science.
This is all you need in order to develop a metamodel that you can create and use around various aspects of developing data models. I'm not going to walk you through this. This is a chapter of one of my books. If you want a copy, just ping me and I'll be glad to get it for you. But let me show you what you can do when you build your own glossary out of something like, oh my God, Microsoft Access. Yes, we did in fact build one of these for a customer. We were looking at a table called FT_T_ACCT, and you can see it popped up that column name on the right-hand side. That's kind of useful just to be able to take a look at. We could click on the buttons (entity, domains, return) and get some different definitions in there, and I'll walk you through those just so that you can see what happens. So again, a very low-tech repository.
Here again is a different table, FT_T_ABDF. All right. The column details tell us that the column name is actually ACTG_BAS_ID. And again, it's a character data type with length four, which tells us a little bit about it. If I'm interested in seeing whether this column shows up as a primary key or as a foreign key, I can click on those two buttons, and I have for this example here, on the primary key. Here is the same entity. Okay, so is this column used as a primary key? It's also used as a primary key in ABDF, TEDF, and Victor Charlie and Bueno Peter. Okay, my phonetic alphabet isn't very quick anymore. Anyway, that's the primary key. What about foreign keys? Again, very easy to do. And finally, where else does it show up?
Now my point here is that this activity basically involved creating a general-purpose database of databases, right? Database characteristics. And again, I've shown you that there's a definition out there for it. But more importantly, it was really just reading in the DDL of the existing databases.
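To give a feel for how little machinery a "database of databases" needs, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not the Access tool from the talk: the DDL text, the extra column names, and the function names build_catalog and where_used are hypothetical, and only the table and column names FT_T_ACCT, FT_T_ABDF, and ACTG_BAS_ID echo the example above.

```python
import sqlite3

# Hypothetical DDL standing in for a software package's install scripts.
DDL = """
CREATE TABLE FT_T_ACCT (
    ACTG_BAS_ID CHAR(4) PRIMARY KEY,
    ACCT_NME    VARCHAR(40)
);
CREATE TABLE FT_T_ABDF (
    ABDF_ID     INTEGER PRIMARY KEY,
    ACTG_BAS_ID CHAR(4) REFERENCES FT_T_ACCT (ACTG_BAS_ID)
);
"""


def build_catalog(ddl):
    """Load DDL into a scratch database and harvest column-level metadata."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    catalog = []
    for table in tables:
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fk_cols = {row[3] for row in
                   conn.execute(f"PRAGMA foreign_key_list({table})")}
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, name, col_type, _notnull, _default, pk in conn.execute(
                f"PRAGMA table_info({table})"):
            catalog.append({
                "table": table,
                "column": name,
                "type": col_type,
                "is_primary_key": pk > 0,
                "is_foreign_key": name in fk_cols,
            })
    conn.close()
    return catalog


def where_used(catalog, column):
    """Answer the question: where else does this column show up?"""
    return [(c["table"], c["is_primary_key"], c["is_foreign_key"])
            for c in catalog if c["column"] == column]


catalog = build_catalog(DDL)
for usage in where_used(catalog, "ACTG_BAS_ID"):
    print(usage)
```

The design choice here mirrors the talk's point: the metadata is harvested from DDL that already exists rather than typed in by hand, so the primary key, foreign key, and where-used buttons become simple queries over the harvested catalog.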
So all of this came from an existing commercial off-the-shelf software product. And we took this from the install scripts in order to do this. So it's a very, very useful way of gaining a lot of metadata. And that's what it's going to come down to. Either your metadata is going to be very easy to grab, like this, or it's going to require you to go through a formal reverse engineering process. Don't worry, there's a book out there on that as well. We are not going to cover that today. That would really be putting you guys to sleep.
So do we have this specific class of data assets? I can look through an existing repository. Is this data item used elsewhere? No. In fact, I can say that definitively, as opposed to "I don't know." What cost did we incur acquiring this? 35 cents apiece. Okay, again, that's got to be put in context; on its own it's really kind of useless. And can these data assets be shared securely? Well, I've got a typo on my slide there. And the answer is, of course, not easily. Again, the idea is not that you have to go out and build yourself a repository to do this, but you can track your own basic metadata needs. And go back to this very large gap that I see between organizational customers and vendors. It's so much better for you all to start by doing this yourself, as opposed to trying to start out with the vendors. You'll save yourself two years' worth of rent on the tool, and you'll be two years smarter when you're talking to the vendors, so you're much more likely to get the product that you're actually looking for to help with your organizational metadata management problems.
All right, let's get on with this. So: repositories as capabilities, not as technologies. Finally, building metadata from building blocks. This stuff has been done. Just as anybody who does anything in research will tell you, there is nothing new under the sun. There is a starting point for your project. Yes, we're going to have to pay attention to architecture.
And remember, architecture is about understanding things. In our case those are going to be data structures: the function of those things, what does thing one do, what does thing two do, and how do those things interact? Well, in this case, it's a silly interaction, but nevertheless, something that is relevant to our question here. How are components expressed in this way? Well, details are organized into larger components, and that gives us some intricacies. So when I grab the master customer record and want to do something with it in an organizational context, there are all kinds of implications, because there is data governance around how a customer master record should be utilized at this particular organization. Again, details are organized into larger components, the larger components are organized into models, and this is where dependencies start to be introduced into the process. I can't sell you something until you've actually registered as a customer in our system, just to take a very simple example. And the models are eventually organized into architectures, which should be organized for a purpose. Architectures are explicitly created for purposefulness, and therefore metadata has a purposefulness as well. Attributes handle the intricacies; entities, which map pretty close to straight on with objects, introduce dependencies; and models make up architectures.
Sorry, I would normally go back. The last question at the bottom of the previous page was, why don't we have a lot of good models of data architectures? Well, this may answer why. This is a very big data architecture, and all of this is metadata. This is a description of a hospital's health information systems, including their electronic medical record. You can see it's just very, very large, which is why you have to start managing this both from a complexity perspective and from a representational perspective.
Again, we call these wall charts, and we would never ask for one to be printed out and put up anywhere. We always do it once just to show it off, but after that you stop using it that way. You instead query the database that you have that keeps track of this: the glossary, the enterprise repository, whatever it is you're going to call it. And of course, all of this is metadata. That's just the very bottom line of this. So learning how to do the metadata is a good place to start.
Metadata helps us with specific design patterns, and I have a wonderful little graphic here of Perth's 12 tallest buildings. I kept this as sort of a souvenir of my last meeting with Clive before he died. But Perth's tall buildings have the loo, the restroom, the washroom, in one particular place on each floor of each building. Why would they not instead run a separate set of pipes to everybody's exact custom loo? Well, the answer is it's a lot cheaper to do it this way, and it also depends on gravity. So we're going to be careful with it. Well, the same thing, of course, is true of electrical wiring, HVAC, floor plans, all the various things. In fact, in New York City, they are redoing streets because of the planning that occurs within that particular concept. Again, just having more people out dining in the streets has changed certain traffic patterns, both pedestrian and vehicular. My point in digressing here is that these patterns also mean that the 110th floor can be the same as the 109th floor, the same as the 108th floor. Now we're talking economies of scale. Whoops, I'm so sorry, I did that backwards.
My point here is that all of this works off of the same patterning, and we should do exactly the same for our metadata, just as every household follows the same basic blueprint of needing to have a water heater at one end of the house and maybe an extension to a sort of medium-far-off place. But we generally don't run pipes all over the place just for the sake of running pipes, because it's expensive and dangerous. Again, remember, it depends on gravity. We want to make sure that we have that metadata well sewn up into what we're attempting to pull together.
There is a wonderful series of purchasable metadata models, and I don't mean this as anybody trying to get money out of you. If you spent 150 bucks and bought Len Silverston's Data Model Resource Book, and I'm not suggesting that you do, it would be money well spent. The main reason is that if you can find physical copies of it, each one of them had in the back a CD, yes, back where we started off with iTunes, a CD in this case holding the actual universal data modeling patterns. It's a wonderful, wonderful resource. I urge you to go out to eBay and grab your own copy of it if you think you might need to use it. Again, I'll just give you a very specific example: a healthcare pharmacy's cash box reconciliation. There's a data model for that. Goodness gracious. Len knew which one of the three volumes to point me to. Again, in my own book, XML and Data Management, there's a good section on metadata and metadata quality. All of us were inspired by Adrienne Tannenbaum's original book, Metadata Solutions, and then David Hay was the first person to actually put together data modeling patterns as conventions of thought. A wonderful book; I just urge you to read it under any set of circumstances.
In the XML book, I give you a generalizable model for maintaining metadata, which is that underlying model we were talking about before, where you have the metadata from any software system so we can put it all together. I mentioned before IBM's AD/Cycle system. Like I said, there are a lot of artifacts of this project left over out there and they are well worth looking into. They did a very good job of this. There are some academic research papers that probably aren't going to do much other than put you to sleep, but there are actually some really good models tucked within there. Lots and lots of good documentation. I'm giving you a reference or two at the bottom. There are a couple more in the final section.
Let's not forget our friend semi-structured data. I do want to differentiate here, because everybody comes in and says, I can take unstructured data, and they never finish the sentence, but the implication is they can make it into structured metadata. There are all sorts of existing examples of this out there in the world. Again, whether you have an XML schema in your existing software product or whether you're doing basic research, I want to start with the structure of a document, which, as you might imagine, the librarians thankfully solved for us 100 years ago with something called the Dublin Core. There are all sorts of resources out there to get started, but most people like to start because they go, wow, I'm going to convert unstructured data into semi-structured data. Now, it doesn't work that way. If something's truly unstructured, you might as well nail Jell-O to the wall. The best you can hope for is that you can take semi-structured data and make it more structured. I don't know that I would ever say structured, but probably a better description of this activity is integrating non-tabular data with tabular data. That covers all sorts of other things.
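As a rough picture of what "making semi-structured data a little more structured" can look like, the sketch below wraps a piece of free text in a record whose descriptive keys are restricted to the 15 elements of the Dublin Core Metadata Element Set. The function name describe, the record layout, and the sample values are assumptions made for illustration; they are not from the talk or from any particular Dublin Core tool.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def describe(document_text, **elements):
    """Attach Dublin Core style metadata to a piece of free text.

    Hypothetical helper: rejects any key that is not a Dublin Core element,
    so the descriptive part of the record stays a controlled vocabulary.
    """
    unknown = set(elements) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return {"metadata": dict(elements), "body": document_text}


# Sample values are illustrative only.
record = describe(
    "First of all, I'm going to give you the number 42. ...",
    title="The Importance of Metadata",
    publisher="Dataversity",
    type="Text",
    language="en",
)
print(sorted(record["metadata"]))
```

Note that the body stays exactly as unstructured as it was; only the description around it gains structure, which is the distinction the talk is drawing.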
There are going to be some really interesting developments coming out in the next couple of years that are going to enable us to integrate to a much better level than we've been able to before. Right now, you can go and pick this webinar up, right-click a couple of times, and get the transcript of everything I've babbled about for this entire hour. Hopefully, you'll be smarter about metadata in the process, but don't get fooled by this. It's just not something where you can make magic. But can you take data that is partially structured and make it a little bit more obvious? Yes, you can and you should. That's a very different process from making unstructured data into structured data.
The metadata patterns really give us a set of valuable starting foundations. Again, do we really have to create a pharmacy billing system from scratch? The answer is no. Will the proposed software that somebody is suggesting for us fit? Well, that's a challenging question. It is now considered best practice to ask the vendor of that software package, whether it's magically in the cloud or not, to give you a valid data model of it, a physical data model, or maybe even a logical one. If they know the difference between the two, continue talking to them, because you can determine whether this is going to fit or whether just moving everything into the cloud is going to be the disaster that it has unfortunately been for a number of organizations. These industry best practices do exist. People now know they can't get away with it. Has anybody published a model for implementing GDPR yet? In this instance, not just yet. Well, there's a fair amount that goes into all of that.
We're getting to the top of the hour. I'll do a little bit of wind-up here, of summarization, but you all think of a couple of stump-the-chump questions that Mark and I can dive into and see what we can come up with. Let's take it for just a little bit.
This is a wonderful group, the Electronic Frontier Foundation, and they put out a wonderful little set of things back when Barack Obama said, it's just metadata. Well, if it's just metadata, they know you rang a phone sex service at 2:24 a.m. and spoke for 18 minutes, but they don't know what you talked about. Or that you called a suicide hotline, or an HIV testing service, or a gynecologist. Metadata matters, and EFF has done a great job over the years of helping us with all of this. And thanks to John Perry Barlow for helping co-found the organization and for having somebody at least looking out for us from a metadata perspective. I really don't ever let anybody say it's just metadata.
In fact, it's a valuable business proposition. This was a company I was briefly affiliated with, but my friend Warren did a little bit more work for them. He was the Director of Security, and he turned me on to their business model, which I'm going to show you as a little bit of an animation. And remember, these are all recorded, and when you go back on YouTube, they all seem very clear. I've had a lot of blurring, so if it's your network connection, I apologize, but it should come out fine when you watch the recording. Envera's business proposition was basically this: you were company A, and Envera was trying to sell to you. In order for you to do business, you had to talk to company B and company C and company D and suppliers A and B. And then you had to get product data. You had to connect with the bank and the freight company, and manage supply delivery services and other sorts of things. And this is the way they articulated their business value, in a cute little animation. So again, company B: you talk to them, you go back and forth. That's your supply chain. There's third-party stuff. All these various different types of documents have to go back and forth in order to do that.
And they go back and forth, of course, via phone, fax, email, or electronic data interchange, but it's still an awful lot of things. And that's just company B. We know it's the same thing with company C. We know it's the same thing with company D, et cetera, et cetera, et cetera. It's a lot of complexity. Again, a lot of friction in our operations that we can reduce in order to make it clearer. And people just love this particular articulation of it. They were going to replace all that with a company that was going to handle all that metadata, Envera. And unfortunately, Envera got sold to somebody else who didn't understand what they got. And it now sits in a basement. I'm told it's in San Francisco somewhere. But what a waste of connections, because all they had to do was connect up one type of thing. And it was 100% metadata-based. That was Warren's actual email to me: hey, Peter, want to see a company that's 100% metadata? Come take a look at this. We had a lot of fun with that. Another little bit of articulation around Envera was that they understood the value of this metadata, at least until they were sold to somebody who didn't understand it. So these metadata skills and value and patterns were reusable across many different domains. They were able to use them in a number of different ways.
Now I'm going to let you read the FEPA, or OPEN Government Data Act. I'm just kidding. Don't watch the left-hand side of your screen. Instead, pay attention to the fact that the general thrust of this was that all federal data is now open by default. As opposed to some federal data being opened by the agency, now it's all open. That's a sea change for the federal agencies, and they are responding extraordinarily well. They are required to appoint non-political chief data officers. They're required to use open data and open models in policy evaluation.
And if you mess up on this stuff, the penalties are higher than HIPAA. And in almost any meeting I'm in, everybody goes, that's bad. Well, what it mandates is best practices: using a glossary or a term bank or a dictionary, and controlled vocabularies. There's even a section of it where the federal government tells three agencies to work as one integrated data management agency, because they interchange so much data that it's really a bad idea to manage those things separately. It's a very good piece of legislation, something that I believe is pushing federal data management, both spending and practices, to levels that we haven't seen in a number of years. Very, very interesting.
So we're getting close to the end here, folks. Let's just do a quick review. I defined metadata in the context of data management by defining data management as including both create-and-munch types of activities and exploit types of activities, and said that there's a 20/80 split between the two, sorry, 80/20, I should say. Using data as metadata is really leverage around that. But remember, again, the question is, should we include this potential metadata as metadata in our solution because it has value, not because it's metadata. And I gave you a very specific teachable example using the music app on your iOS device. By the way, it works the same on the phones as well as the computers as well as the tablets.
Again, strategies, four of them, not three. Strategy number one: metadata is a gerund; do not treat it as a noun. Two: enforce metadata as the language of data governance; you'll be amazed at the increase in focus that it brings. Three: treat glossaries and repositories as capabilities, not technologies. Four: build from building blocks; there are many, many, many resources available to you in order to do that. And we're just about at the top of the hour.
Again, we've got some values in here: reduces training costs, assists spend analysis, bridges the gap between IT and users. All of these, just copy them and use them in your justifications. So: data about data unlocks the value of data. It's less about what and more about how. It's the language of data governance. It defines the essence of correctly specifying most organizational challenges. And ask, should we include it? Lots of examples. It is three o'clock and time to turn it back over to Mark. Mark, are you still there?
You betcha. That was a fantastic presentation, Peter. Thank you so much. We do have some wonderful questions in the Q&A that we can get to. We'll start with this one. The problem is that metadata is so immensely wide a concept, covering everything from the semantics of defining the core business to attribute-level bits and pieces, running over enterprise architecture and so on. How do you recommend avoiding scope creep in a metadata development initiative?
Excellent question. Really good question, because the questioner is right. And by the way, questioner, if you do have a reference for that, drop it in the chat for us and we'll make sure everybody gets it, because there is a next level down. I've given you several examples. Let me just go back here. There are a couple of good books you can see there, but I know that you're looking specifically for an answer to a very specific question. I think it would be of interest to everybody else. So by all means, if you've got that as a reference, drop it in. We can meet up maybe at the Enterprise Data World conference or something and have an exchange about it, but a really good question. So scope creep is what tends to happen in most exercises. You train an executive on what metadata is. The executive starts to run around and see potential metadata leverage in a number of different data items, and starts pointing to things and saying, is that metadata? Put it in the repository.
Is that metadata? Put it in the repository. Is that metadata? Put it in the repository. Well, the question is, what value has been accomplished? First of all, you will never complete the inventory of all of your data. You can just take that away as a guarantee. Peter says nobody's ever completed one in the history of the world. All right. Actually, I did have one group come up who claimed they had, and we found out pretty quickly they were wrong, but they were nevertheless close, which was interesting. Most groups are not nearly that excited. And there's no reason to be. If 80% of your data is ROT, then that should not be showing up as part of your metadata. There is no point in using it. So that's one way of avoiding scope creep. There are lots of other ways of looking out for it. But here is the question it has to come down to, and this is why I think it's a good way to think about this: if I spend the dollars and cents, the money required to manage this metadata, regardless of what it is, will that help the organization?
Now, let me give you a very specific example. Walmart, where, again, I said I spent a bit of time. They had a strategy that was "every day low price." Just four simple words: every day low price. And when an associate looked at doing something at Walmart, they looked up, because it was written all over the place, and said, okay, is this going to help with every day low price? Or is this going to hurt with every day low price? And they made their decision based on that strategic guidance. Similarly, if you've got a value proposition, if you're saying that managing metadata should be better for us, then this should contribute some measurable quality towards that data. And again, two places to start. Have you done the 80/20 slice of ROT or not? If you haven't done that slice yet, you shouldn't go proceeding any further. And then once you've gotten down to that 20% that's good stuff, let's see what value that provides.
It doesn't mean that just because it's in the 20% we should keep it, but if it provides value to the organization in leveraging its data, absolutely keep it in there. Great question. Thanks for sending it in.
Excellent. We've got another excellent question here. What are some concrete ways you've integrated metadata slash data management in a big company with a fast-growing data ecosystem?
I'm afraid I'm going to have to confess that this company had an evolving ecosystem rather than a fast-growing one. But I want to tell you about it anyway, because I think it's more realistic. If you're really growing rapidly, you need to think a bit. Now, growing, of course, means a lot of different things, just like 42 means a bunch of different things. So I could be completely misinterpreting the question. If you're growing in volume but the data structures are staying the same, then you have a capacity problem, and you can typically handle that with engineering. But the question of concrete examples of moving into operations is a really excellent one. I'm going to go back to the example that I started to show. This is the one screenshot of it, but it lets you all extrapolate a little bit. So this is an implementation of a big organization-wide ERP. Not surprisingly, most people are completely unfamiliar with the ERP, and so they're trying to figure out what sources they can go to to get this information. Remember what I added up here? I added up the number of things in each of these subcategories. So it's like reading a table of contents. Okay, now these were actually submodules and procedures and things, and again, they were not necessarily representative, but I think most HR professionals, when you ask them where they spend most of their time, will say recruiting, and then managing, and then sort of everything else. It's not necessarily that this complemented or corresponded to that, but it was concrete use.
So interestingly on this product, excuse me, project, we actually had a product workhouse where the group was set up literally right next to the project manager, and the project manager kept everything right next door because he was the most frequent user of the metadata that we had for all of this workforce stuff. And he would come over and ask for specific analyses to be run. The security professional came over and said, let me take a look at what you've got, and said, well, it looks to me like when I'm developing training plans, these are going to give me some guidance into what I should look into around training. The training administrator came over to look at this. The people who were doing the data conversion came over and looked at this. We just had an office devoted to maintaining metadata. And this office was very easily seen because there would be people out the door, waiting to have questions answered about this. That was a wonderful example. We did write it up a little bit. Again, just to give you that reference there. All right, Mark, next question. Thanks for that. Excellent. Actually, I got a question DM'd to me that I was kind of hoping you would answer. What did you quote as a rule of thumb for how many people to manage how much data? Good question. Sorry if I went through that quickly, but I'm glad you caught it. If you go to a large organization that has a 30-person or greater networking group, there's 30 people employed. As you might imagine, VCU has a lot of people employed. Our networking group, I believe, is about 30 people, and they manage almost 50,000 users.
So just to give you some examples around that, you will find in that networking group probably three people spending at least part-time and one spending more or less full-time on managing the locations of the network where it's permitted to get on, who's allowed to get on, who's allowed to get off, what types of rules, and other aspects of things. This is pure and simple metadata management. And your organization already knows how to do it. So go to these folks, highlight what they're doing, get them to come out and give some people some pointers and things. Look at the technology that they've adopted and then make your first implementation part of building on what they're doing. You'd be amazed at the synergies that exist with them there. Again, great question. I'm glad somebody was able to do that. I don't even know what a DM is. I know it means direct message, Mark, but beyond that, I'm not sure how I would do it. Awesome. What are some of the advantages of understanding and using metadata in data analytics, especially household surveys or insurance data analytics? Fantastic question. And I did skip through that part relatively quickly. One of the reasons is that you all get these slides, so you'll be able to go back and review them, use them, and make them applicable. And hopefully improve them too. I've had a lot of people suggest things. I've given credit on at least two of these slides here to other people who created them and came up with really, really good suggestions. So the question is, when you're looking at this from an analytics perspective, there's a lot of things that can go forward. I hit the button. There we go. I did go through these very, very quickly. Let me just pop some back up. When somebody's showing up as an analytics resource in an organization, they've been hired as a data scientist of one form or another. It usually takes them three full years to become useful to the organization.
And that's a shame. And the reason is because they also have to go through the process of figuring out what the organization does, how data relates to what it does, and how they are supposed to use data to improve things. They don't come in knowing that. What we teach them in schools is different: we give them perfect data sets, and they don't learn that the world isn't that neat. They just run the data through, run the cluster algorithms on it, and it works great. So the key to using this properly is to spend time putting your data in order. There's a newish field called data products. It can help out with these areas. If your organization has already adopted it, that can be helpful. The key is when these folks come on board, you want to say, hey, welcome to what's going on. I'm going to put us in a liquor store context. Here's where we store the liquor. Here's where we keep the inventory levels. Here's where we keep the master product levels. Here's where we keep the distribution information. Here's where we get the consumer information. Here's the calendar so you can see what sorts of holidays are coming up. Then you can get started and do the things that you need to do. But if you just drop a very smart individual with an advanced degree in data algorithms and analytics into an organization without that type of structure to help them figure things out, they are at the bottom of the pyramid that I showed before, which is the data, information, and wisdom pyramid. Again, I'm talking slowly because I'm trying to wind my way back to the top of it. A little trick: if you know what slide number it is, you can actually go directly to it by typing in those numbers. But I can't read those numbers. They're too small. So where is it? Hang on, Mark. Give me just a quick second. I did not mean to make such a big deal about it. I've got to actually do it. One of those things, you do some of these things a week to figure out.
I must have used that slide. We'll just do a search. This slide is the slide I'm talking about. Again, if you just drop them in there and tell them they're supposed to figure out what information to use strategically for the organization, and you haven't figured out what facts and meaning comprise the useful data that becomes information for your organization, you're spending money while they're sitting around sucking their thumbs. There's just not a whole lot going on in that type of a context. That's the most important thing that you can do. After that, you can start with good old business classes. Take these individuals and show them how strategic planning works. Show them these data governance exercises that you're running the groups through and what they're trying to figure out. Let them see how the organization actually makes its money, because so many of them are abstracted from the process. A situation that I've run into on multiple occasions is a data scientist who is doing a wonderful job of things. They've tweaked an algorithm for a couple of years and they've reported out that they've gotten, let's just say, 72 percent. Turns out if they'd actually reported out earlier, they would have made money earlier: $5 million a year for three years. It's just crazy in some cases to see this. Those would be the two areas I would get them started on for sure. Again, thanks, Lars. Thanks for the great questions. Yeah. We just have the one question left before we call it a day. This could be another slide search for you, as we had several questions and you did have a slide up for this. Which books do you recommend that would help people address the metadata model? It depends on what you're trying to do. Go ahead, please. Yeah, you had a bunch of ISBN numbers up too, which as an avid reader always excites me. But you did have a slide of book covers as well. So our good friend, Mr.
Hoberman, gets mad at us if we don't pop up Peter's book slide and get those up for a second so that it can get through. No, a lot of those are not necessarily about metadata. So these four are good places to start from a patterns perspective. Adrienne Tannenbaum did really the first book on metadata repositories that was out there. Then there's David Hay's book, and again, Len Silverston's data model book, excuse me, resource book. He's got three volumes of it. And so you want to figure out which one of the volumes you need, or grab all three of them if you want to go that way. It's worth it to get the set, particularly if you're actually doing the work, as many of us are, and those models turn out to be extraordinarily handy. Yeah. And personally, I've enjoyed both your book, the purple one up at the top there, and Len Silverston's book. I've read both of those and enjoyed them both thoroughly. Absolutely. Len's a great friend. All right. Well, thank you, Peter. We've got them all done. Thank you, Peter, for this great presentation and the Q&A. That's all we have time for today. Just to remind everyone, we will be posting the recorded webinar and slides to the dataversity.net website within two business days. And we will send out a follow-up email to let you know the links and other requested information. Thank you again, everyone, for attending today's webinar. And I hope you have a wonderful day. Thank you very much. Appreciate it. We'll see all of you at Enterprise Data World in September. Please come and see us at Enterprise Data World in September. Yep.