Hello and welcome. My name is Shannon Kemp, and I'm the Chief Digital Manager at DataVersity. I would like to thank you for joining this DataVersity webinar, Metadata Management for the Governance Minded, sponsored today by Octopai. Just a couple of points to get us started: due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or, if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DataVersity. If you'd like to chat with us or with each other, we certainly encourage you to do so; just click the chat icon in the bottom middle of your screen for that feature. And if you'd like to continue the conversation after the webinar, you may do so at community.dataversity.net. As always, we will send a follow-up email within two business days containing links to the slides, the recording of this session, and any additional information requested throughout the webinar.

Now, let me introduce our speakers for today, Gal Alon and Bob Seiner. Gal is a senior director of business development at Octopai. She has over 10 years of leadership experience in technology companies. Before joining Octopai, she led sales efforts at companies like Illusive Networks and Team8, and also served as an intelligence officer in the Israeli signal intelligence Unit 8200. Gal holds a BSc in Physics from the Technion, Israel's Institute of Technology. Many of you know Bob from his monthly Real-World Data Governance series. Bob is the president and principal of KIK Consulting and Educational Services and the publisher of The Data Administration Newsletter, TDAN.com. Bob was awarded the DAMA Professional Achievement Award for significant and demonstrable contributions to the data management industry. Bob specializes in non-invasive data governance, data stewardship, and metadata management solutions. And with that, let me turn the webinar over to Bob and Gal to get today's webinar started.

Hello and welcome. Hi, Shannon. Hi, Gal. Hi, everybody. Really happy to have you all with us today. This is a special webinar on a subject that's very near and dear to me. One of the things that I don't often talk about during the webinars Shannon mentioned is my background in metadata management. Metadata management was extremely critical in my upbringing in the data management field, and I'm going to share a quick story with you about how I got started in metadata management, how that led me into data governance, and why metadata is one of those core backbone components of a successful data governance program. The name of the webinar is Metadata Management for the Governance Minded, but we've inserted the word "automation," because that's a really important facet of the conversation we're going to have today. And one of the things you'll find when Octopai shares some of the components of their tool is how important automation is to the success of data governance, especially if you're attempting to stay non-invasive in your approach to governance.
Being automated in the way we collect and analyze our metadata becomes extremely important, both to get value out of our metadata tool and to get value out of data governance: getting people to become formally accountable for how they manage the definition, the production, and the use of data. I often use the expression that the metadata will not govern itself. But the fact is there are a lot of things we can have happen with the metadata in an automated fashion that will help us become more and more successful with our governance initiative.

So, the topics we're going to talk about today. The first one is kind of a no-brainer for a lot of us who focus on data governance: data governance is very dependent on the quality of the metadata we provide in our tools to help people understand the data, understand the quality of the data, where it came from, how to protect it, all of those things we're trying to achieve with our data governance program. Then we're going to talk about how the automation of the collection and management of metadata is such an important component of a successful metadata program supporting our data governance program. Then we'll spend a little time on what to look for in a metadata management tool, how it will help us be successful with our governance initiative, and how a great tool will lead to increased use of your metadata assets within the organization. And last, after a quick demo of the Octopai tool, we'll talk about the people who will benefit from having improved metadata automation and delivery within the organization.

So the first subject is data governance's dependency on quality metadata. I learned that way back when I got started in the data management industry. In fact, I was a metadata repository administrator before I was a data governance administrator for the organization I worked with, one of the larger health insurance companies in the country. One of the things I found was that in order to successfully govern our data, we needed to have information about the data. The definition a lot of people use for metadata is that it's data about the data. It's very difficult to manage anything in your organization if you don't have quality information about that thing. You can't manage people, facilities, or your data assets if you don't have good information about them. And certainly, if you're planning on following the approach I talk about all the time, the non-invasive data governance approach, then automating the process of not only retrieving and collecting the metadata, but analyzing it and putting it into a format that's digestible and usable to the organization, is key. I found firsthand, when I was working back at the health insurance company, that building the interfaces to port metadata into our tool was not enough. We needed to make certain that it was happening automatically. If we're going to be non-invasive, and we're not going to give people a lot of additional tasks over what they're already doing, we're certainly going to want to collect metadata into our metadata repository tool in as automated a fashion as we can.
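To make that idea concrete, here is a minimal sketch of what automated metadata harvesting can look like, using SQLite's built-in catalog as a stand-in for a warehouse catalog. The table and the scheduling comment are illustrative assumptions, not any particular vendor's implementation.

```python
import sqlite3

# Stand-in source system: every relational database already keeps
# metadata about itself in its catalog (here, SQLite's sqlite_master).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, full_name TEXT, email TEXT)"
)

def harvest_catalog(con):
    """Pull table/column metadata out of the database's own catalog."""
    repository = []
    tables = con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        for cid, name, col_type, notnull, default, pk in con.execute(
            f"PRAGMA table_info({table})"
        ):
            repository.append(
                {"table": table, "column": name, "type": col_type, "is_pk": bool(pk)}
            )
    return repository

# A scheduler (cron, Airflow, etc.) could run this on a cadence so the
# repository stays current without anyone kicking off a job by hand.
for entry in harvest_catalog(con):
    print(entry)
```

The point of the sketch is simply that the metadata already exists in the source system's catalog; collection can be a scheduled, hands-off process rather than a manual task.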
And anytime it requires manual tasks from people, kicking off jobs and things like that to port metadata into your tool, it requires formal accountability: somebody is going to be responsible for those things. We need to recognize that if we're going to make this as seamless as we can for people in the organization, then applying automation to bringing the metadata in and analyzing it will mean less work for people, and they'll get more value, because the metadata will be more consistent and more complete simply because we've automated the process of getting it into the tool. So anytime you have a manual task, it requires some type of formal accountability, somebody who has the responsibility for doing it.

Metadata improves everything when it comes to your data. If we're going to improve understanding of data, we certainly need to provide definitions. If we're going to provide lineage for data, we need to be able to demonstrate where the data came from. We need to be able to collect things like who the owners of the data are, or who the domain stewards or subject matter experts are. I always talk about how the return on investment from governance and metadata really comes from the return on investment from the other tools in our environment, the other places where we're making investments. So if you want people to use the data, and you want them to improve the value they're getting from it, the focus needs to be on providing that quality metadata to people.

And as I mentioned, and I say it often, and I certainly state it in the Non-Invasive Data Governance course that I provide through DataVersity: the metadata is not going to govern itself. In fact, at a DataVersity conference recently, I wrote that one line on a whiteboard at the front of the room and kept pointing back to it. If we're going to include metadata as an important component of improving the value and use of the data, we need to recognize that the metadata won't govern itself; we need to think about what it takes to get the metadata into a tool that makes it useful to people. It's going to require some intervention, but if we can automate that process, it will certainly decrease the amount of manual effort that goes into managing the metadata for our environment. And if you've attended webinars I've presented in the past, or conference presentations I've given, you know I share a lot of tools and templates. The information stored in those tools and templates is metadata: it's about who does what with the data across the organization, and who gets involved when in the different aspects of governing data or operationalizing data governance. So all of that information, even the information about the people who have responsibility for the data, is an important piece of metadata we need to manage within our organization.

And now what I want to do is turn it over to Gal for a couple of minutes to talk about where metadata management fits into the overall data governance spectrum.

Yes, hi, everyone. I want to take this opportunity to say thank you to DataVersity for having us and, of course, to Bob.
I'll say that in preparation for this webinar, we had numerous conversations, and we were so excited to discover that what we do inherently sits very well with Bob's non-invasive approach. And I think that's what it's all about. Data governance is a complicated topic. It's something we need to do in the right way, and wherever we can simplify it, make it easier, and make it non-invasive, I think that's a win for all of us.

So, starting on my first topic, how to automate the metadata collection and management process: I'm here to say that metadata management, especially automated metadata management the way we see it, has a unique role in kick-starting data governance projects and programs. I've brought here the very familiar DAMA wheel that you all know; I'm sure I'm probably the ten-thousandth person to have used it in a slide. The way we used to look at this circle is how data governance applies to the different aspects of data management, of which metadata management is, of course, one. But what we've been seeing, and we have about four years now of working with self-service BI groups, helping them track their data, instantly get lineage, and instantly create a collaborative language around what they have in their environment, is that those core capabilities are actually core to a lot of other data governance projects. If you use the same metadata automation capabilities we use there, you can boost, or springboard, if you will, any data governance project and accelerate it.

With that, I want to move to our next slide and talk a little more about how, in practice, you can use this type of automation to boost your governance. Maybe I'll start with a story. Our team was at DGIQ a few weeks back; I think Bob was there as well, and I'm sure many of you too. And aside from coming back with a tan, what they came back with is a very strong notion that automation was never as needed as it is for governance. Here's why. We delivered a keynote to about 200 to 250 governance leaders in the room, and the topic was business glossaries, which we can use as a microcosm for other data governance projects. We were talking about how to implement business glossaries, and we dove into why it is so difficult to get through the initial phase of even assembling the list of all the business terms you have in your organization, across all of your reports and all of your reporting tools. When we asked the crowd why it is so complicated, and I'm sure this is very familiar, it turned out that this process is completely manual, and most organizations have anywhere between four and six different reporting tools. So sometimes, just getting to the point where you're at the core of the governance project takes enormous effort, and the core of the governance project is not to manually map all the business terms. The core is to apply logic to them, to do stewardship, to collaborate around them and decide what's right for the business. But in order to get there, you have to invest hundreds, if not thousands, of hours in mapping the existing business terms that you have. And what we asked the audience is: what if we can avoid it?
What if we can jump ahead and start our governance project at the point where everything, the existing situation, is already mapped, so we have visibility into our environment instantaneously? We believe this is absolutely achievable. In general, to achieve it you basically need to automate three different capabilities. The first is to automatically extract and centralize all the metadata from across your entire environment, regardless of the layer: the logical layer, the physical layer; if we're talking reporting, the physical, semantic, and presentation layers; if we're talking ETLs, or if we're talking databases, everything needs to be extracted. The second thing: extracting metadata by itself is great, and it's a huge improvement over what we've been able to do in the last couple of years, so if we can do it from many systems, that's amazing, but it's not enough. What we need to do is apply machine-learning algorithms to make sense of it. We need to be able to look at this metadata and analyze it so it can help us answer different questions very quickly. For example, if I want to track my data, I want to be able to see it in a lineage format. And if I want to locate a specific formula or calculation, or, for PII reasons, a specific data item, I want to be able to search for it. That analysis is critical, and this is what the new innovation is all about. And lastly, and I think this is what ties it all together, last but certainly not least, is the ability to take the same analyzed metadata and push it into different systems to address business users' needs. Sometimes I might want to see the analyzed metadata within Octopai, for example, and sometimes I want to see it within my data catalog, because my governance people need to see it there for audit purposes. So if you automate all three of these capabilities, the extraction and centralization of the metadata to give you visibility, the analysis of it to make sense of it, and the ability to inject it into different systems, that will get you a long way.

And if we move to the next slide, please. Bob is the controller of the presentation; we decided in the pre-session that he is the more responsible person for that, so that's what we went for. Anyway, coming back to what happens when you automate these: you get one source of truth for your business that is automatically updated, at a fraction of the price it would take you to get there manually, which is what we are currently investing today. And this is, I think, very straightforward.

Hey, Gal, I'm sorry to interrupt, but you just got very, very quiet again. Oh, sorry. There we go. Is that better? Yeah, much better. Thank you. Okay, sorry. It's good, that's the exciting part. Okay.

So the other part of that, beyond getting all the business terms automatically, or the lineage automatically, or the mapping automatically: I think this presents a unique opportunity for data governance to be an enabler for the business. A lot of the time, data governance is looked at as something that holds back the business; it tries to put everything in order, it tries to get everything documented, and that takes time. But if you look, for example, at self-service BI, I think this could be an opportunity, because self-service BI could be a nightmare for data governance.
You know, it's a decentralized architecture: too many pools of information, too many users, very hard to govern. But if we automate the entire process of gaining visibility into all these various sources, then suddenly we become the enablers. We become the brakes that allow the car to drive at 180 miles per hour, because it's safe; we have the governance in place automatically. And that, I think, is what we as a data governance community should aspire to. We need to be looked at as the enabler for the business, and this is a great way to do it. So with that, Bob, the stage is yours.

Okay, thank you very much. So much of what you said is so important. Thinking back to the days when I was getting started in the data management industry: after getting the metadata into the tool, what we called rationalizing the metadata, linking the metadata together, took so much of my time, and we then needed to validate and vet that with the business community. If we can speed up that process and really jump forward, not only collect the metadata but have the tool use the smarts built into it to analyze that metadata, make sense of it, and make it easily presentable and understandable to people, it becomes extremely valuable not only to the metadata repository administrator, but to the data governance administrator as well. And in this day and age, it seems as though a lot of organizations are looking to stay as lean as possible. They may not have both a data governance administrator and a metadata repository administrator; it might need to be just one person, or part of a person's time. So anywhere you can automate the metadata process becomes very important. It actually buys back significant time for the data governance administrator to be successful in getting people engaged in data governance. There are a lot of things a data governance administrator does, so if we can make the metadata process as simple for them as possible, that's certainly something we want to be doing, and something most organizations are looking to do.

So let's spend a couple of minutes talking about what we should look for in a metadata management tool and how that's gonna help us be successful with our governance initiative. The first thing is that I always break the actions people can take with data down into three categories: the definition of the data, the production of the data, and the usage of the data. Let's think about that in terms of metadata. We need somewhere in our organization, in our toolkit, to define the metadata that's most important to us, that we need to collect and automate. We need to make certain we have automated, or at least seamless, processes for producing the metadata. And certainly, pushing the metadata out to the tools where people are gonna get the most use of that information is also very important. So what we wanna make certain we're doing is providing efficient metadata analysis and navigation. That's something I consider to be kind of the secret sauce of the Octopai tool: taking the metadata that's collected into the tool and representing it in a way that makes it easy to navigate, which brings more value to your organization.
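To make the shape of that kind of pipeline concrete, here is a minimal sketch of the three capabilities described above: extract, analyze and link, and publish. The records, the name-matching heuristic, and the sink function are simplified placeholders, not Octopai's actual implementation.

```python
from collections import defaultdict

# Hypothetical, simplified records as an extractor might emit them:
# (system, object, field) triples from an ETL tool, a database, and a report.
extracted = [
    ("informatica", "load_customers",  "customer_id"),
    ("warehouse",   "dim_customer",    "customer_id"),
    ("tableau",     "customer_report", "customer_id"),
    ("warehouse",   "dim_customer",    "full_name"),
]

def rationalize(records):
    """Link metadata from different systems that refer to the same field.
    Real analysis uses far richer matching; name equality stands in for it here."""
    linked = defaultdict(list)
    for system, obj, field in records:
        linked[field].append((system, obj))
    return linked

def publish(linked, sink):
    """Push the analyzed metadata to a downstream consumer (catalog, API, UI)."""
    for field, locations in linked.items():
        sink(f"{field}: appears in {len(locations)} places -> {locations}")

publish(rationalize(extracted), print)
```

Real rationalization uses much richer signals than column-name equality, but the pipeline shape, collect, link, deliver, is the same.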
So we wanna make sure we're not only automating the collection of the metadata, but also the tying together of the metadata and its presentation to the people in the organization who are gonna use it, to really improve the return on investment you're getting from your BI space, your data warehouse, your master data, your data lakes, your analytical platforms; really, anywhere your organization is applying resources, you wanna make certain you're getting the most value out of those resources. And what is it that people are gonna look at to gain confidence in the data? They're gonna look at the metadata, and at the metadata management tool, and that's gonna help them get more value out of the information they're using to perform their job function and to make critical decisions for the organization.

Going back to the days when I was a metadata repository administrator, I had conversations with people who told me, well, we don't have any metadata in our organization. This was in the day when data warehouses were very popular, as they still are, and I would ask them: do you have a data warehouse? They'd say, certainly. I would say, is the data in your data warehouse stored in a relational database management system or some other type of tool? They would say, yes, that's where it's stored. Well, I would point out that the catalog behind the database tool is a metadata repository, specific to the metadata necessary to operate that particular product. So if you're using the database tools that are prevalent on the market, there is metadata; it's built into the catalogs you already have. The same is true of your ETL tools. Not only are your ETL tools moving the data from its native source to the target destination, but there's a lot of information inherent in the ETL tool: not only the lineage, but the calculations and derivations, what has happened to that data as it moved from its native source to the point where people use it. Certainly the same goes for the data warehouse environment, the MDM platforms, and any other tools in your environment; data modeling tools are a perfect example. All of the information being collected in those tools is metadata, and if we can automate the process of getting it and managing it effectively within a tool, it becomes very valuable to the business end user and helps them navigate through the tool. And one of the things Gal mentioned is that organizations tend to have a lot of different reporting and analysis tools. Back in the olden days it was Business Objects and Cognos; these days it's Tableau and QlikView and all these other tools on the market. The fact is that the information housed in those tools is metadata as well. If we want a catalog of what reports we have, when they're published, and who receives them, a lot of that metadata is inherent within those reporting tools, and we should do whatever we can as practitioners in the data management and metadata management space to take advantage of the metadata being collected within them. So how will a great tool lead to increased use of metadata?
Well, the first thing is that it's gonna knock down the amount of resources you need to consistently manage your tool. You're not gonna have people kicking off jobs to pull metadata into your tool. When you automate these things, you decrease the need for resources to spend time, and to formally hold the accountability, to take some action to make the metadata appear in the product. Again, we want to decrease that as much as possible: we wanna automate, as much as possible, all the processes of bringing in and making sense of the metadata we're collecting within our metadata management tool. Oftentimes automation results in more accurate metadata collection, so the tools and the bridges and the buses used to pull the metadata out of these systems and into the centralized repository become extremely important. We wanna make certain that whatever we do to access that information about the data, the information that makes the data more valuable to you, we provide whatever level of automation we can. I know from my experience that if somebody forgot to kick off a certain job, we would not have all the information we needed in the repository. And it's a really wonderful thing when you go to somebody in your organization, tell them you have the metadata about what they've been working on in the tool, and they challenge you: well, is it up to date? We just made a change in the last week; is that gonna be reflected in the repository? If you have automation in place, it doesn't require any action from anybody to get the metadata into the tool. And as I mentioned before, that secret sauce is the automated metadata analysis, which helps fit the pieces together; it's the rationalization of the metadata with the other metadata in your environment. So when you add it all up, if you have these types of capabilities in your metadata tool set, you're gonna get a lot of value out of your metadata tool, people in the organization are gonna find it very valuable to them, and it's gonna help them perform their job function and get the return out of the investments the organization is making. And with that, I'm gonna kick it back to Gal for a couple of minutes, where she's gonna talk a little bit more about Octopai's automated metadata management.

Thanks, Bob, and thank you for the interesting last few slides; what I'm about to show definitely relates to everything you said. What I wanna do now is take what we've discussed in the past 30 minutes and make it a bit more practical. To do so, I wanna show you what a typical customer environment looks like. We usually work in an environment where, on the left-hand side, we have the various source systems, the ERPs, CRMs, finance systems. Since we come from the BI world, we usually start our journey looking at the ETL tools, which for most of our customers is, of course, a multi-vendor architecture; then the data warehouse, where the databases could be Oracle or SQL Server or any other database; and all the way out to the reporting tools. And what Octopai does, coming back to what we talked about a few minutes ago, is actually automate the three processes that are critical to getting the most out of your metadata.
The first thing we do is automatically extract metadata from all the different buckets you're seeing: from the ETL tools, from the data warehouse databases, and from the reporting tools. All of that is done automatically across multiple vendors because, to be honest, visibility should not depend on the vendor; it needs to be agnostic to that. So that is the first thing, and we centralize the metadata. Then we analyze the metadata to give you the ability to ask different questions and get quick answers. One result of this analysis is a cross-platform lineage of how a field or a column moves, and what happens to it, across all the ETLs all the way to the reporting tools, and vice versa. We also give you data discovery capabilities, which are another way to look at the same metadata. And lastly, something you won't see in the demo but which I think is also important: we talked about how critical it is to be able to push this analyzed metadata into different systems. Today in the demo, we're gonna see how the analyzed metadata looks within the Octopai system, but you can take the same analyzed metadata and push it to your data catalog, or use APIs to get it inside any other governance tool you can think of. Bob, if you would please move to the next slide. Thank you.

So, one slide before we jump into the demo: what you're seeing here are the most common use cases our customers apply this type of technology to. There are data governance use cases, for example lineage for audit, something that has become very critical in the last few years, and the ability to quickly locate your data using data discovery capabilities. But there are also business intelligence operations use cases: things like finding problems or issues with data quality in a report will usually result in looking at the same lineage. And what I like about it is that the data governance use cases are a lot of the time the other side of the same coin from the business intelligence use cases. We'll see a bunch of them in this demo.

So, I think I have control over the screen now, or not yet. There we go. And if you could just make sure that you're close to that mic; you're kind of fading out again. Yeah, I'm sorry. I went over all the Q&A and the chat, and I understand that I'm not coming through so well, so I apologize for that. And I promise everyone who wants to have the same session one-on-one, I will be happy to do so. Without Bob, of course. But he'll be very happy to do that. Just shout it out, Gal. Okay, cool.

So let's move to our demo. You should all be able to see three circles on your screen, correct? Can you see that? Perfect. So this is Octopai's main screen, and what you're seeing here is actually a representation, a crunching down, of a lot of metadata we've fed into this environment from various systems. Just to give you a notion of what we have here: this is where we're gonna see the different use cases, with metadata extracted, analyzed, and then pushed into the system from all of these various sources: multiple ETL systems, multiple databases, and multiple reporting systems. And what we can see at first glance, which a lot of the time is helpful in itself, is that we have almost 410 different ETL processes.
Some of them are Informatica, some of them are SSIS, some are DataStage, and some are stored procedures that move data from one side to the other. We have almost 2,000 different tables and views in Oracle and SQL Server, and 26 different reports across five different reporting tools: Qlik, Power BI, BO, Cognos, and SSRS, the traditional ones and the newer ones. Obviously, this is a complicated environment as it is, even though it's not the biggest environment we've worked with.

I wanna start our first use case around lineage for audit. A lot of the time, mainly in the last few years, auditors request that we provide lineage so they can see how the data they're looking at was created and what transformations happened along the way. Interestingly enough, coming back to what we discussed before, this is the exact same scenario as when you're trying to solve a data quality issue with your reports, because the first thing you wanna understand is how the data got to the report. So let's choose ourselves a report called Customer Product Report. I can search Octopai for that, and I can see there are three reports that have the name "customer" in them, and I can get some information for orientation purposes: I can see that this one is the SSRS report. The first thing I can do is get that initial lineage, which takes about four seconds to generate. Within that time, what I get is the full end-to-end documentation, or lineage, of what's happening in my environment for real. We have the reporting tool on our right, and we can see immediately that it's based on this view; we have the view name and which database it resides in. This view is actually based on these three different tables, and these three tables are populated with information by four different ETL processes. And just to highlight the complexity, if we look at the types of these ETL processes: this one, for example, is Informatica, this one is an SSIS, this is Informatica as well, and this is DataStage. To get that visibility, to get that mapping, in order to address either of the two use cases we were talking about, could take hours, could take days, just because there are so many systems to look through, and sometimes I even need to rule out systems that are not relevant.

The next thing I'm able to do: let's say I wanna track and understand what's happening at the field level. This initial, overview lineage is very powerful in itself, but sometimes I wanna dive deeper. So I can deep dive into what's happening within the ETL itself. Here I'm seeing the package and the sequencing between the different dataflow tasks, which is interesting; but what you're seeing here is instant documentation, created in five seconds, at the field level. We have the source table, in blue all the different transformations the different fields go through, and in red the target table. So in a minute, I know everything that happens to each one of these fields, and I can highlight a field and track it. And by the way, Octopai, for example, generates unique URLs that you can share.
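Under the hood, cross-platform lineage of this kind can be thought of as a directed graph over metadata objects. Here is a minimal sketch, with hypothetical object names, of how one graph answers both the auditor's upstream question ("how was this report's data created?") and the downstream impact question that comes up next; this is a conceptual illustration, not Octopai's implementation.

```python
from collections import deque

# Hypothetical cross-platform lineage edges: source -> target.
edges = [
    ("oracle.src_orders",       "informatica.load_orders"),
    ("informatica.load_orders", "dwh.fact_orders"),
    ("dwh.fact_orders",         "dwh.v_customer_product"),
    ("dwh.v_customer_product",  "ssrs.customer_product_report"),
]

def adjacency(edges, reverse=False):
    """Build an adjacency list; reverse=True flips edges for upstream walks."""
    graph = {}
    for src, dst in edges:
        a, b = (dst, src) if reverse else (src, dst)
        graph.setdefault(a, []).append(b)
    return graph

def walk(start, graph):
    """Breadth-first traversal from a starting object."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        yield node
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

# Upstream from the report: the audit/data-quality question.
print(list(walk("ssrs.customer_product_report", adjacency(edges, reverse=True))))
# Downstream from an ETL: the impact-analysis question.
print(list(walk("informatica.load_orders", adjacency(edges))))
```

The same graph, walked in opposite directions, serves both use cases, which is exactly why audit lineage and impact analysis feel like two sides of the same coin.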
So those shared links become one source of knowledge for everyone, so they can track the same column and the same high-level lineage and understand together what they're seeing. In this case, let's say it's the product ID: I can see exactly how it moves within this package. And the nice thing about it is, maybe I wanna get broader visibility, I wanna keep investigating; my auditor wants to know specifically how the product ID field got into this ETL. I can use this type of automated capability to see all the packages in which this table is actually the target table rather than the source table, and I can drill down and get the same accurate visibility in a matter of seconds.

Now, that lets me look at the field level of what's happening here, but there's also the ability to deep dive into what's happening within the reporting system, and I'm gonna show you that in a second. Another important thing, especially in governance projects: I wanna be able to confirm that if I make any changes, nothing important is gonna be impacted, or at least to know exactly, beforehand, what is gonna be impacted. So Octopai, for example, allows me to do lineage from the ETL, which means: instead of starting from the report, show me everything that's gonna be impacted down the road from this ETL. And I can see now that I get a very different map, which, again, would probably have taken me a long time to create manually: four different tables that are fed information by this ETL process, two different views, one tabular model, and seven reports that are affected by this ETL process.

Now, coming back to what we were talking about, the fact that I might wanna see field-level visibility in the report itself: I can just press the button here and ask for the report view. Here, I get the documentation of what's happening within the reporting tool itself. Talking again about our business glossary story, this is critical. I can immediately get this for all of my reports across all of my reporting systems: I can see what's happening in the physical layer, I can see what's happening in the semantic layer, and I can see what the business user will see. For example, here, the full name field is actually concatenated within the semantic layer, while at the physical layer, in the database, it's actually two different columns. If I had to discover that manually and track it all the way back to my ETL, it would have caused me a lot of trouble. So collecting all this metadata and then analyzing it to provide these different depths of visibility, depending on the context, or the question I'm trying to answer, is very important.

Now I wanna show you a completely different way to look at the same metadata that we've extracted and analyzed, which is the discovery tool. The discovery tool is, in a way, your central repository, your immediate inventory, with a very powerful search engine. It allows me to see, across my entire estate, in one centralized location, what I have in my environment. So, for example, if you remember, we were talking about the product ID field. Of course, misspelling in front of 700 people, that's always the way it goes. So let's say it's just "product," and let's search for product.
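Conceptually, what a discovery search like this does is query one centralized inventory instead of hunting tool by tool. A minimal sketch, with made-up inventory data:

```python
# Hypothetical centralized inventory: object names grouped by source system.
inventory = {
    "informatica": ["load_orders", "load_product_dim"],
    "oracle":      ["src_orders", "product_dim", "customer"],
    "ssrs":        ["customer_product_report"],
    "power_bi":    ["finance_dashboard"],
}

def discover(term):
    """One search across every system instead of a tool-by-tool hunt."""
    return {system: [obj for obj in objects if term in obj.lower()]
            for system, objects in inventory.items()}

# Systems with hits tell you where to look; systems with none
# tell you where NOT to waste time -- often the bigger time saver.
for system, hits in discover("product").items():
    print(f"{system:12} {hits if hits else 'no matches - skip this system'}")
```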
And Octopai shows me immediately, with no need to go into each and every tool, no need to go into each and every structure within those tools, that across my environment, in my ETLs on the left, in my databases in red, and in my reporting tools, these are all the objects that have anything to do with the field or column I need to track. So, for example, if I wanna see all the tables and field sources I have around product, I can get that immediately. That is actually a springboard for every governance project that starts with mapping specific fields, whether it's for changing a calculation or for new PII regulations where I need to find a specific field across my environment. And obviously, like everything in our world, you can always export to Excel, get that information, and share it among yourselves, and this becomes your collaborative platform. I do wanna highlight, by the way, that what our customers share with us is that the biggest time saver is actually all the gray boxes, because the green boxes tell me where to search, but, and you probably know this way better than I do, when you start mapping, you don't know where not to search. So if you have a place that tells you where not to waste your time, that in itself will save you many, many hours of manual mapping and get you started way faster. I think with that, we'll go back to the slides.

Okay, so we left off here with the use cases. As I said in the part where I was speaking, that secret sauce is the thing that's gonna save you weeks of time, honestly. Not only will it save you the time; it'll make certain that all of those connections are being made for you. So that secret sauce, the analysis of the metadata that brings it together in the way Gal just showed in the demo, being able to link and navigate from one piece to the next, is extremely important. We talk about automation from the point of collecting the metadata, but certainly analyzing the metadata and putting it into a form that's beneficial to the business community is really one of the core competencies of this product, and it really does make it easy for people to navigate through the metadata in all the ways Gal just mentioned.

So, the last thing I wanted to talk about before we turn this over to Shannon for some Q&A, and it looks like there's been a bunch of Q&A, is the different people in the organization who are gonna benefit from having improved metadata, and really from having improved metadata automation and delivery. Oftentimes I break it down into different roles, and I talk about the data definers as one of the types of data stewards in our organization: we have data definers, data producers, and data users. The data definers are the people responsible for the integration initiatives and the projects taking place within your organization. They play a key role in collecting some of the semantic metadata that's used to build business understanding of the data: not just where it came from, but the definitions and the business rules and the protection rules, those types of things. So the data definers are certainly one of the groups of people who will benefit from improved metadata and improved automation and delivery of the metadata.
The data stewards themselves: I'm known to say that potentially everybody in the organization who either defines, produces, and/or uses data is a steward if they're being held formally accountable for that relationship to the data. And as you can see from the things Gal showed you, the data stewards, the people producing the data and the metadata, and the people using the data and the metadata, are certainly an audience that will benefit from improved metadata within the organization. Then we talk about the users, the end user community. How are we expecting them to get the most value out of the data if they don't have this information? If they can't see where it came from, how things changed, what the definitions are, all of those things that traditional metadata tools handle? So the daily users are one of the prime candidates to benefit from improved metadata within the organization. And then the data managers, the people who own the data, or, as I often refer to them, domain stewards, subject matter stewards, or subject matter experts: they're certainly gonna benefit from having improved metadata at their disposal as well. And certainly the automation and delivery of the metadata to those folks is gonna become a primary concern for whoever is administering data governance and metadata within the organization. So, we've touched on a whole bunch of different subjects. What we'd like to do now is toss it back to Shannon to see if there are any questions for today, and we'll be glad to answer those questions for you.

Bob and Gal, thank you so much for this great presentation; it's been just fantastic. We've got a lot of great questions coming in, as you mentioned. And just to answer the most commonly asked question: a reminder that I will be sending a follow-up email by end of day Thursday with links to the slides, links to the recording, and links to anything else requested throughout the webinar. Now, just starting off here: how do you estimate the cost of metadata management? What's the ROI on that?

Okay, I'll go ahead. Yeah, cool. So I think there are actually two factors you need to take into consideration. The first is how much the manual project will cost you: how complicated is your environment, how many metadata sources do you need access and visibility to, and, on top of that, how much will the manual maintenance of that metadata collection and analysis cost you? I can share, on a very high level, from talking with a lot of our technology partners: they say that in most of the projects they work on, the cost of the licenses is usually a third of the cost of the professional services needed for the implementation, just for the part we're talking about, getting that visibility into what the environment has and documenting it initially. So look at that cost on one hand, and then, depending on the tool you're considering, look at the cost of that tool. I can tell you, for example, that the pricing model for Octopai is around $500 per metadata source per month. So depending on how many systems you have, you can look at that, and if at the end of the year that project is cheaper than the one we talked about that includes the manual work, then I think it's an easy decision. But it depends, again, on these two factors that you need to compare. I hope that answers the question.
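As a back-of-the-envelope illustration of that comparison: only the $500 per metadata source per month figure comes from the answer above; the effort and rate numbers below are hypothetical placeholders you would replace with your own estimates.

```python
# Back-of-the-envelope cost comparison. Only the $500/source/month figure
# is from the talk; the manual-effort numbers are made-up placeholders.
sources      = 8      # metadata sources (ETL tools, databases, reporting tools)
hours_manual = 1200   # hypothetical hours to map and maintain by hand
hourly_rate  = 75     # hypothetical loaded cost per hour (USD)

manual_cost    = hours_manual * hourly_rate
tool_cost_year = 500 * sources * 12

print(f"manual mapping : ${manual_cost:,}")    # $90,000 with these inputs
print(f"tool, 1 year   : ${tool_cost_year:,}") # $48,000 with these inputs
print("automation is cheaper" if tool_cost_year < manual_cost
      else "reconsider the numbers")
```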
And yeah, I just wanted to add one thing to that. If you look at how many different tools, how many different sources of metadata there are in the environment, that's certainly going to help you estimate the cost of getting started and getting moving. You may not end up automating the metadata collection from all of your tools at once; you might do it in a very structured way, and you can structure the costs accordingly, depending on which tools you're looking to automate the metadata collection from. So I'm with you, Gal; I thought that answer was spot on. Thanks.

Gal, this one is really specifically for you: how does Octopai discover data definitions, and are there any limitations to doing this across cloud environments? So, great question, thank you for that. The first part of the answer is that because we are able to extract metadata from all of your reporting tools, or from many reporting tools, including those in the cloud, we know how to pull the business terms from all the reports, centralize them, and apply logic to them. By the way, we also know how to automatically connect them to the physical layer that the report, or the business service, is based on. And again, it's the same process we talked about: extract the metadata from the relevant sources, which here are the reporting tools, analyze it, and then make it accessible. That's what we do, and we can definitely do it on systems that are in the cloud, like Power BI and Qlik and Tableau and so on and so forth.

What are the recommended best practices for metadata management for unstructured data? Thanks; Bob, that's your turn. Well, certainly. Oftentimes the tools being used to collect your unstructured data, content management systems or records management systems, even things like SharePoint, have metadata in them as well, and we should be bringing that metadata in, though the metadata may be different. From a people perspective, for example, you may be talking about authors and approvers and publishers of that information. And if you read any reports that talk about the growth of data, unstructured data is growing and growing and growing, and data governance, information governance, and records management, those disciplines all seem to be coming together. So yes, it's very important to make certain that you're collecting the metadata about your unstructured data and incorporating it, giving people the ability to navigate from it into the structured data you have in your organization. That's a great question, and unstructured data is becoming more and more important to the business community every day, every week, every month.

Can you do data classification for CCPA requirements? Well, go ahead. Yeah, so here's one of the things we allow you to do: we won't be able to help you classify everything around CCPA, but what we will be able to do is exactly that jump start we were talking about, because we can collect metadata from across your entire environment, give you visibility, and then allow you, through tools like the discovery tool, to search for anything related to CCPA.
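As a toy illustration of that kind of jump start, here is a name-based scan over a centralized column inventory. The patterns and columns are hypothetical, and real classification for CCPA needs far more than column names; this only narrows where to look first.

```python
import re

# Hypothetical column inventory; in practice this would come from the
# centralized metadata extracted earlier.
columns = ["customer_id", "full_name", "email_addr", "order_total",
           "phone_number", "ssn", "product_code"]

# Name-based heuristics only -- a starting point for CCPA scoping,
# not a substitute for actual data classification or stewardship.
PII_PATTERNS = re.compile(r"(name|email|phone|ssn|address|birth)", re.I)

candidates = [c for c in columns if PII_PATTERNS.search(c)]
print("review for CCPA:", candidates)  # where to focus classification first
```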
So if you needed, for CCPA, to search for a personal name across your environment, how would you do it without an automated metadata management platform? You'd have to go manually into all of your systems and search for it. But here, in a centralized location, like we saw in the discovery tool, you get the ability to ask the question and get the answer, centralized, immediately; or, if not for CCPA, maybe you need to search for credit card numbers for other standards. It's the same thing: search for the field name, find all the places, and then decide what you wanna do with it. You're starting at the point where you can focus on the core governance project, which is applying the logic, applying the stewardship, applying the standard, instead of wasting a lot of resources on manually mapping and finding these fields before even getting started on your project.

And I would just add one thing to that. I'm working with an organization right now that is doing this very manually. So maybe we need to talk about that. But very manually, they're doing their data management inventory and then assigning the different classifications. So for the CCPA, the classifications that are required could then be added to the inventory, but this, again, speeds up the process of getting that inventory recognized within the tool. So I find that it can really help; the search functions Gal talked about are really helpful when you're looking for things like customer information, addresses and phone numbers and social security numbers and those types of things. Just think about the time it would take you to navigate through your metadata resources to find those things, and think about the time a tool like this will save you in locating them so that you can classify them appropriately.

And we've got another product question here for you, Gal. If we feed our ETL scripts into Octopai, does it create the lineage map? Sorry, can you repeat that? Yeah: if we feed our ETL scripts into Octopai, does it create the lineage map? Yes, absolutely. By the way, you can visit our website, go to the product section, and see all the systems we support. We know how to parse them, whether it's the more structured ETL like Informatica or DataStage or SSIS, or scripts like stored procedures. For all of them, we create the same lineage I've shown you: both the high-level lineage that connects them to the reporting tools and the databases, and the in-depth lineage at the field level for each and every one of those processes.

And while we're on the topic of lineage, specifically field-level lineage: can we see transformation rules in your extract, SQL data type conversions, join keys, et cetera? Yes, great question, thank you for that. I didn't show you this in the demo, but one of the things you will be able to see, if you remember when we were looking at the lineage tracking what's happening at the field level: in Octopai, you'll be able to see the expressions involved by hovering over them. And if not the expressions, then what you've already seen: the representation of the transformation in a visual manner.
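To illustrate what parsing an ETL script for lineage and expressions involves at the simplest level, here is a toy sketch over a made-up stored-procedure fragment. Real parsers, including those in commercial lineage tools, handle vastly more SQL than this regex does; the sketch only shows the idea of recovering source-to-target edges and transformation expressions from script text.

```python
import re

# A toy stored-procedure fragment; real scripts are far messier.
script = """
INSERT INTO dwh.dim_customer (customer_id, full_name)
SELECT c.customer_id, c.first_name || ' ' || c.last_name AS full_name
FROM staging.customer c JOIN staging.region r ON c.region_id = r.id;
"""

targets = re.findall(r"INSERT\s+INTO\s+([\w.]+)", script, re.I)
sources = re.findall(r"(?:FROM|JOIN)\s+([\w.]+)", script, re.I)

# Emit source -> target edges plus the transformation expression,
# which is the raw material for field-level lineage views.
for src in sources:
    for tgt in targets:
        print(f"{src} -> {tgt}")
print("expression for full_name: first_name || ' ' || last_name")
```

Note how the concatenated full_name here mirrors the demo's example of a field that is one column in the semantic layer but two columns in the physical layer.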
So you can definitely see that, and you can also use the discovery tool in this context as a complementary capability: look at specific fields, and then, on the centralized side, see all the SQL scripts in the objects part of the discovery that are associated with that object, see what's happening, and see them visualized.

No, this is great. And if the tool misses auto lineage, or if the auto lineage misses something, can we manually add in lineage? So, that's something coming right around the corner; it's not available yet, but it's something we are very focused on, and we invest a lot of resources to make sure those misses don't happen, because that's the main value proposition around the analysis. But manual additions are something we are certainly discussing. We also allow you to export, using APIs, those types of capabilities into other tools, and in some of them you can edit, especially if they're data catalog tools.

And does Octopai provide a business glossary module to tie into this? Another great question. Yes, this is something we have just recently launched. One of the things you will be able to see within Octopai is another module, right next to the module you've seen, that contains the business glossary: all the business terms we've detected across all the reports and all the reporting tools. In there, by the way, you will be able to be more active: you'll be able to insert business definitions, and you'll be able to insert connections. And if, in some of the tools, business definitions pre-exist, like in BO, which is a practice in some organizations, we automatically draw them in and allow you to see them within Octopai.

I love it. And I think we have time to slip in one more question here. Does this show you what calculations or logic are being applied to a field at every stage? Oh, sorry, I didn't hear that; can you please repeat? Sure: does this show you what calculations or logic are being applied to a field at every stage? Yes. Combined with the discovery, you can see that: if you remember, the discovery had a section for the ETLs, a section for the databases, and a section for the reporting tools, so combined with the discovery, you can see all the different calculations across all of them. In the lineage part, you'll be able to deep dive separately within the ETLs, the reporting tools, and the databases, each one in its own window; but in the discovery, you will be able to see it all in a centralized location.

All right, well, Gal and Bob, thank you so much, but I'm afraid that does bring us to the top of the hour. I want to thank all of our attendees for being so engaged in everything we do; we love all the conversation going on throughout. Just a reminder, I will send a follow-up email with links to the slides and links to the recording of this session by end of day Thursday. And if you want to continue the conversation, you can go to community.dataversity.net. Hope you all have a great day. Thanks to Octopai for sponsoring. Gal and Bob, again, thank you so much. Thanks, everybody. Thanks, Octopai. Thanks. Guys, it's been a pleasure; feel free to reach out to me directly as well. Speak soon. Thank you.