Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of DataVersity. We would like to thank you for joining this DataVersity webinar, "You Can't Have Best-in-Class Governance Without Best-in-Class Data Lineage," sponsored today by Octopi. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DataVersity. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just note that the Zoom chat defaults to just the panelists, but you may absolutely change that to network with everyone. To access the Q&A or the chat panels, you can find the icons for those features in the bottom middle of your screen. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Now let me introduce our speakers for today, David Bitten and Anil Remeshwar. David has over 20 years of experience working with technology companies and a solid history of global leadership success in business-to-business enterprise sales, specifically software as a service. During the last four years, he has led sales and business development efforts at Octopi, where he enjoys helping BI and analytics professionals harness the power of automated data lineage and discovery to achieve full control of their data. 
Anil is an accomplished big data analyst, data developer, database developer, and software engineer with 15 years of experience processing multi-petabyte data sets, skilled in finding the subtle nuances of data that make the difference between day-to-day metrics and valuable business insights. And with that, I will give the floor to David and Anil to get today's webinar started. Hello and welcome. Thank you very much, Shannon. I'm super excited today to be able to host this together with our good friend Anil of Zigo. What I'd like to do is jump straight into the presentation, where I'll have Anil share some of Zigo's challenges and how they used Octopi to address them. So Anil, would you like to introduce yourself? I think Shannon already did so, but maybe you'd like to talk about the company that you work for, the existing data environment, and so on. Sure. Thanks for having me. My name is Anil Remeshwar. I am the data architect at Zigo. I started at Zigo in February of 2020, and at that point what I came into was a software stack with a BI solution built on Microsoft SQL Server. There was no lineage whatsoever, and there was a decided lack of trust in the data. So maybe we can go to the next slide here, David. Sure. Excellent. So yeah, there were multiple reports being consumed by multiple business units, and this had evolved over time without any governance or oversight. There was no data governance team; it had been developed by third-party consultants. And what happened is, as the data got into more and more hands, the trust in the data began to erode. So by the time I arrived, nobody really trusted what they found in the data warehouse. The metrics were in conflict with queries against source systems. 
So I was tasked with rebuilding Zigo's entire data platform from the ground up, and part of that endeavor included providing an end-to-end data lineage solution so that we could rebuild trust in the data and make effective business decisions. Go to the next slide, please. Sure. So with these challenges in mind, you embarked on your team's initiatives, correct? Correct. Yeah. Okay. Can you share a little bit about that with us? Sure. Well, the initiatives were primarily to regain trust in the data so we could make effective business decisions. Or are you talking about the actual mechanics behind it? Sorry, David. That's okay. Whatever you'd like to share with us; basically the slide here, the data engineering initiative. Sure. So we had four core applications that were disparate. The legacy data warehouse only included one of those four applications. So folks were frequently trying to pull data from application B and combining it with data from application A, and they were getting the wrong results. So we determined that our best course of action was to build a data lake in Snowflake. From the data lake, we developed a conformed layer, and then on top of that conformed layer, we developed the reporting layer. This is where we decided we needed a data lineage solution. Some of the challenges that we faced with the legacy data warehouse were fundamentally related to its source code, which was buried deep within stored procedures and SSIS packages. So we found multiple problems: for example, stale dimension data where a lookup table hadn't been updated, changes in source system behavior where the LDW code remained static, and transformations that changed over time without history tracking. In other words, the metrics would change over time. And then one of the scariest pieces that I had to tackle was the derived columns with ambiguous names and unknown definitions. 
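The conformed-layer design Anil outlines (a data lake, then a conformed layer, then a reporting layer) can be sketched roughly in code. The application and field names below are invented for illustration, not Zigo's actual schema; the point is that each source application's records are mapped onto one agreed shape, with a lineage breadcrumb, before anything reaches reporting.

```python
from typing import Dict, List

def conform_customer(record: Dict, source_app: str) -> Dict:
    """Map a raw record from either source application onto one
    agreed shape, tagging it with its origin for lineage."""
    # Each application names the same business field differently.
    field_map = {
        "app_a": {"cust_id": "customer_id", "cust_nm": "customer_name"},
        "app_b": {"id": "customer_id", "full_name": "customer_name"},
    }
    mapping = field_map[source_app]
    conformed = {target: record[source] for source, target in mapping.items()}
    conformed["_source_system"] = source_app  # lineage breadcrumb
    return conformed

def build_conformed_layer(raw: Dict[str, List[Dict]]) -> List[Dict]:
    """Union all source applications into the single conformed table."""
    return [conform_customer(r, app) for app, rows in raw.items() for r in rows]
```

With a layer like this in place, report builders combine conformed rows rather than joining raw application A and application B data directly, which is exactly the failure mode described above.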
So we decided to rebuild all of this in Snowflake and tackle the majority of those problems with the conformed layer. The data pipeline, oh, yeah, there we go. It's the next slide. I think I was one slide ahead of you. I apologize. Oh, no worries. Sure. So this slide, yeah. I think I've kind of gone over all of these, but the biggest thing was we didn't have a single source of truth. The other challenge we faced was that the legacy data warehouse only served the financial department. But as it began to be used by the customer success department and our tech ops teams, they were getting the wrong data. So that's when we determined we needed Octopi and a data lineage solution, so that we could accurately track data from the source systems all the way to end consumers. Sure. Okay. And so this effort could have been reduced to a few hours with a data lineage solution in place. Is that what you're saying? Well, one of the big challenges was that a report consumer would get a report and compare the results to what they saw in the source system. And then it would take hours or weeks to actually track down what had changed, what the variance was between what they were seeing in the legacy data warehouse report and what they were seeing in the source system. Okay. All right. So up until now, there was basically no management tracking or visibility into where sensitive data exists beyond the source systems and how it was consumed. So Zigo is now, as I understand, applying appropriate data masking and governance policies in order to ensure that the data is protected from the source systems through all the different endpoints. Can you maybe elaborate on that? Yes. Thanks for reminding me of that. So one of the other challenges I found with the legacy data warehouse was that sensitive data, thankfully not PCI, but PII and other sensitive data, was exposed, originally for consumption by the finance department. 
However, as the usage grew without governance, that data ended up in the wrong hands. And what we also found is that data was being shared directly with our clients without any governance or oversight. So we needed to put a stop to that as soon as we got a data governance policy in place. We use Snowflake's tagging mechanism to tag the data, and we're using Octopi now to see where the sensitive data lands in the conformed layer. And then we're applying masking policies, using Octopi's data lineage to ensure that departments don't see data that they're not privy to. Great. Thank you for sharing that with us, Anil. So basically, to summarize what we've covered today: data lineage is actually a crucial part of data governance, since it provides the record of data movement. And the top three features that you need to look for in a good data lineage solution, so that it provides the best support to your governance program, are these. First, coverage: as many systems covered as possible, such as ETLs, data warehouses, and analysis and reporting tools. Second, a visual map that is quick to follow and understand. And third, an automated data catalog that integrates lineage, allowing access to assets with the capability to track their lineage at any given time. So what I'm going to do now is jump into the demo portion of tonight's webinar. Give me one moment. You should be able to see the Octopi demo environment in front of us today. So thanks once again, Anil. I'm going to now jump into the demo and show the attendees here today how Octopi can actually provide best-in-class lineage for their data governance initiative. What we have here on the screen is the Octopi demo environment. 
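The tag-based masking Anil described a moment ago can be sketched as a small Python model. This is purely an illustration of the idea, not Snowflake's or Octopi's actual implementation: columns carry tags, tags carry a policy naming the roles allowed to see the value in the clear, and everyone else gets a masked value. All table, column, and role names here are invented.

```python
from typing import Dict, Set

# Hypothetical column tags, in the spirit of Snowflake's object tagging.
COLUMN_TAGS: Dict[str, Set[str]] = {
    "customer.email": {"pii"},
    "customer.name": {"pii"},
    "customer.signup_date": set(),
}

# Roles allowed to see each tag's values in the clear.
TAG_POLICIES: Dict[str, Set[str]] = {"pii": {"finance"}}

def mask_row(table: str, row: Dict[str, str], role: str) -> Dict[str, str]:
    """Return the row with tagged columns masked for unauthorized roles."""
    out = {}
    for col, value in row.items():
        tags = COLUMN_TAGS.get(f"{table}.{col}", set())
        blocked = any(role not in TAG_POLICIES.get(t, set()) for t in tags)
        out[col] = "***MASKED***" if blocked else value
    return out
```

The appeal of pairing this with lineage is that the tags can be followed downstream: once lineage shows where a pii-tagged source column lands in the conformed layer, the policy can be applied there too.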
I'm just going to go through what we have on the screen, and then we'll jump into a demo using a few different use cases. On the left-hand side are the different modules within Octopi, offering what we understand to be best-in-class data lineage, because we actually have three layers of data lineage: cross-system lineage, inner-system lineage, and end-to-end column lineage. And together with that, we also have the discovery space, and I'll explain to you why that's important in a few moments as well. All right. So to further explain what we have on the screen: on the left-hand side, in our demo environment, we have 398 different ETLs from the various systems that we can see here. In the middle, we have exactly 3,247 DB objects, including tables and views, for example, from these various systems. And that's basically a sampling of the types of technologies from which Octopi is able to automatically extract the metadata, out of the box. And then here on the right-hand side are the BI tools, or the reporting tools, and the 23 different reports. So what I'd like to do is show you the power of Octopi with reference to a few different use cases. As I mentioned, we'll touch at a very high level on the various areas within Octopi, and from there, of course, you'll be able to see why it's important to understand, at a very granular level, the data lineage of your data environment to support a best-in-class data governance initiative. The way I'll do that is through a use case. The first one I'll go through is the most common one that our customers tell us they see in their organizations and data environments. And that is: you have an error in a report. I'm sure we're all very familiar with that. 
So in general, in most organizations today, if there's an error in a report, the way that's handled is probably very similar to this. Let's say Mr. or Mrs. CFO is looking at a report, let's say at the end of the quarter, and of course they're stressed because they need to report the quarterly earnings. Let's say there's something wrong with that report. Of course, they're going to open a support ticket, and the appropriate team is now going to need to look into it to try to understand what went wrong with that report. Most likely, they'll go through a process which is very similar in most organizations. That is, you'll start off by taking a look at the map of the systems, then at the tables and views that were involved in the creation of that report, then probably look into the glossary to see if the labels were given the same names and, if not, which glossary was used. After that, if the error is not in the database area, they'll probably look into the ETL. Of course, all of this will be done manually. Most likely, it will involve multiple people from different teams within the organization, each with different responsibilities within their own domain. So what I'm getting at is: a lot of people, a lot of time, and the result may not even be a hundred percent accurate. It's basically not efficient. In most organizations that would literally take anywhere from hours to days, weeks, even months in the most extreme case. Now, what I'd like to do is show you that same scenario, giving you an example of the lineage capabilities within Octopi, and show you how that would be done automatically, literally in a few seconds. So let's imagine for a moment that the issue we're having is in a report called Customer Products. 
I'm going to come into Octopi's lineage space and type in the name of the report that we're having trouble with. And right now, we're going to go into the first level of lineage that Octopi provides. Remember, I mentioned three layers of lineage in Octopi: cross-system, inner-system, and then end-to-end column-to-column. So right now, at the high level, what we need to understand is how that report was created. As I typed it in, you see that Octopi has filtered through all the metadata and shown me the report we're having trouble with. If I click on cross-system lineage, in about a second I now understand how that report was created. In most organizations, just to get to this very high level of understanding may take hours or days. As you can see, at the click of a mouse we now have that understanding here on the screen. On the right-hand side is the report that we're having trouble with. As we move to the left, we can start to see how that report was created. And we see that there are two views that were involved in the creation of that report. If I click on any object on the screen, as you see here, a radial dial comes up, which offers us more capabilities and more information. So let's say, for example, I needed a visualization of this view; maybe there were many different transformations and I wanted a visualization to help me understand it. Clicking on that will show us the source, transformation, and target. As I move to my left, I continue to trace how the data landed in that report. I can see that there were also three different tables involved in the creation of that report. So similarly, if I click on a table, I get that same radial dial. 
But in this case, let's say, for example, you've been tasked with making a change to a specific table: a calculation, a transformation, whatever that might be. What we can see here is that if I click on that table, I get that familiar radial dial. What we see on the bottom right is a six with an arrow to the right. That means there are actually six target objects, objects that are dependent on that table. So if I were to make changes to that table, I can be fairly sure that some, if not all, of these different objects that have now popped up will be affected, including an additional stored procedure, a tabular table, measure groups, and these three different reports. You can imagine how long that would have taken, of course, if you had to do it manually, or in any situation where you're not using Octopi. In any case, as we move to the left, we come to the ETLs that were involved in that report: not one ETL, but multiple different ETLs were involved in the creation of that report. And the reason I'm pointing that out is that many organizations are using many different systems, because systems come along and are integrated: maybe there's a merger and acquisition, maybe there's a legacy system that you just haven't put to rest, and you even introduce new technologies. So you're probably using many different systems to manage and move your data. And that's not a challenge for Octopi. As you can see here, we can still show you the path that the data has taken in order to land on that report. A couple more things I wanted to point out before I continue on. You may have noticed, and I know it's probably small on your screen, that there is a shadow to the right of this object over here, this table; there is a shadow to the left of this ETL; and there's a shadow all the way around this ETL over here. 
What that's telling you is that there are dependent objects, or that this object is sourcing from other objects, and you can continue to unravel the lineage by clicking, in this case, for example, the eight to the left, which will show us the other objects that this ETL is sourcing from: these eight different tables. So to continue with our scenario, we asked our customer, the one who was having trouble with this report, if they had any idea what went wrong with it, and they admitted that a few weeks earlier, before they started using Octopi, they had made changes to this one ETL over here, and that when they make changes, they usually encounter production issues, which is a common scenario in most organizations. Now, we asked them: if they were going to make those changes and they knew they were going to encounter production issues, why not be proactive? Why not look into what will be affected, make the appropriate corrections, and save everybody the hassle of the production issues, save the data quality issues that result from all those production issues, and increase the confidence in the data, or the trust in the data, as Anil was speaking about before, because if the data is not seen as solid, then of course the trust goes down. Now of course, as we all know, that's a lot easier said than done, because in most organizations, doing that means looking into many, many different objects: ETLs, tables and views, reports, and so on. It could be literally thousands or even hundreds of thousands, so trying to be proactive really is almost impossible. So most organizations work in a reactive way, of course trying to avoid production issues whenever changes are made. 
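The proactive analysis being described here is, at heart, a reachability query over a lineage graph: walk the edges downstream to find everything a change could break (impact analysis), or upstream to find everything a broken report depends on (root cause analysis). A minimal sketch, with an invented graph standing in for the demo environment:

```python
from collections import deque
from typing import Dict, List, Set

# Edges point downstream: source object -> objects built from it.
LINEAGE: Dict[str, List[str]] = {
    "etl_orders": ["tbl_orders"],
    "tbl_orders": ["view_customer_products", "sp_refresh_measures"],
    "view_customer_products": ["rpt_customer_products"],
    "sp_refresh_measures": ["rpt_quarterly_earnings"],
}

def downstream_impact(obj: str) -> Set[str]:
    """Every object that could break if `obj` changes (impact analysis)."""
    seen, queue = set(), deque(LINEAGE.get(obj, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(LINEAGE.get(node, []))
    return seen

def upstream_sources(obj: str) -> Set[str]:
    """Every object `obj` depends on (root cause analysis), by walking
    the same edges in reverse."""
    reverse: Dict[str, List[str]] = {}
    for src, targets in LINEAGE.items():
        for t in targets:
            reverse.setdefault(t, []).append(src)
    seen, queue = set(), deque(reverse.get(obj, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(reverse.get(node, []))
    return seen
```

Doing this by hand across thousands of stored procedures and SSIS packages is what takes the hours or weeks described above; with the metadata already extracted into a graph, both traversals are near-instant.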
And then if there are production issues, they will address them as they become apparent. And therein lie a lot of the data quality issues, because you're only fixing what you know of, and if you're only fixing what you know of, I'm sure you can imagine that there will be things that fall through the cracks. So now, with Octopi, we've empowered you to become proactive, and you can actually ensure that there are few or no production issues by understanding exactly what will break if you make a change. So like this customer, if you wanted to make a change to the ETL, you're now empowered to understand exactly what will be affected. Now, before I jump into that: what we've shown you so far, at the system level, which is the highest level of lineage, was a root cause analysis for a specific report. Now we're going to go the other way. We're going to do an impact analysis. Let's say, for example, before we make a change to this ETL, we jump into the cross-system lineage of that ETL, so we understand exactly the lineage of that ETL and are prepared to make the appropriate corrections before we make those changes. So what we see here is that when we started this scenario, we were looking into this one report over here. That was the error we were having trouble with. But now that we have complete clarity and understand the lineage of this ETL, we can see that when changes were made to this ETL, that was not going to be the only error. Most likely, some, if not all, of these different objects on the screen could have or would have been affected by any one change to this ETL. 
So of course, these stored procedures, dimensions, tabular tables, measure groups, views, tables, stored procedures, and reports could have been affected. What will most likely happen, as time progresses in most organizations, is that these reports will start to get opened by different people at different times throughout the year. And then we hope that the users who open these reports will notice the errors in them and open a support ticket. I say hope because, of course, as you understand, if they don't notice the errors, it's just worse. So let's say, hopefully, they notice the errors and open the support tickets. Those responsible for looking into those errors are now tasked with trying to figure out what the root cause is. And as you can imagine, throughout the year you're probably not dealing with only the two or three or, what do we have here, seven different reports that have errors; it's probably hundreds, if not thousands. We established earlier that it will most likely take anywhere from hours to days or even weeks to get to the root cause of an error. So you can multiply and extrapolate and understand how much time is being wasted by those who are looking into that. Because, of course, if they were using Octopi, they could know from the get-go that this ETL is the root cause, and put all that time and effort to better use, such as migration projects, data governance initiatives, data quality initiatives, and so on. So to continue on: again, what we've shown you is a root cause analysis, and then we showed you an impact analysis at the system level. What I'd like to do now is show you the next level of lineage, which is inner-system lineage. So let's say we now need to actually make a change to this ETL, and we want to know what the impact of those changes might be at the column level. 
Simply clicking now on the ETL, we jump into the inner-system lineage. If you're using SSIS, you'll be familiar with this. I'm really taking a 90,000-foot view and dropping down: we're going from the top all the way down into what we see here, the container, and within the container, we can now see the data flows themselves. So if I needed an understanding at the column level within the system itself, with this inner-system lineage, I simply click on map view. And now if I click on any field, I can actually see the journey that that field has taken from the source all the way to the target. In addition, on the screen the sources are in green, the transformations in orange, and the targets in red. Now, let's say you have a transformation. You'll have a little icon on the top left over here that tells you that there is actually a transformation in there. If I double-click it, I can see the expression for that transformation. Additionally, if you have a calculation, you'll see an FX somewhere in the lineage. You can double-click that FX and you'll get that calculation. Now, of course, at this level we can go forwards and backwards within the system, so we have a complete inner-system lineage. And here I'm going to just give you an idea: I'm going to go backwards, taking a look at the ETL that's loading into that table itself. And what we see here is also the data flow at the column level. So now that I've shown you the inner-system lineage, I want to continue on and show you the actual column-to-column lineage. Finding that out is very simple. 
Let's say the issue that you're having is with the unit price column. Right-clicking on it, or actually clicking on the three dots, and clicking on end-to-end column lineage will now show me the lineage of that column from the moment it enters the landscape all the way to the reporting system. We can see it at the column level, the schema level, the table level, and also at the database level, giving you the granularity you might need to help you with your day-to-day activities. Going further, let's say you needed to get an understanding of this column: you want to complete the picture and get a business description of it. For example, it's tax amount that we clicked on; now we get a business description of it. And in this case, well, what's a demo environment without a little issue? So let's take a look here. This is the one I was actually looking for. Here we go; we're supposed to jump into unit price, that's the one I clicked on. And what we see here is, first of all, there's a check mark on it, which tells us that this one is approved. So if you come into this description, you can now read a business description, confident that it was approved by the data owner. The automated data catalog is actually built for you automatically; that's the A in ADC. The way we do that is by extracting the metadata and analyzing it in order to create the catalog for you. The descriptions can also be populated, but of course, the caveat is that those descriptions have to exist somewhere within your environment. I won't go into all of the details of our data catalog; of course, we can schedule another call for that. 
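The end-to-end column lineage in the unit price walk-through can be thought of as a path trace over column-to-column edges. A toy sketch, with invented layer and column names, in which each derived column records the column it came from:

```python
from typing import Dict, List

# Column-level edges: each target column records its immediate source.
COLUMN_PARENT: Dict[str, str] = {
    "reporting.sales.unit_price": "conformed.order_lines.unit_price",
    "conformed.order_lines.unit_price": "lake.app_a_orders.unit_prc",
}

def column_lineage(column: str) -> List[str]:
    """Trace a column back from the reporting layer to where it
    first entered the landscape."""
    path = [column]
    while column in COLUMN_PARENT:
        column = COLUMN_PARENT[column]
        path.append(column)
    return path
```

The schema, table, and database views mentioned above are then just different groupings of the same path, for example by taking the prefix of each qualified name.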
But before I do, I just wanted to show you one more thing, which is data discovery. We were in the end-to-end column-to-column lineage; I'm going to go back to that. Okay. So we're here. We understand the column lineage. Now, let's say we want to understand the column itself: everywhere that column is referenced, and what would be impacted if I needed to make a change. And that's also completely integrated. Clicking on it and searching in discovery will take us to the final module that I wanted to show you. Octopi now goes through all of the different systems that are connected to it, so you can see here ADF, Informatica, SSIS, the various ETLs, databases, data warehouses, analysis and reporting tools, and so on, and it shows you everywhere it has found unit price within your environment. So if you need to make a change, you're going to need to take this into consideration. It's also going to show you where it's not found. You see here in green where it is found, and how many times, and in gray where it's telling you it's not. That's just as important, because it saves you just as much time by letting you skip the systems you don't need to look into. I'm going to go a bit further to give you an idea of the granularity of information you can get from the data discovery module. Let's say, for example, we see here SQL Server, and it has found unit price in objects 46 times. If I click on any one of those green objects on the screen, it gives us more information; in this case, we're looking at the objects themselves. If I jump into any one of these, I can actually jump into the definition. 
When I click on the definition, what pops up on the right-hand side is actually the SQL that was used to create that definition. In this case, let's take a look at it, fingers crossed that it works, and it's showing us a map, a visualization of that. So finally, you can see that with Octopi's automation, we can help you reduce the amount of time you've been investing in trying to trace back the lineage, for example. In this case, we can see one specific column: if you needed to make a change, and I'm sure that happens very often, you now understand, literally in seconds, the impact those changes would have and how much effort the project would take. Shannon, that was everything that I had to share. Maybe you want to open the panel up to questions. Absolutely. And just to answer the most commonly asked question: as a reminder, I will send a follow-up email for this webinar by end of day Monday, with links to the slides and the recording of the session. So diving in here, there have been a lot of questions in both the chat and the Q&A. I'll try to get to the Q&A in a second, but I just wanted to jump into this first question that came in for you, Anil. I answered it in the chat, but if you want to expand on it: did you decide to rebuild the platform using Snowflake, etc., before you devised a data management strategy, or after, or during? It was in parallel. We knew that part of the entire data architecture solution, and rebuilding trust in the data, would include a data management solution. In addition, when I started with Snowflake, we did not have a data governance team or a data governance office. Those were installed approximately three months after we got the Snowflake development effort started. Awesome. So David, what kind of metadata is collected from reports? Also, do you take SQL code, views, functions, etc. 
as metadata scanning tasks? Sure. So what we're extracting varies; every system is different. But for example, we're looking at tables and views, and of course we're looking at SQL, stored procedures, etc. As long as it's a technology that is supported by Octopi, we can actually, out of the box, go in there, extract the metadata, and provide you with that lineage. There's nothing extra that you need to do in order to work with Octopi. As long as it's one of the supported technologies, we can connect to it out of the box, extract that metadata, and provide you with the lineage you saw here. Perfect. So Anil, how did you integrate a metadata management practice and solution with Octopi's data lineage, and does Octopi leverage an organization's metadata repository? So what we've done is, for each of the underlying database source applications, we first imported their metadata, so just their information schema: information_schema.tables, information_schema.columns. That's imported into the Snowflake data lake. Then, when it gets to the conformed layer, we're actually adding in, and it's part of a requirement in our deployment, a business definition that's included in a JSON construct, and that's what we export to Octopi, into their automated data catalog. Hopefully that answers the question. Anything you want to add, David? No, that was perfectly answered. Thank you. I love it. And there are lots of questions here, David, about what Octopi connects to or doesn't connect to. Do you have a list of products that you connect to? Yeah, sure. I showed that earlier, but you can simply go to octopi.com/supported-technologies and you'll see all of the technology that we support out of the box. That's what we have currently, and what's coming soon will be available in the next quarter or two. 
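Anil's information_schema-to-catalog flow from a moment earlier can be sketched as follows. In a real deployment the rows would come from querying information_schema.columns through a database connector; here invented rows stand in for that query result, and the required business definitions are joined in before the JSON export.

```python
import json
from typing import Dict, List

# Rows as they might come back from information_schema.columns.
RAW_COLUMNS: List[Dict[str, str]] = [
    {"table_name": "ORDERS", "column_name": "UNIT_PRICE", "data_type": "NUMBER"},
    {"table_name": "ORDERS", "column_name": "TAX_AMOUNT", "data_type": "NUMBER"},
]

def to_catalog_export(rows: List[Dict[str, str]],
                      definitions: Dict[str, str]) -> str:
    """Join harvested schema metadata with required business definitions
    and emit the JSON construct handed to the data catalog."""
    entries = []
    for row in rows:
        key = f"{row['table_name']}.{row['column_name']}"
        entries.append({
            "column": key,
            "data_type": row["data_type"],
            # A business definition is mandatory in this deployment,
            # so a missing one fails loudly here.
            "definition": definitions[key],
        })
    return json.dumps(entries, indent=2)
```

Making the definition a hard requirement at export time is one way to enforce the "you have to have a business definition" rule Anil mentions: columns without one never reach the catalog silently.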
Additionally to what you see here, we are developing open APIs, which you'll be able to use, in addition to the out-of-the-box technologies, to connect to just about any other technology. So in essence, you'll be able to have full lineage, whether a system is supported out of the box or not. In addition to that, we also have augmented links, and that is available today. That is also for technologies that we don't support. It is a somewhat manual process for the unsupported system, but you do it once and then it will be represented within the lineage. Okay, hopefully that answers that. Perfect. And I love it. Yeah. And I just put the link in the chat for everyone in case you need that. So how does your data lineage differ from what Snowflake just announced with their integration? And how do you... Sorry. Was that the end of the question? It was. Okay, certainly. So first of all, Octopi, as far as we understand, has the broadest breadth and depth of technologies that we support. As you saw on the screen here, I would imagine that the list of ETL, data warehouse, and reporting tools is going to be a little different from what Snowflake is supporting. Second, the depth of the lineage that you can see within Octopi: as you saw here today, it's not just one or two layers of lineage; we provide you with all three layers of lineage. And third, which I really didn't cover yet, setting up Octopi literally takes hours, not days, weeks, or months, as maybe some of the competitors might say; literally hours. So does that target expansion show where exactly the error occurred? I didn't understand the question. Yeah, there was a... Does that target expansion, sorry, six, show exactly where the error occurred? Oh, no. So we provide you with the information.
Of course, we don't tell you where the error is, but we give you the information so that you can then go ahead and correct it. Going forward, actually, we are working on AI technology that will actually show you, and even bring you to, the actual area or the actual space, for example, with a Snowflake-specific column. We are working on that, and going forward, that will be available. And what relation did you use to connect the conceptual entity person to the logical entity person? Entity belongs to entity? The goal is to connect the conceptual model to the logical model. I'll have to defer that question to our technical people and get back to you via email. All right. So, sorry, my questions just moved. Let me get back to my questions here. So it would seem that the business terms are harvested from the available column descriptions. If the column descriptions do not exist, can one manually enter a business term definition and lock it so it cannot be changed when the process is run again? Yes, absolutely. In addition to that, if it's not in the reporting system, if you have that kept somewhere, for example, in a spreadsheet, we can also upload that into Octopi. Yeah, I'd actually like to augment that response, David, because one of the most attractive pieces of the automated data catalog, and I know it's beyond the scope of lineage, is that it allows you to identify the data stewards. And that's where the final arbiter of that definition exists. So you can actually run reports against that. You can tag individuals to say, hey, the subject matter expert on this table, has this definition changed? That's a good point, and I can actually maybe share that right now. So let's say, for example, you have a unit price, and it says unit price in US dollars, including tax. So like Anil said, you have, of course, the data owners and the data stewards. But you can also just have a chat with them.
So if you need to ask a question, just click on the chat button, type in the name of the person, whoever the data steward is, and ask them a question; they'll get an email indicating that there is a question, and they have to come back here to answer it. And the point is, we had a lot of people ask, why don't you use Slack or why don't you use Teams? That would actually defeat the purpose. The reason is that when you ask questions and get answers, most likely that question and answer will come up again, maybe even the same day. And what happens with Octopi is you'll actually have it listed here, so that following users who are looking for similar questions can find the answers for themselves. So how is the lineage harvested by Octopi? Great question. Give me one second. And I hope to be able, oh, actually, I don't have that in this PowerPoint presentation. Actually, maybe I do. No, I don't. All right. So it'll take me a little bit of time to find that slide, but I'll explain it in any case. The way it works is Octopi sends you a client, and Anil can attest to this: the client setup literally takes no more than an hour or two. And of course, that's assuming that you have the appropriate permissions. If you do, it shouldn't take more than an hour or two. What you're doing is basically pointing Octopi to the various systems that you're going to be extracting metadata from, such as the ETL, the data warehouse, and the analysis and reporting tools. We give you full instructions on where we need you to point Octopi. Once you hit the run button, Octopi goes ahead, connects to those systems, extracts that metadata, and saves it in XML format. Those XML files, of course, can be opened and inspected to ensure that there's no data in them, which is another point that I want to make absolutely clear. We don't analyze data whatsoever.
So there is no data that's going to be going outside of your environment. It will be strictly metadata. Once you've confirmed that those XML files can be uploaded to the cloud, they are then uploaded to your instance, the customer's portal within Octopi. Once that metadata, those XML files, have been uploaded there, that triggers the Octopi service to run, and that's where all of the magic happens, where the algorithms, the machine learning, and the vast amount of processing power come into play to crunch that metadata and present it in the way that you see here today. And can Octopi automatically tag a business term from the conceptual data model to a column in the PDD in the data catalog? I didn't understand the question. I'll have to defer that one then, again, to our technical people. We will answer those questions via email after the webinar. I love it. I will make sure to get those over to you. And, sorry, let me just find it here: can Octopi parse Python scripts and display them in lineage? It's another technical question. I see SQL transformations. Yeah. So we do support SQL stored procedures, or any type of stored procedures from databases or data warehouses that are supported by Octopi. Python currently is not supported by Octopi. We are looking into developing that specifically. However, as I mentioned earlier, we are in the middle of developing open APIs, which will enable you to connect and read basically any Python or JSON script and so on, and extract the metadata from there. Awesome. And I saw a question in here, Anil, on what size, how large is your company? Zigo was recently acquired, but prior to our acquisition by Global Payments, we were fewer than 500. Can Octopi infer or determine all variations of customer ID across a data landscape, like CUS, CUS_ID, C_ID, CUSTOMER_NUMBER, et cetera, and show that they all mean or are the same thing? Absolutely.
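The inspection step David describes, opening the XML extract before upload to confirm it contains only structural metadata and no row data, could look something like this in spirit. The XML layout here is invented for illustration and is not Octopi's actual extract format.

```python
import xml.etree.ElementTree as ET

# A toy metadata extract: structure only (tables, columns, types),
# with no row values anywhere in the file.
sample = """<metadata>
  <table name="ORDERS">
    <column name="ORDER_ID" type="NUMBER"/>
    <column name="UNIT_PRICE" type="NUMBER"/>
  </table>
</metadata>"""

def summarize_metadata(xml_text):
    """Walk the extract and list every table.column it describes,
    so a reviewer can confirm only metadata is present before upload."""
    root = ET.fromstring(xml_text)
    found = []
    for table in root.iter("table"):
        for column in table.iter("column"):
            found.append(f"{table.get('name')}.{column.get('name')}")
    return found

print(summarize_metadata(sample))  # ['ORDERS.ORDER_ID', 'ORDERS.UNIT_PRICE']
```

Because the extract is plain XML, this kind of review can be scripted or done by eye, which is what makes the "metadata only, no data leaves your environment" claim verifiable by the customer.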
That's actually part of the lineage and what I showed you earlier. Yeah, absolutely. And how about data governance, like complying with CCPA, where a client can request deletion of their social security number, things like that? Absolutely. So that's one of the main use cases. If you need to ensure that, for example, a customer's social security number has been deleted, you need to know exactly where it is found within the environment, and Octopi can show you that. So many great questions coming in. Is it required to have PK/FK relationships connected between logical entities for a data catalog tool to show data lineage? Are entities enough in the LDM, or do we need relations? That is, again, like the first question. Apologies for that. No worries. There's a great question here: for the best data lineage, what is the prerequisite? Should the databases have primary keys and foreign keys all set up? Really, there is nothing else that you need to do. As I mentioned earlier, if it's technology that we support out of the box, you don't need to do anything to prepare for Octopi, and you don't need to work differently in order for Octopi to extract that metadata. The key point is that it's a technology that's supported by Octopi. If that's the case, that's where the algorithms come into play. We connect to those sources and we extract that metadata; through the analysis that we do with the algorithms, the machine learning, the processing power, and the fact that we analyze all three layers, the semantic, presentation, and physical layers, we're able to provide you with that lineage. There's nothing else that you need to do. How quickly is the harvest done and then translated into lineage? Also, how are systems connected together? Do you analyze feeds from one system to another? Yes, of course.
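The customer-ID question above, recognizing that CUS, CUS_ID, C_ID, and CUSTOMER_NUMBER all mean the same thing, can be illustrated with a toy normalization pass. This is only a sketch of the general idea; the abbreviation map is hypothetical, and Octopi's actual matching algorithms are not public and are certainly more sophisticated than this.

```python
import re

# Hypothetical abbreviation map; a real system would learn or
# configure these mappings rather than hard-code them.
ALIASES = {"CUS": "CUSTOMER", "C": "CUSTOMER", "NUMBER": "ID"}

def canonical_tokens(name):
    """Split a column name on underscores/spaces and map known
    abbreviations to canonical forms, e.g. CUS_ID -> {CUSTOMER, ID}."""
    return {ALIASES.get(t, t) for t in re.split(r"[_\s]+", name.upper())}

def is_customer_id(name):
    """True when the name resolves to the customer-identifier family."""
    tokens = canonical_tokens(name)
    return "CUSTOMER" in tokens and tokens <= {"CUSTOMER", "ID"}

for col in ["CUS", "CUS_ID", "C_ID", "CUSTOMER_NUMBER", "ORDER_DATE"]:
    print(col, is_customer_id(col))
```

Running this marks the first four names as the same logical field and rejects `ORDER_DATE`, which is the behavior the questioner was asking about.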
And even if they're in different locations; those, again, are major use cases for Octopi, providing you with that lineage. How is it done? As I mentioned earlier, we connect to the various systems and extract that metadata. For that entire process, the initial setup should take an hour; after that, it should take, you know, half an hour to do the extraction automatically, and that can be set up to run on a weekly basis. The upload shouldn't take more than a few minutes. The analysis can take up to 24 to 48 hours. It doesn't mean it always takes 24 to 48 hours, but it can take up to that. And then, for example, most customers will work this way: they'll upload a new extraction of the metadata on a Friday, and Monday morning they'll be certain to have a new version and can continue working with the development. Any other questions, Shannon? Yeah, sorry, I was talking into my mute button. Does Octopi scan and link other objects in the lineage, like XML, JSON, Avro, and flat file structures, or languages like C#? So languages in general are not supported by Octopi, except, if you want to call SQL a language, that is actually supported, as in the scripts. XML, yes, is supported. Flat files, yes, are supported in our discovery module, which I showed you earlier. What about source-to-target maps in spreadsheets? You mean, I'm assuming, the customer is asking if they have source-to-target maps in spreadsheets? No, I don't see that being supported. It's not necessary in any case, because Octopi will do that for you. Actually, let me answer a different way: if that question is asking whether we can provide a source-to-target map in an Excel spreadsheet, that would go against the reason for Octopi, which is the automation and being able to see that within Octopi; but within Octopi, you can export everything into Excel spreadsheets. And that should answer both sides of that question.
And what's your licensing and pricing structure? So Octopi is priced per platform and per module: there is one price for the platform, and one price per module. There is no charge for anything else. So all of the users can use Octopi with no additional costs. All of the training is included with that. The cloud fees are included with that. Maintenance and upgrades are included with that. Together with that subscription, you also get a dedicated customer success manager. So the moment you sign on with Octopi, we assign you a CSM, and they take you through the ropes from beginning to end and provide you with any amount of training necessary. And we can get into the specific details on what the costs are; if anybody wants to schedule a call, we can talk about your specific environment and I can give you exact pricing. I love it. Very nice. So can business terms be tagged, like for sensitivity, for personally identifiable information, PII, and that relation inherited or expressed to downstream objects, like a materialized view? So that when I am developing a report, I can see that field XYZ is PII. So, repeat that question, because I think I might actually be able to answer it. If not, I'll have to defer it. But I think I heard something about tagging; can you continue? Let me rephrase a little bit. So can I take sensitive information, personally identifiable information, and have that relation inherited or expressed to downstream objects, like materialized views? So that when I'm developing a report, I can see that this field is tagged as personally identifiable information. Yeah, sure. Absolutely. Within the data catalog, you can do that. You have the capability of tagging, so, for example, PII; and then of course you can see it through the lineage. If you want, you can actually trace the lineage, and then within the automated data catalog, you can see that that column, for example, is PII or sensitive.
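The tag-inheritance idea in that last answer, tagging a source column as PII and seeing that tag on every downstream object derived from it, amounts to a reachability walk over a column-level lineage graph. The sketch below shows the general technique with invented column names; it is not Octopi's implementation, just a minimal illustration of how a tag propagates along lineage edges.

```python
from collections import deque

# Hypothetical column-level lineage: each key is a source column,
# each value lists the downstream columns derived from it.
lineage = {
    "crm.customers.ssn": ["dw.dim_customer.ssn"],
    "dw.dim_customer.ssn": ["bi.materialized_view.ssn", "bi.report.ssn"],
}

def propagate_tag(lineage, start):
    """Breadth-first walk of the lineage graph: every column reachable
    from a tagged source column inherits the tag (e.g. PII)."""
    tagged, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in tagged:
                tagged.add(child)
                queue.append(child)
    return tagged

print(sorted(propagate_tag(lineage, "crm.customers.ssn")))
```

The same walk answers the earlier CCPA question: the set returned for a tagged source column is exactly the list of places a deletion request would have to touch.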
And can Octopi scan older programs used for ETL, like C, BASIC, Java, COBOL? No. I don't know that there is any technology that could still scan COBOL, but no, Octopi does not. Does it have the intelligence to show potential data relations from one data source to another? Yes, absolutely. And can you input, is it a CSV import? Is that an addition to the question that you just mentioned, or is it a new question? A new question. What is it, CSV import? Can we import CSV? Absolutely. And that would be supported within the data discovery. As I mentioned earlier, someone had asked a question about flat files and XML; it's similar. And I don't see GCP technologies, for example BigQuery, on the supported technologies page. Do you know if they're on the roadmap? Yeah, they are on the roadmap for later this year. And how does the lineage know an object is a BI report? Is it necessary to import metadata separately from a reporting server? No, absolutely not. As mentioned earlier, there's nothing that you need to do separately or differently to prepare for Octopi to work. The key criterion is that you're using technology that's supported by Octopi, such as Power BI, for example. If you are, we connect to it out of the box automatically with that initial setup, and we extract everything that we need. There's nothing else that you need to do. Can you show an example of your data masking? Data masking, I don't know that I mentioned that we have it. No, we don't. Okay, so let me take one step back. We do not analyze data. As we mentioned earlier, we're only analyzing metadata, so of course there's no need for data masking in Octopi. Makes sense. There's a lot of questions in the chat about data quality. So is there any ability to manage data quality in Octopi, and if so, how? So data quality is outside the scope of Octopi.
Having said that, though, we are working on BI for BI, or intelligence for the data intelligence team, the data intelligence environment, I guess, that is going to be developed and released in the next year, which will be able to give you insights and ideas about data quality and so on. And if there aren't any keys, foreign keys, in the database, is it still able to provide the lineage? I would imagine the answer is yes, because, while I don't know the exact answer to this question, I know that we don't need anything other than extracting the metadata. The only other thing that we might need on occasion is the connection parameters; that is the only other thing that Octopi would require in order to provide you with that lineage. So I would say the answer is yes. All right. I think the other questions are a bit technical, so I think that's all the questions we have for now. Well, David and Anil, this has been so great. I will get all those technical questions over to you that we weren't able to get to today, so we can get them included in our follow-up. Again, I'll send a follow-up email by end of day Monday for this webinar with links to the slides and links to the recording as well. Thank you, Shannon. I just wanted to take this opportunity to thank Anil once again. Really appreciate it, and thank you everyone for joining and listening to what we have to say. If you like what you saw, or if there's anything that piqued your interest and you'd like to find out more, we'd encourage you to schedule a call with one of our representatives to take you in more detail through everything that you saw here today. Thank you both. Thanks to all our attendees. Hope you all have a great day. My pleasure. Thanks.