Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager at DataVersity. We would like to thank you for joining this DataVersity webinar, Activate Your Data Lakehouse with an Enterprise Knowledge Graph, sponsored today by Stardog. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A panel, or if you'd like to tweet, we encourage you to share your thoughts on this session via Twitter using the hashtag #DataVersity. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just note that Zoom defaults the chat to send only to the panelists, but you may absolutely change it to everyone to network. To find the Q&A or chat panels, click the icons found in the bottom middle of your screen. As always, we will send a follow-up email within two business days containing links to the slides, the recording of this session, and additional information requested throughout the webinar.

Now let me introduce our speaker for today, Naveen Sharma. Naveen is Vice President of Product at Stardog. He is a highly regarded data management expert and seasoned product management executive who has helped organizations achieve significant growth through new product innovation and adoption. Among other positions, he has served as VP of Product at both Precisely (formerly Syncsort) and Pitney Bowes. With that, I'll give the floor to Naveen to get today's webinar started.

Hello and welcome. Thank you, Shannon. A quick voice check: I assume you can hear me clearly.

Yeah, you sound good.

All right, super. Let me go ahead and share my screen.

Yeah, continue.

Okay, let me put this in slideshow mode. Perfect. Good morning, good afternoon, good evening, depending on where you are. This is Naveen Sharma. As Shannon mentioned, I'm from Stardog; I run product management, product strategy, tech alliances, and partnerships here at Stardog, dialing in from a very cold and rainy day here outside of Washington, DC. I'm going to dive into the topic at hand, which is how to activate your data lakehouse with an enterprise knowledge graph.

From an agenda perspective, here are the key items we will cover today. First, the promise of cloud data warehouses, data lakes, and the lakehouse. There's a lot of discussion in the data and analytics space about the best way to organize data and modernize data and analytics, especially as more and more organizations look to capture the value and benefits of shifting storage and compute into cloud environments across different cloud providers. So I'll touch on some of the benefits of having data hosted there. Of course, where we focus our time and energy is on the real value of delivering that last mile: companies are looking to democratize data across a whole host of users for a whole host of use cases, and knowledge graphs that power a semantic data layer really play a part in that. Then I'll talk through an industry use case.
In this case I'll touch on life sciences. This is by no means limited to any specific industry, but it will give you a perspective on how a knowledge graph-powered semantic layer built on top of a lakehouse environment can really start to generate value for your users, so they can derive meaningful, more timely insights for their day-to-day operations. And to keep it real: I've been told this audience loves live demonstrations, so time permitting, I will do my best to walk you through an actual scenario of how this comes to life. And we'll leave enough time for Q&A.

I'll jump right into the next slide. Again, a lot of investment is going into the data and analytics domain within the enterprise, looking at ways not only to save on infrastructure costs but also to get to a point where all the data available can be leveraged to drive meaningful insights. That's key, because study after study — and this is just one of them — reflects the fact that businesses built on insights from data grow at more than 13% annually and are eight times more likely to grow faster than global GDP. I'm sure you can find a study you've read that articulates a similar set of metrics.

When we look at where most companies have started on that journey in terms of their modernization efforts, it has been to bring all that data into a place that reduces the total amount of friction in getting access to it, in a way that helps them derive insights that can then be turned into actions inside the enterprise: how they engage, when they engage, what products and services they offer. So really it's the adoption of cloud — reducing the amount of infrastructure that needs to be supported and paid for upfront as a capital expense, and instead treating cloud as more of an opex, pay-for-what-you-use operation. That has been a great driver of moving a lot of data and analytics infrastructure away from captive data center operations to publicly available cloud infrastructure. It's just cheaper. So why not go beyond the infrastructure and bring any and all of that data, in any form — structured, semi-structured, unstructured? And operations like moving away from traditional ETL to ELT, with technologies that have made it easy to bring information and data inside those environments, have really accelerated the movement of data into those cloud storage environments.
Of course, the big thing with leveraging cloud infrastructure is that you're also driving economies of scale with respect to what these cloud vendors are able to provide: data ops, monitoring of data, testing out infrastructure, moving from test environments to production environments, utilizing elastic compute, but also providing security at every level, from data at rest to data in motion. They're able to deliver those services cheaply because many companies leverage that infrastructure in a multi-tenant way. This is also the basis for unleashing rapid innovation, because the focus is less on putting together infrastructure — there's more automation there, and more focus on autonomous and augmented intelligence — and more on where the time and energy needs to be spent: on the analytics, on the analysis, on trying to find new opportunities to innovate with a new product, service, feature, or capability. That's part and parcel of how data is made available through these cloud infrastructures.

Cloud data warehouses came to life in support of business intelligence needs. At the same pace, we've produced data lakes, whose benefit is to support artificial intelligence and machine learning: building more forward-looking models based on a lot more data that wasn't previously available but now is, and then having those models actually be utilized inside enterprise and business processes on a day-to-day basis.

That's a good start, but what we've also seen is that we've created a dichotomy of value creation. Data has been made available, simplified, and cheap for day-to-day operations — cloud data warehouses to drive your business intelligence and reporting needs, and data lakes, on the other hand, to enable machine learning model development, utilizing all data in any form. But these two areas are really incompatible in terms of value creation. What that means is we end up creating data sprawl problems: at the end of the day, you move data into a big cloud data warehouse, structured a certain way to optimize your reporting and business analytics, but then if you want to move a lot more information for machine learning, you're creating more brittle ETL pipelines that, frankly, exacerbate the problem by creating multiple physical copies of the same data. Then the governance and security controls that sit on top either have to be replicated and duplicated, or are often not easy to operate across both fundamental areas of data and analytics. And there are latency challenges that come with slow updates: any time there's an update in your operational system, you're still moving data around into either a cloud data warehouse or a data lake.
And then, ultimately, data quality is still questionable. Yes, you've gotten data moving there faster than in previous times, but at the end of the day you're still not sure how good, clean, and valid that information is, and whether it's beneficial to your use cases or your analysis.

Lakehouse architectures have started gaining prominence from vendors like Databricks, making essentially the point that these two notions — the data warehouse and the data lake — really need to come together into a lakehouse architecture, where all data can be made available for analysis, whether for BI reporting or for data science and machine learning; where one security model can govern data at all levels, and fine-grained, role-based access controls can be put in place. You end up supporting use cases across a multitude of users, on both the BI side and the ML side. Latency is reduced because you're not making physical copies and moving data from one system to the other, so you're delivering higher-quality data to those analytics environments, and updates turn around faster because you're not waiting on data movement through brittle ETL pipelines. So the data lakehouse — it's early days, but there's a lot of promise in bringing these two worlds together, collapsing them to make them compatible.

But we also know that, despite best efforts — and again, these are early days for a lot of organizations — 70% of organizations want to be more data-driven now, yet 95% of them still struggle with operational challenges around data and analytics, and 88% continue to be hindered by legacy technology. We know legacy technology isn't going anywhere. There are applications powering operational environments that still sit on big mainframes; there are still applications that have been homegrown for certain industries, and replicating their process workflows is an ongoing challenge. So the struggle to bring insight out of some of those systems persists. Combine that with the fact that we now operate in multi- and hybrid-cloud environments, with applications sitting across different clouds, and those challenges become a lot harder for companies as they embark on this journey of gaining better insight and democratizing data and insight to more users.

When we look at where the points of friction still remain when it comes to sharing data and knowledge broadly, it fundamentally boils down to five buckets. If you're a data and analytics practitioner, these five categories should resonate with you. First, from a data culture perspective, we still operate with the mindset that we've got to bring all data together. The focus is on big data operations: on data collection, on data centralization, with control in the hands of specialists. That data culture carries over from past environments — if you think about it, that was the model people had when they talked about centralizing everything into a massive enterprise data warehouse.
And that still propagates throughout a lot of this new, modern stack. Second, data models themselves are tightly coupled to, and shaped by, the underlying data storage infrastructure. At the end of the day it's all about optimizing the type of information you want to deliver to your users, optimized for the queries you'll write — even though the data may not be represented this way in the real world, it has to conform to some third normal form in the database, in the traditional DBA modeling paradigm where relationships between tables need to be one-to-one or one-to-many. It's still very much an ID-driven, IT-driven exercise. Third, data integration is still fundamentally about ETL and ELT pipelines: you've got to make physical copies and have those copies shared across different systems. Fourth, data interrogation is still very much driven by the notion of "tell me what queries you want, and we will create those queries for you." Those queries will be limited to processing data within a single database — again, data centralization and queries optimized for that system — and the questions to be asked have to be identified up front, before the query is actually created and written. And fifth, from a data intelligence standpoint: a lot of work has gone into the data catalog domain, capturing the technical metadata separately and giving users visibility into what metadata is available. A great start, but it's still very passive in terms of the analytics and the value it delivers — a great place to catalog all the information, but not a real way to activate that information to drive recommendations. So those are the challenges, where the friction points still remain.

And if I may be bold enough to discuss some of the opportunities out there for organizations as they embark on their data and analytics journey: you've got to create a data culture focused less on data collection and more on data connections, because that's where context lives — value lives between two entities and how they connect and relate. It is about federating data, not centralizing it. It is about sharing data, not controlling it. And it is about focusing more on what we call wide data — data that sits across domains. You may create your central, domain-specific notional views or cubes, optimized for best performance, but ultimately the value sits in being able to connect the dots across those domains. For the data model, abstracting it out from the underlying data structure and representing it through concepts with business meaning — business concepts in a semantic layer — enables data uniformity and linkage. That's an important aspect of how a data model needs to be thought about: decoupling the data model from the underlying data infrastructure, so it's shaped not by the underlying data but by the value, and by how you, as the business persona, interpret data and information — just the way we as humans know to ask for
suppliers, assets, and products — SKUs. Representing that in the semantic layer in a way that has business meaning is the opportunity in constructing these data models. Of course, there's no one model that suits everyone, so you want a semantic layer that can enable and support a multitude of users and use cases, where you may shape the data model to fit your business need while, at the end of the day, utilizing the same underlying data — you're not making copies, you're not creating cubes. For data integration: well, not everything has to be moved and copied over. Data virtualization offers a lot more power now in terms of the ability to limit data sprawl and complex data pipeline development, and frankly it also enables access to data that is more just-in-time, or real-time, to support faster decision making. There's reduced latency when you can look at data sitting in an operational system alongside data in your analytical environment, versus making physical copies and waiting for that movement to occur and process before you can look at the information. Data virtualization has come a long way in supporting that. For data interrogation: enabling better, faster search and discovery, where you can run complex queries across a heterogeneous set of environments — enabling search-driven data exploration rather than forcing some predefined query execution. And lastly, from a data intelligence perspective, it's really about inferring relationships to drive intelligent recommendations, because you're now linking the metadata up to the semantic model itself, and that is far more active in its usability and value to the underlying data and analytics infrastructure.

Those are the opportunities I wanted to reflect on. This is enabled through what we call a knowledge graph-powered semantic layer, which facilitates that last mile. The lakehouse has come a long way in bringing all of that data together, colliding the worlds of traditional data warehousing and data lakes; the knowledge graph-powered semantic layer is the next level up in delivering value back to the users who need it most, letting them look at data in a way they conceptually understand and see how things relate in the context of their specific use cases. Gartner put this comment out as part of one of their reports: in a data fabric, one of the most important components is the development of a dynamic, composable, and highly emergent knowledge graph. It reflects everything that's happening to your data, and this core concept enables the other capabilities that allow for dynamic integration and data use case orchestration. We firmly believe knowledge graphs play a central role in the data and analytics ecosystem; this is really part of that data fabric foundation a lot of companies have started to embark on. Very simply put, an enterprise knowledge graph is a flexible semantic data layer for answering complex queries across data silos. And of course, this is where we unify data and metadata using the semantic layer that's in place.
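To make the idea of modeling business concepts concrete, here is a minimal sketch of what a small piece of such a semantic model could look like: a tiny insurance vocabulary expressed in RDF/Turtle and loaded through the pystardog Python client. The namespace, database name, endpoint, and credentials are hypothetical, invented purely for illustration — this is not the model from the webinar itself.

```python
# A minimal sketch, not the webinar's actual model: a tiny insurance
# vocabulary (Customer, Vehicle, owns) in RDF/Turtle, loaded with the
# pystardog client. Endpoint, database, and credentials are hypothetical.
import stardog

TTL_MODEL = """
@prefix :     <http://example.com/insurance#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Customer a owl:Class ; rdfs:label "Customer" .
:Vehicle  a owl:Class ; rdfs:label "Vehicle" .
:owns a owl:ObjectProperty ;          # a business relationship, not a join key
      rdfs:domain :Customer ;
      rdfs:range  :Vehicle .
"""

conn_details = {
    "endpoint": "https://cloud.stardog.example.com:5820",  # hypothetical
    "username": "admin",
    "password": "secret",
}

with stardog.Connection("insurance", **conn_details) as conn:
    conn.begin()
    conn.add(stardog.content.Raw(TTL_MODEL, "text/turtle"))
    conn.commit()
```

The point of the snippet is the shape: classes and relationships named in business terms, independent of whatever tables or files hold the underlying records.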
And the semantic layer becomes a living thing: you're actually evolving it as part of your data fabric, and you're delivering richer, more contextual data, because you can derive new relationships and new meaning just by logically connecting the dots — if A relates to B, and B relates to C, then A likely relates to C — and that knowledge and inference can be driven back to the user as a set of recommendations, or pushed back into your existing systems and workflows.

Let's look at it through a real-life example, in the context of life sciences. This is a major pharmaceutical company, and the big challenge they were dealing with was a lack of broad availability of internal and external data for decision-making stakeholders. Who were the critical stakeholders? Folks in research, clinical development, the commercial area, safety, and regulatory. The types of questions, issues, and challenges reflected in each area of the organization were the impetus for embarking on this and solving the data accessibility problem. On the research side: understanding the average number of months it takes to identify and validate a target. From a clinical development perspective: trial design and execution cycle times could be faster — how do you make them faster? From a commercial perspective: a missing omnichannel framework, with limited coordination between the sales force and other channels. From a safety perspective, understanding adverse reactions: how do we mitigate the adverse effects of a particular medication when all of that data and information sits across different systems? And from a regulatory perspective: getting all the information and data needed for regulatory approvals — how do you make that process go faster and much more smoothly?

This is a high-level pictorial representation of that set of challenges, involving data both inside the enterprise and external to it. And the end state — I want to bring the end state right up front — was this: bringing a lot of that data and information together, co-located within the data lakehouse, was step number one for them, part of what they called their enterprise data fabric. That foundational layer was great for all the data curation and data deduplication, and having all the data co-located there became very important. On top of it they built a semantic layer, where they harmonized data and brought together the insight from the various components at the semantic level — the level at which the business users who really wanted to benefit from this, in the context of their different departments, understood what molecules and compounds were, what studies and trials were, what regulatory meant, what toxicity meant, what adverse effects meant.
So they could go ahead and ask those questions much better than they could otherwise. At the semantic layer, they're able to connect the dots from an adverse effect of a medication down to the actual set of compounds involved, and to where those compounds might be used in other medications that may have similar effects on consumers. Being able to create that and support the different stakeholders was step number one.

What was also important was that they were able to reuse this platform to scale their digitization across the entire drug development process — from R&D to preclinical to clinical to regulatory to post-market — and to bring these domains together as entities within the semantic layer: drug target identification, compound repurposing, scientific search, drug target validation, automated reporting of adverse effects, traceable supply chain, all linked together, plus infectious disease planning. All of this brought together the entire knowledge of not only what researchers were doing and what studies were being conducted, but what compounds were being utilized, what medications were involved, who the suppliers were, and what the latest lots were, from a supply chain and distribution channel perspective. Connecting all those dots became a critical value creator for the organization, which meant they could begin to ask questions like: Are certain genetic conditions suitable to be treated with a particular drug? Can gene expression be used as a biomarker to understand whether a drug is delivering an effect? Which compounds have been tested in similar conditions with similar treatments? Show me all the lots of raw materials, and the associated suppliers, involved in the production of finished good lot number 123. How do COGS for a product compare between these two regions? Which manufacturers supplied the raw ingredients involved in a particular customer complaint? So again, looking at it realistically as part of the enterprise data fabric, with knowledge connected across internal and external sources, they are now in a position to ask and answer these questions better than they ever have in the past. You're democratizing the knowledge to more users inside the enterprise — users who don't necessarily have to come with some specialist skill set in order to look through this information and understand the entire data supply chain.

With that, I'm going to pause and pivot over, as promised, to an actual live demonstration of how this works. Of course, I can't take the extensive use case I've just shared with you live here, so I'll use a very simple one — a different industry, in this case insurance — to get the same point across: what it takes to build one and get started when you have a lakehouse foundation in place, and how you activate it with a semantic layer to support a whole host of use cases. So I'll get right into it. This insurance example mimics an insurance risk analyst who is looking at data and information currently sitting across different silos.
They need a complete profile of a customer's financial situation, including all their assets — what vehicles they own — and may be asking questions like: what risk is associated with flooding and fires, and so on. For the purposes of this demonstration, the Stardog platform — our enterprise knowledge graph platform — has materialized some data from CSV files and from a JSON file, and will also be connecting virtually to data sitting in Databricks. Databricks itself will be connected to the data, and any queries that get asked will generate push-down queries through the SQL endpoint that Databricks recently released, so the actual compute happens at the source. We're not copying any data from Databricks into the knowledge graph in this scenario.

So let's get right into it. Let me switch over here and move this around a little bit. Okay, I am connected to our enterprise knowledge graph platform in the cloud. We have three different applications here: Explorer, for viewing the data and knowledge that's been connected; Designer, a no-code visual environment for creating and maintaining your knowledge engineering process; and an IDE for system admins and advanced users who want to look at the underlying data infrastructure and write their own queries. I'm going to jump right into Stardog Designer to give you a flavor of how someone would go through this process. We'll create a new project; for our purposes I'll just use the webinar name.

So I am connected to the Stardog server in the cloud, and I'm going to pull in a model that has already been defined — you can imagine different users having put this model in place. Let me bring that in. What you're seeing here is a basic model describing insurance data: claims coming in from the claims management system, customer data from the customer system, insurance policies from the policy admin system, along with address information and quotes that have been created for a particular customer. The idea is that you've created a business-meaning-driven model that is representative of the type of information users are interested in and value. Each of these bubbles represents an entity, a specific business concept, and each has specific attributes attached. For example, for claims I've pulled in the amount paid, the date of loss, and the policy number; for the insurance policy I have the policy number and the premium, and there's some information related to the quotes themselves.

Now, as I said, I've also preloaded some of this data into the knowledge graph from CSV files and a JSON file, and I can look at that information already. If I go back into Stardog Cloud and open Explorer, I'll connect to the same database — my demo_naveen_sharma database. Let's say I pull up a customer — say, Bob Styles — and search.
And I see Bob Styles here. I see all the information related to Bob: date of birth, email, label, what he owns, what claims have been paid out. I can also start to explore this in a graph visualization. I can see that Bob Styles is of type Customer — so I'm tying the actual instance to the Customer class, the Customer entity. That's the metadata level, almost the vocabulary or the ontology — different ways of saying the same thing — and this is an instance of one of those: it happens to be Bob, at that address, with specific claims associated with that individual. So I can already explore some of this data.

Now, what the insurance risk analyst wants to know is what assets are associated with a particular customer — with Bob — assets being vehicles in this case. And I've been told the vehicle data exists inside Databricks. So I'll go back into my Designer environment and look at how I can bring that into my model. As part of this, I'm going to create a new class; we'll call it Vehicle. I'll give it a description — again, for the specific business concept I'm interested in — and I'll pull in an MSRP attribute, create that, and also put in some model information, in terms of what model the vehicle is. So it's a simple class with two attributes, nothing to it. Then I want to be able to say customer owns vehicle, so I'll pull that in, and that's a relationship I've established between the two: the customer-owns-vehicle relationship. And what I want to do is pull that data in virtually through my knowledge graph.

The other thing — and I'll get to it as well — is my Tableau report, which I already have access to. One of the things we're able to do is push this data into the systems you're most familiar with: there is a BI/SQL endpoint that makes the knowledge graph look like a SQL source, so if someone wanted to pull the same information into a Tableau report, they're able to do exactly that. So that's another way to look at this information. The same classes and entities — address, claims report, customer, insurance policy, quote — that are represented in my knowledge graph as business concepts are available as tables here. If I pick one of these, I can see the specific information related to it, and this information is coming directly from the knowledge graph as well.

So I'll go back, and I now want to add a new project resource. This time I'm going to pull data in from Databricks. This is an environment sitting in a separate cloud, Azure. I'll pull that in and call the resource "databricks". We'll select the insurance table and make sure this is actually something we can fulfill — that the table is active. This depends on whether that particular instance is up, and it's always an interesting moment in a live demo to see whether I can actually pull this information or not. The good news is that it's up and the information is available to me: I can see the model, the color, the year, the mileage, the sales price. The information has been verified and validated, so I can create this resource and bring it into my semantic modeling layer.
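A quick aside on that BI/SQL endpoint: because it presents the knowledge graph as ordinary tables, any SQL-speaking client can read it, not just Tableau. Below is a minimal sketch of that table-style access from code. The host, port, credentials, and table name are hypothetical, and it assumes the endpoint speaks a MySQL-compatible wire protocol, so a standard MySQL client library works.

```python
# A minimal sketch, with hypothetical connection details: reading the
# knowledge graph through its BI/SQL endpoint as if it were a relational
# database. Business concepts from the model show up as tables.
import pymysql

conn = pymysql.connect(
    host="bi.stardog.example.com",   # hypothetical BI endpoint host
    port=5806,                       # assumed BI server port
    user="analyst",
    password="secret",
    database="demo_naveen_sharma",   # the demo database from the walkthrough
)
try:
    with conn.cursor() as cur:
        # Each business concept (Customer, InsurancePolicy, Claim, ...)
        # is exposed as a table; its attributes become columns.
        cur.execute("SELECT label, premium FROM InsurancePolicy LIMIT 10")
        for row in cur.fetchall():
            print(row)
finally:
    conn.close()
```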
Back in Designer: with the resource created, the next exercise is beginning to map it out. If I want to map this information to my Vehicle class, I can start to do that. Let's call this "mapping one" for now, and we'll map it to Vehicle. Here I know my primary identifier is this ID. I'll go ahead and map the model column from the actual Databricks table to the semantic class I've already defined — the business concept. I'll also bring the sales price in and map it to MSRP, and I'll set the label so you can look at it by model name, and stop there. Then I'll also create another mapping to capture the actual relationship, so I can make that connection as well — owner ID, mapped to the customer ID itself — and the relationship is established. So I have that modeled in from Databricks; it becomes a virtual source for me to look at this information through.

I can go ahead and publish this out to my Stardog server. It asks me where I want to publish; I'll publish to the same demo database I created before. Now that it's published, if I go back into my Explorer view, I can start to pull some of this information back in. Of course, I probably need to refresh so the view is connected to the source. And just for the full experience, I'll close this out so we can walk through the same flow we did before, starting from a particular user. Again, I'm connecting to the same database, and our customer is Bob Styles; we search and look at the information. As I'm doing this, I'm also checking my settings and making sure I have a default graph in place — I'll use the webinar one and save that. What I'm starting to look at now is information from the virtual source, and I can see the model. Regarding named graphs: the notion of named graphs is graphs created for a specific use case, which in this case I don't have, so I'll just go ahead and use that particular one. That generates the virtual view and the connection back into the graph I'm pulling from — in this case I modeled it for the webinar — so I'll save that. Going back, you can see I now have the Vehicle class defined; before, I did not have that defined. So if I look here for a customer — same thing, Bob Styles. Let me see; I want to make sure all the data is in here and refreshed. Sorry — I forgot to refresh this, actually.

The other thing I'm starting to see: if I go back into my Tableau, the same information that was available to me before — the new data is showing up here as well. So what was sitting in my knowledge graph is now pulling in information from the Databricks database too. There are different levels at which I can connect to this: I can virtualize it and surface it through Tableau, and I can also connect through the Python API and make it available inside my data science notebook. And I can come at it from more of a research perspective.
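For the Python API route just mentioned, here is a minimal sketch of what that notebook access could look like, assuming the pystardog client. The endpoint and credentials are hypothetical, and the class and property IRIs follow the illustrative insurance vocabulary from earlier rather than the webinar's actual database.

```python
# A minimal sketch, assuming the pystardog client and hypothetical IRIs:
# one SPARQL query spanning materialized data (customers from CSV/JSON)
# and virtualized data (vehicles mapped from Databricks).
import stardog

conn_details = {
    "endpoint": "https://cloud.stardog.example.com:5820",  # hypothetical
    "username": "analyst",
    "password": "secret",
}

QUERY = """
PREFIX :     <http://example.com/insurance#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?customer ?vehicle ?msrp WHERE {
    ?c a :Customer ;
       rdfs:label ?customer ;
       :owns ?v .              # the relationship mapped to Databricks
    ?v a :Vehicle ;
       rdfs:label ?vehicle ;
       :msrp ?msrp .
}
"""

with stardog.Connection("demo_naveen_sharma", **conn_details) as conn:
    # Portions of the query touching the virtual source are pushed down
    # to the Databricks SQL endpoint; the rest is answered locally.
    for b in conn.select(QUERY)["results"]["bindings"]:
        print(b["customer"]["value"], b["vehicle"]["value"], b["msrp"]["value"])
```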
Coming at it from that research perspective: this is, again, a user who may not have access to either a data science notebook or a BI tool, but actually wants to start exploring data — a very simple use case of doing that through the lens of Stardog Explorer.

One other thing that, in the interest of time, I'm not going to work through live, but want to make sure you understand — I'll pull up these slides — is that you can also run what's called inferencing. With inferencing, I'm able to infer relationships. Say taxes were assessed on a particular address that is owned by a customer. I don't have to physically manifest a "customer owes taxes" relationship in my semantic model as something explicitly defined. What I can do instead is run inferences based on a rule, and the rule says: customers who own an address owe the assessed value of the taxes on it. Having that rule, plus the fact that these taxes were assessed on an address that belongs to this customer, we can infer the relationship that the customer owes these particular taxes. That's the power of inferencing running on the semantic model: these relationships don't have to be explicitly defined in the semantic model itself.

Moving right along, this is the notion of supercharging your own analytics. I gave you a very simple, basic example of a Tableau report and a simple view of the type of data in my model. You can have other information — things like when a property was built, what it was assessed for, what insurance risks are associated with that location in terms of floodplains and risk zones: again, external data you're pulling in. That information can be searched through the Explorer view, utilized inside a reporting tool like Tableau, or used within your own data science notebooks, through our Python API, as you start to build machine learning models.

So really, the goal here is to help close the last mile with a knowledge graph-powered semantic layer. Three key takeaways. Number one: you don't have to bring all the data together inside a centrally located repository. You can still include all the data you need, and some of that data can absolutely be virtualized without making physical copies. This is very helpful when you want to support ad hoc data analysis or address new what-if scenarios and challenges without going through a whole exercise of pipeline development. Number two: you model the way you think. You think about the world a certain way, in terms of business concepts, and that's how you create the model — a whiteboard-style, canvas-style model development exercise — and that becomes your semantic layer, abstracted from the actual underlying data structure. And number three: you're able to uncover new insights, and those insights can leverage logical inferencing, or statistical inferencing using machine learning, to infer new connections between the data, regardless of the domain, and help you uncover new patterns.
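To make the logical-inferencing takeaway concrete, here is the taxes-owed example from a moment ago written out. Stardog evaluates rules like this natively at query time with reasoning enabled; purely as an illustration, the same inferred edge can also be derived on demand with a SPARQL CONSTRUCT. All class and property names here are hypothetical.

```python
# A minimal sketch with hypothetical names: deriving the inferred
# "customer owes taxes" edge. The relationship is never stored; it
# follows from ownership of the address the taxes were assessed on.
import stardog

conn_details = {
    "endpoint": "https://cloud.stardog.example.com:5820",  # hypothetical
    "username": "analyst",
    "password": "secret",
}

RULE_AS_CONSTRUCT = """
PREFIX : <http://example.com/insurance#>
CONSTRUCT { ?cust :owes ?tax }
WHERE {
    ?cust a :Customer ;
          :owns ?addr .          # the customer owns the address...
    ?tax  a :TaxAssessment ;
          :assessedOn ?addr .    # ...on which the taxes were assessed
}
"""

with stardog.Connection("demo_naveen_sharma", **conn_details) as conn:
    # Returns the inferred triples without ever writing them into the graph.
    print(conn.graph(RULE_AS_CONSTRUCT))
```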
What does it all result in? Results and outcomes. From all the analysis we've done working with our clients: improving the productivity of your teams, on both the data analysis side and the data science side; making sure you can bring new products and services to market quicker, so there's a shorter time to market; and potentially creating new revenue sources — new revenue streams that get uncovered. And all of that is only possible when you can bring together the knowledge of all the data that is available, connect that knowledge up to the semantic layer, and deliver it to the people making those decisions — people who don't have to be experts in a particular technology, or frankly even in something as basic as writing SQL code.

So I'll leave you with that thought. This is the last slide: helping modernize your analytics — supercharging analytics — is a key use case of a knowledge graph-powered data fabric, accelerating your investments in data lakes and the lakehouse, whether with Databricks and the like or with Amazon S3; powering what we call semantic search-based exploration, which could be not just through the tool I demonstrated but within your own applications, through a GraphQL-based API, as many of our customers have done, in homegrown or COTS applications; and powering recommendations and recommendation engines. Knowledge graphs are great at bringing knowledge together, and as we discussed, from a pure data analytics perspective, by linking metadata to the data and inferring, you're able to drive new recommendations as well. I'll leave you there. Shannon, this is probably a good time to see if there are any questions I can answer — certainly happy to answer as many as I can in the time we have.

Thank you so much for a great presentation; there are lots of questions coming in. If you have questions for Naveen, feel free to submit them in the Q&A portion of your screen. And just a reminder, to answer the most commonly asked question: I will send a follow-up email by end of day Monday with links to the slides and the recording, along with anything else requested. So, diving in here: with respect to the case study presented, what was the timeframe for integration and implementation?

That's a great question. We typically find with most customers — and frankly, whether they're thinking this way or not, it's our prescribed approach to tackling these challenges — that you come at it through the lens of the business outcomes they're looking to achieve, and the part of the organization involved. When we talked about the different stakeholders — the research group, the clinical group, the commercial group — we almost look across the organization and ask: who's got the most to lose by not having access to all this data? What analytical insights, what types of questions, are important in driving that outcome or value to the business? Then, looking at it from that perspective and that lens: what is the specific set of business concepts involved, regardless of the location or structure of the data? So again, approaching this from a use-case-centered
creation of a data fabric, then evolving it into a larger enterprise data fabric that can promote more reuse and sharing. We see anything from, typically, two weeks to three months for enabling a set of use cases, to projects that — depending on how much they want to bite off in a single go — can take a couple of months for the next use case, and then a couple of months for the third.

Great. So, how does the tool interface with data catalog tools?

For companies who have invested in data catalog tools, we certainly don't want them to have to start from a blank canvas. The power of having a data catalog available — like Collibra, or like the Unity Catalog that Databricks recently launched, for example — is that those catalogs can serve up metadata that has already been classified as the basis for the specific business concepts we're going to model. If a business glossary is in place, for example, and you've already defined a customer, a supplier, an asset, we can ingest that to accelerate the knowledge engineering process. It's also a great way of tying the underlying metadata in the data catalog to the actual model that gets queried by the user at the semantic level. And when those queries come in, there's bidirectional value back into the data catalog, because we can now share which business glossary items or concepts are being referenced most, how often, and by whom — and, when a particular business class or concept is being asked about, which business concepts or classes are being returned in the query responses themselves. That pushes knowledge back into the data catalog, adding to the value of collecting everything and anything about that metadata. Another way we've seen clients benefit is in expanding the view across the entire data universe: depending on which data catalog you're using, you're not always able to visualize and expand out all the data that connects across the majority of these systems, so being able to visualize it as a knowledge graph is a great way to bring that data catalog view in front of users.

Awesome. There are so many great questions here; I'm going to try to get to as many as possible. So: with respect to environmental, social, and governance — ESG — to what degree can a data lakehouse facilitate an organization's ability to better monitor and manage energy consumption and carbon footprint reduction?

Okay, you'll have to restate that for me one more time.

Yeah, sure. With respect to environmental, social, and governance: to what degree can a data lakehouse facilitate an organization's ability to better monitor and manage energy consumption and carbon footprint reduction?

So look — at the end of the day, the data lakehouse is a great collection point for a lot of the information that's been gathered. That knowledge then has to be enabled in the hands of the right folks who make those decisions. So you take all of that knowledge and data,
lift it into the semantic layer, and make it available so it can be actioned on by the users. We have organizations that do data sharing as part of open data standards, and this is a great way to promote an ESG-type ontology that defines a common book of vocabulary definitions that can be shared by all companies looking at this — in the way they report that information, the way they consume it, the way they collect it. Having it at that level makes it that much easier. And then, of course, data sharing: as part of, say, an open data initiative, you're able to make that available as a set of knowledge graph queries that can be asked and answered. Where are we in terms of progress? What more has to be done? Who's behind? Who hasn't provided information? What information is available? All of those questions can be easily and readily answered. I like the idea of an open data initiative, an open data consortium, around ESG. Lots of opportunity there.

Thank you. What is the difference between a data lake and a lakehouse?

Yeah — at the top of the presentation I touched on the key fundamental difference we're seeing in the market. Data lakes are great in terms of co-locating a lot of data together, but they still have challenges with latency; they still have challenges with security, at least at the level of fine-grained, role-based access control; and quality — the quality of the data is still questionable. Yes, storage is cheap, and a data lake is great because it lets you bring a lot of data, on the cheap, into one central location. But where data lakes still lack the ability to deliver value: you still don't have good governance on top; you still can't fully recognize all the metadata that's available for use, or what the quality of that metadata is; and then there are the latency challenges of linking it up to some sort of operational or reporting system that can query all this information in ways that can be readily and easily consumed by knowledge workers across the enterprise. That's where the lakehouse has certainly helped address some of those challenges within the data lake. The lakehouse is a great step forward, and the semantic layer on top really starts to address that last mile in the power to democratize data to more users.

I'm going to try to get in one more question here: is GraphQL the same as a knowledge graph?

GraphQL is a way to represent a piece of what's inside the knowledge graph through an easily consumable application programming interface. When you want to connect your own applications, think of it as getting a REST-style interface for graph-type queries. It's an easy way to have that: a contract-driven data access endpoint into the knowledge graph.
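As a sketch of what that endpoint looks like from an application: a GraphQL query posted over plain HTTP. The URL pattern, database name, credentials, and field names below are hypothetical, assuming a per-database GraphQL endpoint along the lines of what Stardog exposes.

```python
# A minimal sketch, with a hypothetical URL and field names: asking the
# knowledge graph's GraphQL endpoint for customers and the vehicles they
# own, without writing any SPARQL.
import requests

GQL = """
{
  Customer {
    label
    owns {
      label
      msrp
    }
  }
}
"""

resp = requests.post(
    "https://cloud.stardog.example.com:5820/demo_naveen_sharma/graphql",  # assumed pattern
    json={"query": GQL},
    auth=("analyst", "secret"),
)
resp.raise_for_status()
print(resp.json())
```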
Now, the knowledge graph itself typically has a persistence layer underneath. But in an enterprise knowledge graph like ours, while we support the ability to persist data, we also have the ability to virtualize data for any queries coming in — whether through GraphQL, through Tableau via our SQL endpoint, or through Python and a data science notebook. The data might be sitting in Databricks, as I demonstrated in this example: we take the query and push it down to the actual source, without having to materialize and persist all that data inside the knowledge graph.

We have connectivity questions here too — if you have a link I can send in the follow-up for what Stardog connects to, that would be awesome. And: can this tool get content from leading data modeling tools such as erwin and ER/Studio?

Okay, so at the end of the day, we can connect to anything that supports open standards like JDBC, and to anything that provides a RESTful interface. There are many different ways to do that. Certain systems are a little more closed-in and proprietary, and we've made efforts to connect to those systems directly. And we have a whole host of 150-plus connectivity options that we're happy to share as a link.

There's also a pricing question here, which is always a great question and a great sign of interest, so that's awesome — we'll get you links to those things in the follow-up email as well. I'm afraid there are more great questions here than time we have for this webinar. Naveen, thank you so much for this great presentation, and thanks to Stardog for sponsoring today's webinar. Again, I will send a follow-up email to all registrants by end of day Monday with links to the slides and recording, as well as links to the additional information you've requested. So, Naveen, thank you so much.

Thanks to our community — love it. Thank you. Cheers.

Cheers. Have a great day, everyone. Thank you.