Hello everyone, and welcome to our next EDW session, "City Furniture Scales Home Furnishing Business with Data Virtualization," presented by Ryan Fattini and Inessa Gerber. Ryan is the Director of Data Engineering at City Furniture, and Inessa is the Director of Product Management at Denodo. All audience members are muted during these sessions, so please submit your questions in the Q&A window on the right of the screen, and our speakers will respond to as many questions as possible at the end of the talk. Please note that there is a linked form at the bottom of the page titled EDW Conference Session Survey; this is where you can submit session feedback, and we encourage you to do so. So let's begin with the presentation now. Thank you, and welcome, Ryan and Inessa.

Thank you, Lewis. As Lewis mentioned, I'm the Director of Data Engineering at City Furniture, and I'm going to tell the story of how City Furniture, a Florida-based furniture company, went from a data team of basically zero to a team of 20 and growing in a matter of about two years, with a data culture shifting from a descriptive, reactive strategy to a proactive, predictive, and basically aggressive real-time approach. A quick note about City Furniture, if you've never heard of us: we are a Florida-based furniture retailer with 33 showrooms and annual revenue approaching one billion dollars, with plans to expand further north into Georgia and even out west in the next five years; that's basically our five-year roadmap.

Our data story begins in 2018. The current state of the City Furniture 2018 data architecture looked something like this. The nucleus of the City Furniture business is the iSeries, which also goes by the name AS/400, the green screen. This is basically City Furniture's source of record, the transactional machine that handles most of the core business. We have several other data sources from different applications serving different functions and different departments: things like Salesforce and HighJump, and so on. If anyone's unfamiliar with the iSeries, it's IBM technology that goes back decades; City Furniture has had it implemented since about the 1980s. Overall it's a pretty solid system, but it can be limiting in a lot of ways, and there is an ongoing tech transformation at City Furniture, basically an effort to migrate away from the iSeries, and part of that tech transformation is data.

So in 2018, our data team consisted of zero engineers and zero data scientists. We had one digital analyst. We didn't have an enterprise architect, and we had our software teams essentially building the report logic and maintaining the reports, which were delivered through the SpoolFlex system. The consumers were primarily senior management, so you're talking your C-level people, your VPs, and the reports were essentially PDF files that got emailed out on a regular basis. Obviously this is far less than ideal: your software teams are supposed to be maintaining and developing your core technology, not carrying reporting, they're not analysts, and any reduction in resources for our core systems is always a negative. So the company started thinking about data, and the first step was that we basically needed a data warehouse. The data warehouse presented a few problems. One of them was that our iSeries system was designed decades ago, basically when space was at a premium.
So we're talking about numeric dates, non-standard formats, no primary keys, no key constraints. Space was at a premium then, where now it's really the opposite: space is cheap and time is at a premium, so the time-complexity problem has kind of shifted. But because of this design from back in the mainframe days, it presents a lot of problems when you're talking about real-time replication and data warehousing, which brings us to the next slide.

So the solution was an IBM tool called Change Data Capture. In the research and planning phase there were several different tools, several different options, several different warehouses, and these things got debated over and over again in gory detail. Based on City Furniture's very particular data issues, this mainframe-era, space-driven data design, the Change Data Capture tool from IBM seemed like the best fit, at least from what we found, to solve these types of problems. The architecture was very simple: install Change Data Capture on a Linux server, replicate from the source DB2 agent on the iSeries to a DB2 agent on that server, which passes it on to Db2 on Cloud, our cloud warehouse environment, which was part of the IBM stack. And it worked relatively well.

There were a few problems we had to solve to make any of this happen. One was the lack of primary key constraints. If you have no primary keys and no constraints, how are you going to replicate real-time data? You need to make an update, you need to make a delete; how are you going to look up anything if you have no confidence in your keys? So we basically had to create scripts that could take a table that has no compound keys and no primary key set, and run algorithmically, recursively, to determine what the smallest set of compound keys would be, and we used the results of that algorithm to set our keys. That wasn't good enough either, because those are keys we're determining algorithmically with no constraints at the source, so we could still run into the problem where those particular key values change. But because of how the Change Data Capture system journals and scrapes the source, we're able to pass the old values and the new values into our triggers. So one of the ways we were able to replicate this data in real time, with no compound keys and no primary key constraints, was to pass the old and new values into the triggers, look the row up by the old values, and then swap in the new values. That's how we're able to maintain this real-time system.
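Before moving on, here is a minimal sketch, not City Furniture's actual script, of the kind of key-detection step described above: searching for the smallest combination of columns that uniquely identifies every row of a keyless table. The table and column names are hypothetical.

```python
from itertools import combinations

import pandas as pd

def smallest_candidate_key(df: pd.DataFrame, max_width: int = 4):
    """Return the smallest combination of columns whose values are unique
    across every row, i.e. a usable surrogate compound key.
    Tries single columns first, then pairs, and so on."""
    for width in range(1, max_width + 1):
        for cols in combinations(df.columns, width):
            # A combination qualifies if no two rows share the same values
            # across all of these columns.
            if not df.duplicated(subset=list(cols)).any():
                return list(cols)
    return None  # nothing found within the width limit

# Hypothetical keyless staging table pulled from the replication target.
orders = pd.DataFrame({
    "store_no":  [1, 1, 2, 2],
    "ticket_no": [10, 11, 10, 11],
    "line_no":   [1, 1, 1, 1],
})

print(smallest_candidate_key(orders))  # -> ['store_no', 'ticket_no']
```

The columns found this way still carry no source-side constraints, which is why, as described above, the CDC triggers also receive the old and new values: an update can locate the row by its old key values before the new ones are swapped in.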
Currently our data warehouse has roughly 150 tables replicating, and to make that happen, along with some of the complex roll-ups of those staging tables, we have about 150 stored procedures. So it's a relatively complex machine to get this source data replicated, conformed, and cleaned up in a way that has any analytical value whatsoever.

This brought us to what I would call City Furniture's paradigm shift. We have a real-time data warehouse, we can deliver clean data, and we implemented a BI tool, Cognos, IBM's enterprise BI tool. So now we're delivering real-time KPIs from our source-of-record transactional data, the iSeries, in real time, and our user base has grown to somewhere around 100, 200, 300 people consuming our product. We've been able to expand our coverage a little bit into other departments, and one of the primary products we delivered was a sales KPI dashboard we could put in the hands of all of our sales associates, tracking all of their transactional metrics, money received, business written, all of the different KPIs for the sales associates, with goals, with intraday goals, real-time to the second. Clearly an ROI, clearly a lift here, but this was just the beginning of our problems.

Before the cork landed on the ground after we popped the champagne, we got an avalanche of data requests coming from these other departments, with these other tools, from these other sources like Salesforce and HighJump and some of the operations systems, and a lot of flat files sitting on our local drives. Everybody now wanted a piece of this data, of this delivery. So what we did at the time was lean into our engineering backgrounds and just start building ETL scripts. We spun up a Node server on Linux and started building scripts, scheduling them on cron, to do all of this ETL from Mongo, from MySQL, from the Excel files, from Google spreadsheets, from every file imaginable that they needed, writing them into the data warehouse, creating new DDL, creating new tables for this data, taking non-relational Mongo data and creating relational tables, and we were doing this at a frenzied pace. We could not keep up with the demand. The demand was overwhelming; it was like selling water in the desert, basically.
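For context on the kind of script the team was cranking out at that stage (the real ones ran on that Node server, but the shape is the same in any language), here is a minimal, hypothetical Python sketch of a cron-scheduled Mongo-to-warehouse load. The connection strings, names, and the sqlite3 stand-in for the warehouse are illustrative assumptions, not the actual City Furniture code.

```python
# Illustrative only: a hand-rolled ETL job of the kind described above,
# flattening a Mongo collection into a relational staging table.
import sqlite3
from pymongo import MongoClient

def load_web_orders():
    mongo = MongoClient("mongodb://localhost:27017")      # hypothetical source
    warehouse = sqlite3.connect("warehouse_standin.db")   # stand-in for the real warehouse
    warehouse.execute(
        """CREATE TABLE IF NOT EXISTS stg_web_orders (
               order_id TEXT PRIMARY KEY, customer_id TEXT, total REAL)"""
    )
    for doc in mongo.furniture.web_orders.find():
        # Flatten the non-relational document into relational columns.
        warehouse.execute(
            "INSERT OR REPLACE INTO stg_web_orders VALUES (?, ?, ?)",
            (str(doc["_id"]), doc.get("customerId"), doc.get("total", 0.0)),
        )
    warehouse.commit()

if __name__ == "__main__":
    # Scheduled on cron, e.g. every 15 minutes:
    # */15 * * * * /usr/bin/python3 load_web_orders.py
    load_web_orders()
```

Multiply something like this by every source, every schema change, and every new request, and the maintenance burden described above becomes clear.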
At the time, and now we're in 2019 Q3, our team consisted of one enterprise architect, one IBM consultant, three data engineers, and four analysts. That's a jump from 2018, where we had one digital analyst. So we were already starting to grow, but the demand was far outpacing our growth. So our enterprise architect, our CIO, and engineering kind of went into the war room and decided we needed another solution, because if we were going to use software development resources, whether our data engineers writing software to do the ETL, or our core software teams, who are basically running the iSeries and our nucleus systems, we'd be going all the way back to that first slide, where we were leaning on software engineering to handle our reporting problems. We'd basically have gone in a big circle: we'd abstracted out another layer, but we would essentially be back to software engineering carrying the burden of reporting.

This is when the idea of a virtualization layer came into the picture. Denodo came highly recommended: everybody we talked to, Gartner, a few other sources, said Denodo was the solution for the virtualization layer, the product that was going to handle this centralization problem, the standardization problem, and take the burden off software engineering. And we thought this was perfect. So we ran a Denodo proof of concept. Denodo allows you to do a proof of concept, and I would highly recommend that anybody considering this tool do one: spin up a server, set it up, and see if it can do what you need it to do.

Our enterprise architect went through the proof-of-concept process, plugging in some different data sources, doing some simple standardizations, a little bit of data modeling, and he found that this was absolutely perfect, exactly what we needed. So we implemented Denodo. Our architecture in 2020 to 2021 looks like this. Now we have an enterprise architect, two data engineers, four data science engineers, we call them that because they're kind of a hybrid between engineering and data science, and seven analysts. Our users are now over 1,000: we're basically covering every sales associate, most of the critical departments and stakeholders, even at the manager and senior manager level. We're delivering KPIs, reporting, and predictive forecasting to over 1,000 users, a big jump from 2018, when we had maybe 20 or 25 people consuming reports.

Plugged into our Denodo layer we have BigQuery, which kind of serves as our digital data warehouse, so we plug BigQuery into the virtualization layer. We plug our data warehouse directly into the virtualization layer as well, so the data warehouse is one of the sources of this centralized layer. And then all of our other databases, Mongo, MySQL, SQL Server, HighJump, all the different systems, Google spreadsheets, flat files on our F drives, which are our internal servers, everything could be plugged in relatively quickly. From a software perspective, manually writing and maintaining all those scripts and ETLs would have taken hundreds of hours; we created all these connections in a matter of days. So this changed the entire game for us, because we had been moving at such a rapid pace, and then when the demand really hit us, once we went live with the real-time KPIs, we started to bottleneck again, we started to slow down. Clearly the solution wasn't software engineers hacking away feverishly at these scripts. Denodo basically unblocked us, and from there it was clear sailing; we're able to handle the demand. When demand comes in, we can fairly quickly set up these data sources and create these standardized views, these standardized layers.

Getting into the nuts and bolts of how we built out this architecture in Denodo: there are many ways you can build out your directory and what Denodo calls virtual databases. We basically have a data warehouse source database where we have all of our initial connections, and then we have a data warehouse logical layer, a canonical model so to speak, where we create integration views from the base views of that first layer, and from the integration views we create interfaces. Think of an interface as your contract with a customer. Going from the base view, which is the raw representation of the data from the source, to the integration view is where you do your conforming: your derived columns, your different fields, any type of modeling you want to do. And the interface is simply a reflection of the contract over that integration view. We expose only the interfaces to the customers, basically to the analysts. From there, because Denodo has a very good optimization engine, the way we try to tap into that engine is to use other virtual databases for very specific things.
So if we have a very specific report that we need to send out, or very specific cross-source views, complex roll-up type tables, we create another virtual database specifically for that report and build and run the models there. That gives Denodo's optimization engine a better shot at detecting query patterns on that specific report. And we'll do it for departments too: if there's data very specific to a department that no one else is going to need, or it's unlikely anyone from another department will need, we'll use a dedicated virtual database for that department. So we might have a supply chain virtual database, report-specific virtual databases, and even API-specific virtual databases.

So what are we weaponizing in Denodo? The problem Denodo solved for us was the centralization problem, in other words, how do we centralize all this data? But unbeknownst to us at the time, it has many more weapons under the hood. We're on Denodo 8, so we're running the latest and greatest version. It's got a beautiful web client, so you can do a lot of work right from your browser; you don't need to shell into the server.

The REST API that Denodo offers is absolutely fantastic, and I feel like it's underutilized; Denodo didn't even really sell this part to us, and it's something I believe should be hard sold, because we're using the REST API for multiple different things. One of the things we can do with it: our stakeholders love Excel. Most non-technical people in an organization know how to use Excel; it's something people are comfortable with, they know how to use it, they might even know a few functions, pivot tables. So with the REST APIs we can build views, queries, complex queries combining different data sources, schedule them to run, say, every 15 minutes, and then deploy them as a REST API delivering JSON. That REST endpoint can be integrated into an Excel sheet, because Excel can make GET requests. So you take that REST endpoint, plug it into Excel, and every time they open Excel or refresh it, they're basically calling their API and getting their view in the Excel sheet, and then they can do their pivot tables and all that. It's extremely popular with our supply chain and operations stakeholders.

The other thing we're doing with the REST API is for the software teams. The software teams still do some things that aren't core technology, transactional, or point-of-sale related: they do some customer follow-ups, they have some of their own reports and metrics. But to do that, they need to hit the production system, they need to hit the iSeries to pull any data, and obviously it's not ideal to put more pressure on the production system for things that aren't point-of-sale or transactional. So this allows us to create REST APIs for the software teams to pull the same data they would normally pull by executing SQL strings against production. Instead of executing these crazy SQL strings in their modules against a production source, they can simply call the REST API, get the JSON payload delivered to them, and then do what they need to do with it.
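As a rough illustration of that consumption pattern, here is a hedged Python sketch of a client pulling a published view as JSON over a REST data service instead of running SQL against the production iSeries. The host, path, format parameter, and credentials are all hypothetical; the exact URL depends on how the view is published in your environment.

```python
import requests
from requests.auth import HTTPBasicAuth

# Hypothetical endpoint for a view published as a REST data service.
ENDPOINT = "https://denodo.example.com:9443/server/reports/views/open_orders"

def fetch_open_orders():
    """Call the published REST service and return the parsed JSON payload,
    the same payload an Excel GET request or a software team's module would pull."""
    resp = requests.get(
        ENDPOINT,
        params={"$format": "json"},                      # hypothetical format parameter
        auth=HTTPBasicAuth("svc_reporting", "secret"),   # hypothetical service account
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(fetch_open_orders())
```

The same kind of endpoint is what an Excel workbook would hit through a web query to refresh its sheet on demand.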
So it takes pressure off the production system and makes life easier for the software teams. The third thing we're doing with the REST APIs is Jira. I don't know if anybody here uses Jira, the project management tool; it's very common in most companies, Agile, all that kind of thing. Jira has an API. We have product owners and product managers who use Jira, pull data from Jira, and create reports from Jira. We're going to start plugging the Jira API into Denodo and creating models of the Jira data that our managers can easily consume. So we're actually going to make management's life easier by pulling Jira's REST API into Denodo and delivering it back out as a REST API that they can consume in Excel: a REST API connection into Denodo on the south side, delivered back out as a REST API to the managers, into their Excel sheets, and they can manage the data like that.

The Data Catalog is pretty straightforward. You control your metadata, you can control all your data sources, your source-to-target mappings; it's an excellent tool, and you can really enrich your data. I feel like this is an area we can improve on. We're using it, but we could be using it much more; I've seen some of the things other companies are doing with their metadata and it's fantastic.

The Scheduler is pretty straightforward too: you schedule your jobs, and this thing is fantastic. You can set up anything you want to run. You want to build your complex views on a 15-minute interval, you want to plug in different APIs and call them on different intervals, it's got all the tools you'd expect: how many times you want to retry, all your error logging, anything you need to schedule any type of job. It's a slick tool with very good monitoring and easy access.

VQL is the SQL dialect, the query language in Denodo, and it's excellent. It has a ton of functions, a ton of things you can do, very similar to other query languages, and the more complex things like window functions you can delegate, push down to the database, and still execute.

GraphQL is our latest find. We didn't even know Denodo 8 had this: Denodo 8 has GraphQL out of the box. We just discovered it a month ago, and we're very excited about it. The software teams have GraphQL in some of their applications, so we're in development on some of the GraphQL work to try to make some of our API calls cleaner. It also allows you to aggregate across data sources through GraphQL and gives a bit more flexibility to the caller. So we're starting to develop out the GraphQL API side, and I think the software teams are really going to lean into GraphQL, and that'll save us some development time on the REST APIs, since they can do a lot of it with GraphQL queries.

So those are basically the five main components that we're weaponizing. And as far as data science goes, I didn't talk a lot about data science, but data science has been extremely fantastic with this tool, because we're able to centralize all the data. What used to take us months to wrangle, say, marketing data, two or three months to wrangle session data and spend data from all the different sources just to get a data frame together before you could even start to look at your spend optimization and your curves, we can do now in seconds.
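As a rough sketch of what that looks like in practice, and it applies just as well to the back-order and turnover example discussed next, assume two small frames have already been pulled out of the virtualization layer (for example via its REST or ODBC interfaces); the column names and values here are purely hypothetical.

```python
import pandas as pd

# Hypothetical frames that, in practice, would come straight out of the
# centralized layer rather than weeks of hand-wrangled API pulls.
backorders = pd.DataFrame({
    "week":              ["2021-01", "2021-02", "2021-03", "2021-04"],
    "backorder_qty":     [120, 95, 180, 60],
    "vessel_delay_days": [4, 2, 9, 1],
})
turnover = pd.DataFrame({
    "week":               ["2021-01", "2021-02", "2021-03", "2021-04"],
    "inventory_turnover": [2.1, 2.4, 1.6, 2.7],
})

# One join plus one call replaces the old wrangling effort.
combined = backorders.merge(turnover, on="week")
print(combined[["backorder_qty", "vessel_delay_days", "inventory_turnover"]].corr())
```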
You can run correlation matrices across different data sources that would have taken us weeks to wrangle together; we can do it in minutes now. Say you have a turnover problem and you want to determine how back orders affect turnover. That means you need supply chain data, which means you need vessel-on-water data, which would have meant API calls to this vendor and that vendor to get some vessel data together, then getting some turnover data together, then wrangling all of it into a data frame just to run a single correlation matrix. We can do this now in minutes with the centralized layer. So this has been a game-changing tool for us. It's been absolutely fantastic, and it's allowed us to grow the team from a single data analyst to what you see here, a team of over 20 people. So that's my story, that's the City Furniture data story, and I will now hand it over to Inessa to get into a little more of the product detail for Denodo.

Thank you, Ryan, that was wonderful. So now what I would like to do is go through some of the features of the Denodo platform for data virtualization. But before we go into the actual platform and start talking about data fabric and all of the wonderful systems in the ecosystem, let's understand the actual business challenges. As you heard, City Furniture had multiple challenges with their architecture, and the first one was trying to implement a data warehouse. So what is the business looking for? The business is looking for a flexible platform, one that enables you to take today's solution and grow it over three to five years as your use cases change. Also, we heard a lot about APIs: an API layer and the whole data services concept is critical, as it enables you to expose information to more consumers. Obviously, all of us want to reduce implementation and maintenance costs: you want a small team to be capable of maintaining your pipelines, your data integration, your data management solutions, not only today but also as you expand into tomorrow. So you've got to think about the future. And as you make data accessible to more users, you need to make sure that security and governance come into the picture.

So with that, let's look at a couple of things, starting with APIs. I would like to take it a little deeper and talk about data services. There are a couple of use cases we see popping up for data services, RESTful APIs, GraphQL. Essentially, first, you get access to the data in real time with no data replication: moving data is expensive, and it also creates silos and duplicate data, so you want to avoid it and make the data accessible through data services. Second, you want a consolidated view of your data: if customer information lives in multiple data sources, you want to be able to combine it and give a unified view to your consumer; after all, depending on the use case, the consumer may need a different flavor of the data. And third, you want a secure, governed platform: if Ryan can see certain information that I'm not allowed to see, I should not have access to that data. So it's all about accessibility and exposing data to more consumers. Let's look at the three specific use cases we see for data services.
First, if you have an existing API gateway, for instance, and you're already exposing microservices, you can easily plug the Denodo platform into your ecosystem. You can use Denodo today to virtualize certain data assets, combine and federate across multiple data sources, and expose that information as APIs to the consuming applications. Very easy to do, very straightforward, and it fits into your existing ecosystem. If tomorrow you need to do something more complex, you can build on top of that. You can say: my existing microservices are accessing data from different data sources, but what if the data moves tomorrow? How can I handle that use case? This is where data virtualization comes in again: instead of your existing microservices accessing data live from your data sources, you virtualize that data and expose it to your existing microservices. So if tomorrow your data moves from Oracle to Snowflake, your microservices don't change. Another use case is similar but different: what if you would like to expose different data from different divisions to your microservices? You don't really need to federate, but you want an abstraction between your data and your microservices. So you can have smaller Denodo instances deployed local to your divisions or to your acquired data sources, and then expose that information to the microservices. Again, it's all about having the flexibility to start your implementation now and still grow it in the future.

But it's not only about data services; it's much more than data services. We've got to look at the whole ecosystem. And since data fabric is one of the common terms from Gartner, Forrester and so on, we have to address it and get a clear understanding of how data virtualization fits into a data fabric. But before we go there, we have to understand how our ecosystem and its different use cases fit into the different products we have: ETL tools, ESBs, many, many other interfaces. So let's look at the logical data warehouse. What questions can we ask? Sometimes we know the question we're asking, like "get me the customer information"; sometimes we don't, and we're trying to generate insights. Sometimes we know the data we're working with; sometimes we don't even know what data we have. So there are different approaches to working with data. First is the data warehouse: it enables you to answer all sorts of questions, but you know the data, you know where it's coming from, you created the data warehouse. Data lakes are different: you're creating them, but some of the data that lands there might be unstructured, hard to work with, or in a form you're not expecting, so data lakes are good for working with both known and unknown data. Data scientists are their own specific breed: they work with unknown questions and unknown data sets, they need to use any data you have in your organization to generate insight. They may know the questions, they may not, and most likely they don't know what data they have access to. Operational intelligence is the easy one: you know the question and you know the data. What we're showing here is that there is no single data management approach that addresses all the use cases, and this is where the logical data warehouse comes in.
The logical data warehouse is all about realizing that the different integration techniques and the different products and solutions you have in your ecosystem complement each other; they can work together. It's not about saying the data lake is going to solve all of my issues, that's not the case. It's not about saying that virtualizing all of my data is going to solve all the issues; most likely that's not the case either. There is still a need for ETL jobs, there is still a need for ESBs in your ecosystem, but you have to ensure that all of those substructures, all of those products, can work together. So you need to go with a product that has the flexibility to solve your issues today and also enable you to grow tomorrow. And when we look at Gartner, this is exactly what they're talking about: the logical data warehouse, which essentially leads us into the data fabric.

Data fabric is all about having a centralized access layer to all of your data, regardless of the format, regardless of how it's being exposed, and about getting the right data to the right consumer. If you look into the data fabric, there is no single solution that implements the whole spectrum, but as we drill down, and I'll show you, Denodo fulfills most of the requirements of the data fabric. Data fabric is about accessibility to the data as well as making data consumable for business users; it's about getting the data to the right consumer in the right shape and form, as well as using AI internally to drive insights, be that for data scientists or for data discovery. The six main pillars of the data fabric start with the data catalog: you need to see what data you have in your ecosystem, and the data has to be accessible. We also talk about enriched semantics. That's not knowledge graphs in the sense of having a graph database; we're talking about a semantic model where you build up your business domain, take the terminology of your data from your Oracle, Snowflake, your Excel documents, your APIs, and build a semantic model on top of it, so that when you expose data to the consumer, you're exposing account, customer, patient, data in a form the consumer can understand. Also, since we are working with metadata, we can use that information to start building insight. For instance, if today I'm accessing my customer information, I might also be interested in customer retention information. Since in a data fabric you have access to all of the data, you can start generating active metadata, which is information about the usage of your data, and use it to provide recommendations. It's like Netflix: if today you're watching a specific movie, tomorrow you might get a recommendation for something related. Same with data: if today I'm accessing customer information, I want to get relevant related information, and since some of the data is unknown to me, instead of me going out and discovering the data, the product should tell me where to look.

So now let's look a little deeper into data virtualization. What does Denodo offer? Denodo is an abstraction layer between consuming applications and data sources. Denodo enables you to connect to pretty much any data: structured, unstructured, data lakes, cloud, SaaS applications, files, pretty much any data. So you can easily join your DB2 data with streaming data coming from some RESTful API from social media.
Since you have that connectivity, you can do a couple of things. First, you provide one of the pillars of the data fabric: centralized connectivity to any data. But you can also start building on top of that. Denodo offers a semantic layer where you start modeling your data, again all in line with the logical data fabric. And then you can start accessing more information. Since we are working with metadata, and Denodo is a metadata-driven product, we are not moving the data; we are accessing data at the source, which is why we access it with a highly optimized query engine. We have multiple optimizations to ensure you get fast access to the data. And since we have information about the data itself, and we also have information about how users consume the data, we can tie that active metadata together and provide recommendations. Deployment-wise, Denodo can be deployed anywhere: on-prem, hybrid, or in whatever multi-cloud environment you need, and that enables you to grow your implementation from small projects today to larger implementations tomorrow.

So data virtualization use cases actually apply across different platforms. For the data consumption layer, it's all about getting the data to the correct user, and actually, let me go back a slide. For the consumers, we also expose multiple ways that consuming applications can access the data. As Ryan mentioned, they're exposing information from Denodo using RESTful APIs and GraphQL, but we can also expose information using a standard JDBC or ODBC bridge, so any existing product, BI tool, or data science tool can easily access the information from Denodo. And we have our own Data Catalog, which enables a business user or a data scientist to go to a centralized web-based portal and start searching for data. So it's not only about decoupling the technical implementation of your data from the consumer; it's also about driving a self-service culture, making data accessible. So again: connecting to any data source, optimizing access with our query optimization engine, and providing the semantic model on top to make data understandable and consumable. That all fulfills the data fabric.
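To make the JDBC/ODBC bridge point concrete, here is a hedged Python sketch of a consumer querying an interface view through an ODBC DSN. The DSN name, credentials, view name, and column names are hypothetical, and the exact driver configuration depends on the installation.

```python
import pyodbc

# Hypothetical DSN configured against the virtualization layer's ODBC bridge.
conn = pyodbc.connect("DSN=denodo_vdp;UID=analyst;PWD=secret")
cursor = conn.cursor()

# The consumer writes plain SQL against the interface/semantic views and
# never needs to know which physical sources sit behind them.
cursor.execute("SELECT customer_id, lifetime_value FROM iv_customer")
for row in cursor.fetchmany(10):
    print(row.customer_id, row.lifetime_value)

conn.close()
```

If the underlying data later moves, say from Oracle to Snowflake, this consumer query stays exactly the same, which is the decoupling point made above.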
Going back to the use cases, it's all about different consumers and different needs. For consumption, for instance, you have your data scientists. A data scientist has to discover data, and if they don't know what to look for or where to look for it in your ecosystem, they need the ability to go to a centralized page, a portal, a data catalog, and search for that information. This is where the Denodo Data Catalog comes in. Another aspect is data governance: if you have multiple tools connecting to the data directly, you have to implement security and governance multiple times, once per tool. And if the data moves tomorrow from, again, Oracle to Snowflake, or you acquire new data and have to enrich your data sources with additional public information, all of your consumers have to change. If you have that abstraction layer between the consumer and the data source, Denodo handles all of those changes, so your consumers are not changing. The security model is implemented once within Denodo; you can control who has access to what data, and you have the full audit trail.

You can see how the data is being used, and you can obviously apply masking rules such as column- and row-based security. We are working with live data, so we have to ensure the data is fully governed. Another key point here, and a use case I see frequently, is the single view of the customer, or student, or whatever entity you're working with. I was working with one of the universities, and one of the use cases was this: if you ask the same question, such as "give me a list of all the registered students," to different departments, you're going to get different responses. The bursar's office might give you a list of all the registered students who paid the registration fee; if you ask the same question of a professor, they'll give you a list of students who are actually attending the class. So you can see that if you don't have a centralized access layer, the data being presented to your consuming applications may differ. They're going to work with different data sets, and if you work with different data sets, you're going to generate different insights; and when you present those different insights to your management, there are going to be many questions. Having the centralized layer gives you a single entry point for all the consumers, and since they're all going through the same semantic model, if you ask for customer, you're getting customer as it's defined in the semantic model.

Again, working on these different data management projects, migration comes to mind. Companies are moving into the cloud, and as you start moving data into the cloud, most likely you'll end up in a situation where some data stays in place on-prem and some data is in the cloud. But you don't want the consumers to know where the data is coming from, so you want that abstraction. And if you need to divide the data, and some of it is in your Azure ecosystem and some is in AWS, you need a solution capable of accessing all of that data, uniting it, federating access, and presenting it to the consumer.

So those are some of the key benefits, and again, I'm tying everything back to the data fabric, because data fabric is something we need to understand and implement. If today you start with a simple use case, tomorrow I guarantee the use case is going to change. The requirements are going to change; the business itself is going to start asking more questions. The more data you give them, the more questions get asked. So Denodo drives that. Denodo enables you to make your company data literate. It enables users to understand the data because it's presented in a unified fashion. They don't want to work with JSON, they don't want to work with structured and unstructured data, they want to see a simple representation. If I want to see customer information, I want to be able to access it with my BI tools like Tableau, with my data science environment, or from my own application, and I want to have access to the data. It also leads us to data sharing: since you no longer need to move data, your data can stay close to the business, because the business owns the data, and you can share only the relevant information, the relevant data views, through the centralized Denodo layer, and the centralized Denodo layer exposes it.
The fact that you're decoupling that southbound IT infrastructure from the northbound business gives you the flexibility for the business to continue using their own tools while IT keeps innovating. IT can keep implementing new data sources; if tomorrow they have a new use case and want to bring in MongoDB or any other new data source, they don't need to go to the business and say, you know what, we've got to make a change and you've got to change 20 of your applications. Now the only changes happening are in Denodo. The business is decoupled: the business can move slowly and continue using their own tools, and IT can move quickly. As a byproduct you get cost reduction and you increase your return on investment. Again, you don't have to go with a big bang project; you don't have to implement a 32-core instance of Denodo and virtualize 20 different data sources on day one. You can start small, show the value, and grow, because Denodo enables you to grow. It goes hand in hand with the data fabric. With that, take us for a test drive: cloud, on-prem, our data, your data. Take a look at the platform.

So, Ryan, I guess there were a couple of questions in the chat earlier. The first one was about structured and unstructured data. Essentially, Denodo can access pretty much any data, as I mentioned before, in any format. We bring it into our semantic modeling tool and present it to the user as a relational structure. So it might be Excel or JSON coming in, but when we start uniting the data, it's relational. From the consumption perspective, you essentially get SQL on top of anything, which gives you lots of flexibility.

Exactly. With Mongo, it's exactly what you said, Inessa.

Yeah. And another question, sorry, Ryan, a closely related follow-up, is performance. Yes, we are accessing data in real time. So when a query comes into Denodo, we do a couple of things. First, we analyze the query and the data sources where the query is going to execute. We rewrite the query; for instance, we can reorder join operations. Our goal is to push query execution down to the sources themselves and bring only the relevant data back into Denodo for the final match-and-merge. So we do query push-down, we integrate with MPP engines depending on the data sets, we have caching, smart query acceleration, summaries, and many, many things happening under the covers to make sure you get data access quickly. Ryan, anything from your experience on performance?

I mean, that's probably one of the biggest things you're going to face in the engineering effort: basically, the real-time data. The first thing I'd say we do is determine whether we actually need the data in real time. No matter what you do, you're going to create some type of performance hit somewhere; it really affects everything. So we look at the data source and ask, do we actually need to replicate this in real time? If we don't, we can build views, schedule them, and build models at off-peak times; if we do need it in real time, then we need to engineer it in a way that minimizes the performance hit. That exact balance, between the real-time need and the performance hit you can afford on production, is something we debate on a weekly basis. It's just an engineering problem.
When you're talking about real-time data, it's going to be an aggressive approach and it needs to be thought out, because it will have some impact on your production system if it's not done right. But because Denodo is a virtualized layer, it gives us the ability to schedule or not schedule things, so we don't need everything in real time. Of our data sources, I'd say roughly 60% we're replicating in real time and 40% is built on a weekly basis.

And I think this is where it's critical: depending on the use cases, what you do today may differ from what you're going to do tomorrow, and you need the flexibility and the different options.

There's no easy answer for that one; it's specific to the company, the enterprise, your team, and what you're trying to do with it.

Yeah, and I think that's the point: there is no golden product that solves all the issues in the data fabric; it all depends on the use case. There was another question about semantic modeling. We actually have a drag-and-drop tool where you can simply drag and drop data sources, draw lines to connect them, and build on top of that. Also, if you do have existing models implemented already, for example, we can actually import those using our modeling bridge. You will see those models in Denodo as interface views, and all you need to do then is source those existing models, those existing interface views, from any of the virtualized data sources that you have. So it's bi-directional: you can go bottom-up, connect to data and start modeling, or if you have existing models, you can import them and go top-down, starting from the pre-existing model. Ryan, I think you have a question there. It says, do you really need data in real time?

Right, that's the question, and that's exactly the point: a lot of times you don't need it in real time. You don't want to do real-time data just for the sake of real-time data; it's not about bragging, it's about actually delivering a lift for the company. And there are cases where real-time data delivers a lift. In our case, we have sales KPIs: we have 1,000 sales associates, and they all have their real-time sales metrics in their hands, available through their iPads, with goals set per day, per key metric, which they look at all day long. They're making sales, they're looking at their numbers, and it's clearly driving sales. In other words, it's almost like a manager for each associate. So that's a case where real-time data has provided us a lift: it's improved associate performance by giving them that intraday goal against real-time data. But there are a lot of cases where you don't need it. If you're talking about bonuses, you don't need to generate real-time bonus data; it's based off of historical week-to-date, year-to-date, whatever. So in many cases you don't need real-time data, but there are cases where you do, and the key is knowing which is which. Because it's an engineering lift any time you're doing real-time data, and there's the maintenance of it and the expectation of it: once you provide real-time access to their metrics, they're going to expect it up to the millisecond, every day, all day, 24/7, and you need to maintain that.
So that's a question you need to answer: is this going to provide me a lift or not? Do I need to do this in real time? It's an engineering problem that you need to solve against a business question.

That's actually a great point. All right, well, thank you, Ryan. I think that's our time for this session.

Thank you, Ryan, and thank you, Inessa, for this presentation, and thanks to our attendees for tuning in. Please complete the EDW Conference Session Survey located at the bottom of this page. The next session will start in a few minutes. Thank you, Ryan. Thank you, Inessa.

Thank you. Thank you all. Thank you, thank you.