Hello, and welcome. My name is Shannon Kemp, and I'm the Chief Digital Officer for DATAVERSITY. We want to thank you for joining the latest in the monthly webinar series, Data Architecture Strategies with Donna Burbank. Today, Donna will discuss Business Intelligence and Data Analytics: An Architected Approach, sponsored today by Katana Graph. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we'll be collecting them by the Q&A panel, or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DAStrategies. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just to note, the chat defaults to send to just the panelists, but you may absolutely change that to network with everyone. To open the chat or the Q&A panels, you'll find those icons at the bottom of your screen. And as always, we will send a follow-up email within two business days, containing links to the slides and the recording of the session, and additional information requested throughout the webinar. Now let me turn it over to Abhi for a brief word from our sponsor, Katana Graph. Abhi, hello and welcome.
Oh, thank you very much, Shannon. I hope you are able to see my screen.
Looks fine.
Yeah. All right, wonderful. Hello, everybody. Welcome to the webinar today. I'm Abhi, Abhishek Mehta, from Katana Graph. I head the field engineering team. I'll take a few minutes of yours to give you an introduction to Katana Graph before I pass it on to Donna to talk about the purpose for which we are here today. About Katana Graph: we are a graph intelligence platform. We are a technology built for high-performance, scale-out graph processing and analytics. Our company is a COVID child. We were set up as a business unit in March 2020.
That may give you the impression that it's a very new company, but in reality, our co-founders Keshav Pingali and Chris Rossbach have been professors at UT Austin, where our engine was developed over a decade of research across various DARPA-funded projects. So our technology is very well proven out. And as soon as this business entity was formed, top-notch investors from around the world, as you can see on your screen, came in and gave us a pretty big Series A round. That has led to many commercial engagements, where we are dealing with many Fortune 100 companies on the financial services and healthcare side. If you want to know more about us, please visit our website. That's KatanaGraph.com. So here is a glimpse into our leadership team. It's a very well combined, very balanced mix of technology, PhDs, and business acumen. And the industries we serve are a pretty wide variety. We are seeing a lot of traction all across healthcare, life sciences, and financial services, from very generic to very specific use cases: fraud detection, anti-money laundering, from capital markets to wealth management. So in all the various spaces where you're dealing with big data, interconnected data, Katana Graph technology comes into play, because it's a distributed graph processing platform. And why people are trusting Katana a lot these days is because our technology has been proven out on billions or trillions of records and terabytes of data. We have unmatched performance, 10 to 100 times faster than some proven technologies, and it's cloud agnostic: you can run it on any cloud vendor, and whatever your poison is, we kind of serve it. And it has scaled up to 256 machines.
So, all in all, I will take one more slide for you guys before I pass it back to Donna. What we do in the graph compute domain is provide you all the basic capabilities, backed by openCypher, the programming language of graph databases, for querying capability: link depth analysis, multi-hop, shortest distance, shortest path. You know, giving you a huge advantage over RDBMS technology in terms of ad hoc analytics. And of course our biggest value comes in the graph analytics and mining space, where graph algorithms, backed by our on-demand partitioning processes and heterogeneous scalability, can run many complicated graph processing algorithms in parallel. Along with that, we are also seeing a lot of traction with the latest technology trends out there around graph neural networks. So if you have any such interest, feel free to reach out to us, and I will pass it on to Donna to discuss more about business intelligence and data analytics and see what kind of role graph can play and fit in there. Thank you very much.
Thank you so much for kicking us off, and thank you to Katana Graph for sponsoring and helping to make these webinars happen. And if you have questions for Abhi or about Katana Graph, you may submit your questions in the Q&A panel, as he'll be joining us in the Q&A portion of the webinar at the end. Let me introduce the speaker of the monthly series, Donna Burbank. Donna is a recognized industry expert in information management with over 20 years of experience helping organizations enrich their business opportunities through data and information. She currently is the managing director of Global Data Strategy, Ltd., where she assists organizations around the globe in driving value from their data. And with that, let me give the floor to Donna to get her presentation started. Hello and welcome.
Hello, and thanks, Shannon. Always a pleasure to do these each month, and always nice to see some familiar names in the chat.
We have a very loyal following at DATAVERSITY. So, awesome group of people. So, yeah, without further ado, as I just sort of alluded to, this is a monthly series, so it's awesome to see new names on the list, which is great. If this is your first time joining either this series or DATAVERSITY, all of the previous recordings are available, and that's always the first question people ask: will this be recorded, will we get the slides? Yes, yes. And I think, and correct me if I'm wrong, Shannon, all of the materials from DATAVERSITY are kept in perpetuity, from not only this year or this series but other series in the past. Please take advantage of those, and then we'd also love you to join us for some of the upcoming ones. Next month is a topic near and dear to my heart, metadata management. And you'll see the list of other topics for the rest of the year. But the reason we are talking today is BI and analytics architecture, with a little bit of intro there from the sponsor, which is great. Because more and more organizations really are, I know it's trite and we keep hearing that buzzword, becoming data driven, and a big part of being data driven is BI and analytics, you know, some of the graph use cases that were mentioned in the intro. That's great. But where my soapbox comes in is that it only works well when you have a strong architecture behind it. So analytics are great. Visualization tools, and some of the great new visualizations that people can do, are wonderful, but where I sort of nerd out in my realm is that it doesn't work well, and the numbers won't be right or perform, unless you have that strong architecture behind it. So we'll kind of go into that in this presentation. Because we're data driven, we always like to start with stats and data. So, each year,
Global Data Strategy and DATAVERSITY work together on a trends paper, Trends in Data Management, which I find super interesting because some themes stay consistent, some change year over year, and some grow over time. The one that tends to stay consistent over time is that reporting and analytics are really key drivers of a lot of what we're doing in data management, so you can see that one in the middle: reporting and analytics is a key driver. The fact that data is seen as a strategic asset is one of those that we see growing year over year. Probably no one on this call is surprised at that; I think most of us on this call probably agree with it. But I think the rest of the industry, "the business," as we tend to say, is open and really wants to do more with data, which then drives the need for analytics. What I was heartened to see, and I hope this one grows over time, because even though it's bigger than 50% it wasn't well over 50% or I would have said that, is that people specifically said they see improved collaboration through a defined data architecture. And really, when we want to get to reporting and analytics and being data driven, that's all about collaboration and getting the numbers right and all of that. So I'm glad that folks called that out specifically, but would love to see that grow over time. Also from the survey, which hopefully folks will find interesting: this is when we asked, what are the main goals and drivers? This was for data management more broadly, but just think data: what are you trying to do with data?
And really, again, reporting and analytics is at the top, continues at the top, and probably will keep growing. That said, there are a lot of other use cases for data, operational data, et cetera, but the topic of this presentation, and top of mind for many, many folks across the industry, is and continues to be reporting and analytics. I see those as separate, which could be a whole other discussion, but they are kind of distinct things. If anyone joined our webinar last month, we used this example. Last month was on data literacy, and the need for data literacy when you read a dashboard. A big driver of data literacy is, well, firstly, are you even using a dashboard? Are we a data-driven culture that doesn't go on gut feel? You know, on the sales side, I really want to understand my customer and how trends are going over time, not the answer of "I know my customers, I've been doing this for 20 years," which may be true, but we can also learn from the data. So that's a big part of it: are we even using dashboards? And the next question is, are these numbers right, do we have the right data quality behind them, and do we have the right governance support? So, a little mnemonic we had in the last session was this, kind of asking, you know, what about the data? The dashboard's pretty, the viz... actually this one isn't pretty. But the visualization might be nice. But what about the data? So we want to continue that theme. Last month we sort of hinted at this session, which will delve into that last bullet at the bottom, which is the architecture. And anything in data management has crossover.
Often it's hard to get the quality right or the governance right if you don't have the right, correct architecture, but there are also just things like performance, or are we using the right tool for the right job, and are we thinking about the architecture? I do this for a living at a company called Global Data Strategy, and a lot of the anecdotes in these presentations are going to come from real life. And a good trend that I am seeing is that more and more companies are becoming data driven. You know, I've been doing this for longer than I want to admit, and in the old days it was the same folks who did data management. Many of us have worked in government or financial services, you know, some of the folks that have been doing data management the longest. What I find fun about my job is that I've worked with small museums and nonprofits and companies that typically wouldn't have been as data driven in the past, but now everyone is. So that's positive. I do see, though, that a lot of folks can sort of go in, and it's fun to build the visualizations and the dashboards, and the tools have come a long way. It's a little harder to understand the architecture behind it and what the best solution out there is. And it does not help that there are a lot of solutions out there, and, to tease the vendors a little bit, a lot of folks have a vested interest in saying, well, our solution is the best. And the speaker did not say that today. But, you know, I won't say it goes as far as misinformation, but there's a lot of vested interest in "our solution is best," right? So it is an exciting time to be in data management, because there are a lot of options and a lot of choices, but with choices also comes a little bit of stress and risk, right? There's a cacophony of options out there, I would say, in terms of what you can choose.
You know, is it the good old-fashioned data warehouse? And I will give the punchline already: yes, that still exists and is still very valid. It is not old-fashioned. It's just tried and true and tested, and there's a difference, right? Is it the only tool in the toolkit? Absolutely not. What about the data lake? I'm old enough that that was a hot trend, and then it was a fad, and now, you know, it's an old-school thing, that's a data swamp, right? Or now do we have a data lakehouse, or a data hub? Maybe data warehouse and data lake and data lakehouse... no, we'll just do a data hub. What's a data hub? Is that the same thing as, you know, oops, sorry, an MDM hub? A data mesh, that's the new thing, we'll do data mesh. No wait, data fabric, data clothing... is that the same as data virtualization? I think, no, we just need a data catalog to put all our data in one place. You know, we don't have a data catalog. I think you mean metadata catalog. Well, no, maybe it's a data marketplace, or let's put it all in a knowledge graph, that sounds great. Or is it relational, non-relational, star schema, NoSQL? You know, that might be how your brain is feeling trying to keep up with things. And I do this for a living; even though I've been doing this for years and years and years, I've been confused and thought I'd just missed something. You know, I kind of look around the house and say, is it a data kitchen? Wait, no, someone's already come up with that. Is it a data table? I swear, yep, that one's been used. A data lamp, you know, just look around the room. And I always say with that, and many of us... I see a note that I might be fading and coming back. Is the sound okay?
It sounds like you're walking away from the mic a little bit.
But I am not.
Yeah. Okay, well, keep going.
Um, so I always go back to first principles, and I think, you know, for universities and schools that are teaching this, that's the best way to think about it: what is the right tool for the right job?
So, this isn't the only way to look at things, but it is a helpful point. (Data lamp, is that a light bulb moment? Love it.) What are people using in the real world? That doesn't mean that people are always right, but it does give a good indicator of what, amongst all the things out there, has been tried and true. And I think we've been doing this survey for, gosh, I don't even remember, it's probably 10 years now. Data warehousing and business intelligence generally are, and continue to be, at the top of the things that people are integrating. And this isn't necessarily all around reporting; there are other areas that are all overlapping, and that could be a whole conversation about the difference between governance and architecture and modeling, et cetera. There are other things to explore, but nothing is wrong with building a data warehouse and BI. A lot of the organizations that are doing some of the more modern or cutting-edge things, you know, the data science, AI, and graph patterns that were mentioned, also do that in conjunction with a data warehouse. It doesn't mean one is better than the other, and you'll hear me continue to rant about that. They're just different use cases and different tools in the toolbox. I found this sort of interesting as part of that survey: are you using a data lake? Not as many as you would think, but many who are use it in conjunction with the data warehouse, which makes sense, right? If I'm trying to understand total revenue by region by sales rep, that's a great use case for a data warehouse, and I probably want those numbers to be right. If I'm trying to do some exploratory use cases and do some social media analytics or real-time streaming of product sensor data, that's a great use for a data lake. Could those be combined for some really great insights? Of course. So, a great way to think of that.
When one is building an architecture, and I highly recommend you do, write it down, even if it's just on a whiteboard, showing how these things fit together. And they do often fit together with a nice kind of zoned approach. This is a bit fictional but based on several real-world patterns. There's what is broadly called the enterprise systems of record, right? Things like master and reference data, your good old-fashioned data warehouse, and a data mart (what's the difference between those could be a whole other session, right?). But the fact that you're doing data science and advanced analytics, or maybe some data discovery off the lightly modeled data or some exploratory data, is only augmented by things like master and reference data, right? I would think, if I were a data scientist, it would be better to have clean data to be working with. There are plenty of surveys out there showing that, unfortunately, a large part of the data scientist's day is spent cleaning data rather than actually doing the analytics on it. So again, that box in green only helps everyone, but that's a whole different way of modeling and managing and governing the data than maybe some more exploratory analytics. So at the top of that, whether it's standard BI reports or visualizations or self-service BI, there can be several patterns behind it, one of which may be a warehouse. Governance should go across all of those layers. Maybe it's more lightly governed on the left. Maybe not. Maybe you have some really sensitive health data coming from some of the stuff in the lake. Just because it's in the lake doesn't mean it's open season. So security and your privacy and your PII are still very important.
One of the stories I like to tell, and the person and company are nameless: we were building a model like this and we were talking about the PII aspect, and, you know, there's a lot of valuable information that should obviously be secured. In this case it was health information as well as credit card information. We're going through this model, and one of the younger gentlemen on the team raises his hand and says, so I shouldn't be putting the credit card data out in the, you know, sandbox in the cloud for the exploration? The boss basically said, [name], we're going to... oh, I guess I wasn't really nameless today. Person X, we'll talk after work. Yeah, you can't just take customer data and dump it out somewhere to do some playing around with. So security and privacy governance have to go across all the layers, regardless of the technology. If you've been on my sessions, you've seen this slide before. This is the framework that we use at Global Data Strategy, but it does speak to the fact that BI and analytics, you'll see, are right there in the center. But there are a lot of things around that that make that work. So whether it's governance, or even having your overall strategy of where analytics fits in (is it self-service, is it all enterprise, et cetera), what's the quality of that, and of course the architecture behind that is a big piece of it. So I did want to get into the design aspect of data architecture. Some of the things we talked about before are maybe platform or hardware or styles. I think some of it is the business layer of your BI and your analytics, and that's often what's overlooked, or I won't say forgotten, but maybe not focused on as much.
Again, if you've joined these, full disclosure, I'm a big fan of data modeling at all levels. I think it really is a proactive way to understand things. And yes, I have kind of an inventory of data modeling cartoons out there. Here's one, and maybe it's not funny if you haven't done this, and maybe it's just not funny, but we've all probably been here: "Hey, we've built the applications, we're done with testing, we've got this great new marketing application. Just wondering, what do we mean by a customer?" And maybe that's not funny until you've gone through it, but there are a lot of different flavors of customer. What even is a customer? I have worked with, or been a member of, large corporations that have made very embarrassing mistakes. We worked with one where they sent out renewal notices to people who were prospects. You know, when a salesperson says, "I'm going to go talk to a customer," often they mean a non-customer, someone they're trying to sell to, but colloquially we use the word customer. In this case they literally went to the pre-sales database and used that, because the database literally was called customer. Not a great way to do your data governance. There are a lot of different flavors of mistakes around that. Is it a lapsed customer? Is that person on maintenance today? Is it a customer of different account types? Is it a premium customer? Et cetera. Anyone in data governance or data management or data architecture knows that should be a big part of it. I see less of this now, but maybe again we have a filtered audience because we work with the data modelers, but there has been in the past an attitude of, let's just not do the model, that's so unwieldy and it's going to take so long, it's just a lot faster to start building, and we'll do the modeling later.
But, I guess, my mom used to always say, if you don't have time to do it right, you have time to do it again. You will answer that question of what is a customer, or what is a patient, or what is a student, or what is a citizen. It's just, do you want to do it when you're fixing it, or do you want to do it up front? And, you know, one of the comments from Gail: every CRM system has that idea of a customer, a contact, a prospect. I ranted on Twitter the other day; I was reviewing a model and it literally had a party related to a thing. And I said, yeah, that's very reusable, but that tells me nothing about this company. One of the things I love about data modeling is you can tell almost everything about a company from a one-page data model. And yes, maybe behind an ERP system or CRM system there is a party table or an account table, but how that's filtered, what flags you use, what account types really make a customer, that is what matters. I was literally just on a call right before this going through that: what account types exist in one of these big CRMs, how do people use it, and how is it entered? That's a big part of it. So to me this is the business-level modeling that absolutely cannot be skipped. But there are levels of data models. There's that business level, whether it's up at the enterprise subject areas or down at the business level with the conceptual model; both of those are where you ask, what is a customer? What is a product? I'm working with a company right now that says, with full seriousness and no irony, "We don't know what a product is in our company. We're successful, we're a multi-billion-dollar company, we're making a lot of money. So if my management asks me for revenue by product, we can't do that. We don't know what our products are."
And again, that's the core of your business. That modeling reveals a lot of interesting things across the data and across the business processes as well. Once you get down to the logical, that's still a business-centric view, and that could be a whole debate and webinar: how physical is a logical model? Does it need a relational database behind it, or is it business-rule centric? Again, that could be a full webinar, but it really should be at the business level, using business terms, using business logic, and getting those business rules around the data. Both of those are really related to understanding the business, so that is a definite need when you're trying to build those reports. Why does this matter? We'll talk about this more in the presentation. "Please give me a report showing products by customer by region." What's a product? What's a customer? And then I will want to say, what are total sales? How do we calculate total sales? All of that really needs to be defined. And then of course there's the physical model, and I will get into this, because I've talked a lot at the business level. But again, after years and years of this: the physical stuff is hard, and tech is super fun, but most of the problems we run into that we need to fix are at that business level. It's a business rule that wasn't described. Yes, there are performance issues and things like that from having a badly designed database, but generally it's the business rules that are going to cause your big problems. So getting that right up front is super, super important. "Beginning with the end in mind" was one of the comments there. But yeah, the physical model: again, not everything needs to be in a relational database, right? We have a lot more options. A graph model, right, that was talked about in the beginning.
There are key-value pairs, et cetera. But again, and I pick on the vendors, I pick on people, it's human nature, right, there's always that "well, we have NoSQL now, we don't need SQL." That's just silly. You know, we need both. They both have their place. Don't fall into that trap, but do think very carefully about it. There isn't only one option for your physical data. So the customer... it's been a long day already, I should say another customer. They had customer defined, they were very far along, but then the question was, what are the different physical instantiations of customer in those different use cases? Yes, they had graph, and yes, they had relational third normal form, and they also had star schema, and they also had operational data, and they were also starting to build MDM, and all of them were correct. How do you get that architecture together? I have a really strong feeling that they will be successful, because there are some hard questions in there, like where's the overlap, but they're asking them, and at the get-go, before they get too much further. So it's good to do that. As I mentioned, there are different physical models for different use cases and a lot of tools in the toolkit. Don't just use the hammer; not everything is a nail. So there's another data modeling cartoon; you thought you were done. And maybe that's not funny either, but it's getting into third normal form. Okay, so if anyone doesn't know what that is, that's a great way to increase your data quality and ensure consistency. Or think of ACID transactions. These are great when you have an operational database. I want to make sure that my records match up and I'm using referential integrity. That's a great use for your relational database, and that's one of the reasons they are still valuable and still well used.
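To make the referential integrity point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customer, sales_order) are hypothetical and were not shown in the webinar; the sketch only illustrates how a relational database can refuse a record that does not match up with its parent.

```python
import sqlite3


def build_operational_db() -> sqlite3.Connection:
    """Create a tiny in-memory operational schema (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE sales_order (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            amount      REAL NOT NULL
        );
        """
    )
    return conn


def insert_order(conn: sqlite3.Connection, customer_id: int, amount: float) -> bool:
    """Try to insert an order; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO sales_order (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
    except sqlite3.IntegrityError:
        # The order points at a customer that does not exist.
        return False
    return True


conn = build_operational_db()
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Example Co')")
accepted = insert_order(conn, 1, 250.0)    # customer 1 exists
rejected = insert_order(conn, 999, 50.0)   # no such customer
```

With the key defined, the orphan order is refused by the database itself rather than discovered later in a report.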
I would say, when you use them, use them correctly. I mean, the number of databases I look at when we come in that don't have keys or referential integrity defined... and sometimes that's a design choice, but often I think folks just don't have those fundamentals of what a relational database was meant for and how to use it wisely. It's like, and I'm guilty of this, using an Excel spreadsheet for just a list of my friends and their phone numbers. That's not really what it's for; it's a financial modeling tool, and I'm using it for something else. And then I complain that the formatting doesn't work well in Excel and I can't indent things, and it's because you're using a tool for what it's not meant to be. But we all do it. Star schema: is that old-fashioned? We had a webinar earlier this year asking, is the star schema dead? No. Part of the reason one does the star schema is for performance. Are the systems built nowadays, and the hardware, a lot faster, so that you don't necessarily need to put it in a star to get performance? Yes. But is it just a human-friendly, logical way of organizing data, where you have a fact like total sales by region by product by campaign? Yes. And so that ability to slice and dice and have a nice semantic layer is still in use and still super, super helpful. There are other options: NoSQL. You know, if I'm building a website and I want high data volumes, and I do want to have a more flexible pattern for change, great, great choice for that. And that's just the beginning. There's hierarchies, there's XML, there's graph, there's the good old-fashioned COBOL copybooks that we still run into, and we can laugh at that, but they work. We think, why are we still using the mainframe? You can't use the mainframe, that's old-fashioned.
I'm not proposing necessarily to start with a new mainframe for your new application, with all the options out now. But when people sort of roll their eyes and say, this thing was built in the 60s or whatever... well, it's still running and still working. So we can't necessarily pooh-pooh those designs. But again, there are a lot more options. S3 buckets: would I say do that for your enterprise data warehouse? No. But is it a great way to store information at low cost? Yes. Data vault, you know, is a great way to store data as a kind of initial storage, et cetera, et cetera. Lots of different options. And my rant actually is not over. I lied; it said "rant over" there in the lower right. I will continue ranting that no modeling technique is inherently better than another. It's sort of like saying a screwdriver is better than a hammer. Both are good; they just have different use cases. It's only a problem in that old-fashioned saying, when you have a hammer everything looks like a nail. Don't do that. There you go, there's the webinar: don't do that. So, the question I just asked: is the poor old, good old-fashioned star schema dead? I already gave the answer: it's not dead. Not dead, was that the Monty Python skit? Not dead yet. Anyway, maybe I've had too much coffee today. It still is a very user-friendly and performant way to do that classic slicing and dicing. If folks aren't familiar with the star schema, it can theoretically look like it's shaped like a star. That's the name. At the center are your facts or your measures. I always like to think of it as, what are you reporting on? You know, I'm reporting on sales transactions or patient visits, things you're basically counting. So that's what you're reporting on. And then the things you're reporting by are generally your dimensions. You know, I want to report by customer, by product, by... you know, I've said that enough times in this webinar.
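As an illustration of the "reporting on" versus "reporting by" distinction, here is a minimal star schema sketch using Python's built-in sqlite3 module. The tables, columns, and sample rows (fact_sales, dim_region, dim_product) are hypothetical and chosen only for this example; a real warehouse would have more dimensions and surrogate keys.

```python
import sqlite3


def build_star() -> sqlite3.Connection:
    """Build a tiny star: one fact table surrounded by two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region_name  TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);

        -- The fact table holds what we report ON (the measure) plus
        -- keys to the dimensions we report BY.
        CREATE TABLE fact_sales (
            region_id    INTEGER REFERENCES dim_region(region_id),
            product_id   INTEGER REFERENCES dim_product(product_id),
            sales_amount REAL
        );

        INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
        INSERT INTO dim_product VALUES (10, 'Widget'), (20, 'Gadget');
        INSERT INTO fact_sales  VALUES
            (1, 10, 100.0), (1, 10, 50.0), (1, 20, 30.0), (2, 10, 70.0);
        """
    )
    return conn


# Whitelist of dimension names to the columns they map to.
DIMENSION_COLUMNS = {"region": "r.region_name", "product": "p.product_name"}


def total_sales_by(conn: sqlite3.Connection, *dimensions: str) -> list:
    """Slice total sales by one or more of the known dimensions."""
    if not dimensions:
        raise ValueError("choose at least one dimension to report by")
    cols = ", ".join(DIMENSION_COLUMNS[d] for d in dimensions)
    sql = f"""
        SELECT {cols}, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_region  r ON r.region_id  = f.region_id
        JOIN dim_product p ON p.product_id = f.product_id
        GROUP BY {cols}
        ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()


conn = build_star()
by_region = total_sales_by(conn, "region")
by_region_product = total_sales_by(conn, "region", "product")
```

The same fact table answers both questions; only the list of dimensions changes, which is the slicing and dicing described above.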
Often, there's this term "conformed dimension," where that's often your master data or your reference data. And surprise, some of these we talked about already: that single view of customer, that single view of product, are your master data efforts, and so the more you clean that up, the better. The whole topic of this webinar is an architected approach to reporting and analytics. Find me after the webinar if there's any report writer, data scientist, or BI analytics person who doesn't want good, nice, clean master data that they can work with. Do they want to build it? Maybe not; that's a whole discipline. But if someone said, do you want a clean, de-duplicated list of all your customers, augmented with industry data? Yeah, send it along. And so this idea is certainly not old-fashioned; it's even more relevant with these new technologies. I want to build a graph database and see all my connections between customers. If I don't know the right customer, it's not going to work so well. We actually had a really advanced financial services customer who was doing graph patterns and a lot of advanced analytics on their customers, and had a great team doing that. The problem was, when they tried to integrate that with their customer list, they didn't have good master data, so they didn't know if Joe Smith was a multi-billion-dollar, high-net-worth customer or if Joe Smith was the Joe Smith who was in bankruptcy. So people don't always wait for it, and you don't always have to wait for it. I mean, there is a case for, you know, let's do just enough for now, let's do some graph patterns on the data to get some exploration as we build master data. You can't realistically stop the whole company. Try selling that to management: we need to stop operations for the next six months while we get our master data right. You're literally changing the wings on a moving plane, which makes master data sometimes hard.
But yeah, it does have to be done. The more you can do that upfront, the less cleanup you have to do. So yes, it is still a good tool to use; it's just not the only tool in the toolbox. Another good old-fashioned, well-used tool that business users love (full disclosure, I sort of stopped using this a bit in my own practice and then brought it back) is the bus matrix, which is a nice business-centric tool as you're defining your BI and your analytics. I walked into a meeting. I get a lot of characters among my clients, and one of them was a senior financial analyst. We walk in the room and he goes, "Data people! Bus matrix, bus matrix!" He was being annoying in sort of a funny way. So we started using the bus matrix, and now we add it in almost every project because it is just so helpful. Again, what are we reporting on, what are we reporting by? And there are different flavors of bus matrices. One of them covers what we're reporting on and how we define it, and this could go into your glossary, or your data catalog or your metadata catalog or your data lamp. We're reporting on total sales and wholesale revenue, and what's the difference, et cetera. So that's what we're reporting on, and then what you're reporting by: I want to know revenue by location, by sales, by product, and maybe another report is just location by product for the wholesale stuff. So it's a great way to map things out, nice and easy to use. Generally we have a little prettier one than this, but it's a nice intuitive way for business people to understand, and a great way to build a roadmap. This very simple grid (I just picked on spreadsheets, but this is fine as a spreadsheet; there are probably other tools out there).
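As a rough illustration of the grid described above, here is a small Python sketch of a bus matrix: measures down the side, dimensions across the top, and an X where a report needs that combination. The measure names echo the examples in the talk; reading "by sales" as a salesperson dimension is an assumption made for this sketch, and the layout is not the format shown on the slide.

```python
# Hypothetical bus matrix: each measure (what we report ON) maps to the
# dimensions (what we report BY) that the business has asked for.
BUS_MATRIX = {
    "total sales": {"location", "salesperson", "product"},
    "wholesale revenue": {"location", "product"},
}

def render(matrix):
    """Return the matrix as a plain-text grid, one row per measure."""
    dims = sorted(set().union(*matrix.values()))
    width = max(len(measure) for measure in matrix)
    rows = [" " * width + " | " + " | ".join(dims)]
    for measure, used in matrix.items():
        cells = [("X" if d in used else "").center(len(d)) for d in dims]
        rows.append(measure.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(rows)

def shared_dimensions(matrix):
    """Dimensions needed by more than one measure: candidates for conformed dimensions."""
    dims = set().union(*matrix.values())
    return sorted(d for d in dims if sum(d in used for used in matrix.values()) > 1)

print(render(BUS_MATRIX))
print(shared_dimensions(BUS_MATRIX))  # ['location', 'product']
```

Dimensions that show up in several rows are the ones worth mastering first, which is one way the grid doubles as a roadmap.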
You know, it's a glossary exercise: what do we mean by total sales revenue? It's a governance exercise, it's a master and reference data exercise, it's a data warehouse exercise. This one simple graphic can really be a roadmap for moving forward, for how you build things out. So, big fan of the good old bus matrix; bring it back. But there are many design patterns behind the scenes. I consider something like a bus matrix a requirements doc, which sort of lends itself to the star schema, but you might have some very different things. If we want website clicks by minute across the website, we probably wouldn't use a star schema for that. So if the stuff across the left is website clicks, and across the right is "by second," maybe this doesn't lend itself to a star schema warehouse. Maybe. But there are other tools out there. Again, if you haven't heard me say it 16 times this webinar: there are a lot of options, and no one size fits all. Even within the tried-and-true warehouse, there are still arguments: is it Inmon, is it Kimball, is it relational, is it star schema? There's a nice case for both. A nice use case for the relational model is getting the quality and the single view of things, like single view of customer, when I then want to report on it in either an enterprise warehouse or a mart. The star schema is great for slicing and dicing. So both can go together, right? A lot of tools. Data vault is growing in popularity, especially in Europe. It is a flexible way to store the data as you're discovering it. Is it as easy for reporting? Maybe not.
So again, there are good tools for each. Columnar data, flipping the columns and the rows for performance, is another great tool in the toolbox that might be a great answer to the website traffic you're looking at. Or just flatten everything out. Why do we need everything in third normal form? As a data modeler, my answer is: you don't. Or maybe you do, and again, these can all work together. Maybe you do have it in third normal form to rationalize it and get your master data, and then flatten it for the data science team, because I would love a nice clean list of all my customer data flattened out so I can do analytics on it. A graph is a great way to discover those connections and patterns, fraud detection patterns between customers. And these can also be iterative. We have one customer whose main goal is customer master data, and it will take them years; it's a massive company. So they're doing the graph first, in the sense that graph isn't necessarily the tool that's going to cleanse and master the data, but they're discovering patterns that can then be fed into their rules engine, which is their true master data management. So it was a good combination; both are tools in the toolkit. They will continue to use graph as they build out. Back to that question of "can we wait and get all the data perfect before we do anything?": unfortunately, you can't. They're doing both in parallel. They're smart enough that they can make some educated guesses on single view of customer in the graph for some analytics for marketing, but they're not yet pushing that back to the source system. I don't want my bill or my invoice being wrong as a customer because you made an educated guess; I want you to know it's me. Think of healthcare, even more so. So, some interesting use cases. Use them, or evaluate them all.
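The "keep it in third normal form to rationalize it, then flatten it for the data science team" idea mentioned above can be sketched as follows, again with Python's built-in sqlite3 module. All table names, columns, and rows are hypothetical; the two customers who share a name are kept apart by their key, which is the kind of distinction good master data provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (third normal form) source tables, hypothetical data.
    CREATE TABLE region   (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT,
                           region_id INTEGER REFERENCES region(region_id));
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer(customer_id),
                           amount REAL);
    INSERT INTO region   VALUES (1, 'East'), (2, 'West');
    INSERT INTO customer VALUES (1, 'Joe Smith', 1), (2, 'Joe Smith', 2), (3, 'Ann Lee', 1);
    INSERT INTO orders   VALUES (100, 1, 20.0), (101, 1, 30.0), (102, 3, 5.0);
""")

def flatten(conn):
    """Return one wide row per customer, suitable for handing to an analytics team."""
    cur = conn.execute("""
        SELECT c.customer_id AS customer_id,
               c.name        AS name,
               r.region_name AS region_name,
               COUNT(o.order_id)            AS order_count,
               COALESCE(SUM(o.amount), 0.0) AS total_spend
        FROM customer c
        JOIN region r      ON r.region_id = c.region_id
        LEFT JOIN orders o ON o.customer_id = c.customer_id  -- keep customers with no orders
        GROUP BY c.customer_id, c.name, r.region_name
        ORDER BY c.customer_id
    """)
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]

for row in flatten(conn):
    print(row)
```

The normalized tables stay the place where the data is governed; the flat list is a derived convenience that can be regenerated whenever the source changes.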
Right, and we all get stuck in our ruts. I would say, if any of these are new to you, there's so much information out there on the web, on DATAVERSITY, on good old YouTube. I'm not saying spend your weekend on data management YouTube videos, but there's a lot of great material. And the vendors themselves, having knocked them in the beginning: I've learned a lot from the vendors. A lot of them have great information about their kinds of tool patterns. In a typical organization, there are many use cases and many data models, and this is really just a subset of some of the options. But as we think of patterns, and I'm trying to find my words here, the report or the analytics visualization from the very beginning of the presentation might be a starting point. Maybe your bus matrix is a starting point. Maybe it's user stories, or a design thinking workshop, or something with your users to really understand what you need, and then you work backwards into "what do we need to support it?" Again, I always think left to right, as kind of a warehousing person, so your operational systems are on the left. At the bottom, I might have an operational system that is my accounting system, maybe on a relational database, or my CRM system. Why are they on a relational database? Because those are really good for that application. I want my bank to know that I have the right number of accounts and the right amount of money in my bank account. I actually have a financial institution that just gave me a set of stocks that I didn't purchase. I'm still wondering what to do with that.
So, in one sense I'm glad that I have a whole new set of stocks in my portfolio, but longer term, I'm a little bit nervous working with that financial institution and what sort of governance they have. I assume it's a relational database, but who's governing that? How did I just end up with stocks in my portfolio? I shouldn't have said that; don't come find me. So, you know, that's one of those great mistakes. A web application: again, maybe that's more of a NoSQL, or, you know, there are so many different applications. I probably wouldn't put everything for my web app in a relational database. And then how do you move that to analytics and reporting? Not necessarily the ETL or ELT or how you do it, but that data exchange. JSON, XML: there are a lot of different patterns for how you even format the data for exchange, either to another in-house system or outside the organization. A lot of industry organizations have XML standards or JSON standards and things like that. And then how do we store it for analytics and reporting? A lot of use cases. If you notice at the bottom, though, I am a fan of having some sort of master data or hierarchies or data quality as that hub that can feed some of those. Maybe your source is a source of truth for some of that data. But when you're thinking of operational systems, you really want to think about what that master data is, because, as we already saw in the examples, almost everything we're reporting on touches it. The reason it's called master data and reference data is because it's used everywhere. It's that master source, and it's referenced by a lot of different areas. So starting with that is going to have a lot of benefit, because then, if I want to do a star schema, those dimensions can come from there. I want to make sure my operational system...
You know, my stock trading company, knows that I'm the Donna Burbank who owns AT&T, and that a Donna Burbank somewhere lost some stocks, because they're there in my portfolio. Still flabbergasted by that one. So getting that right is only augmentative (is that a word?); it can only add to these storage areas for analytics and reporting. It doesn't have to be just one, right? And again, for all the negatives people raise about data warehousing, I'm a fan of it; it works and it still will work. So is everything a monolithic warehouse? Maybe it's "yes, and," right? We have the financial reporting, we have the corporate warehouse, but maybe I want to do graph storage somewhere else, or data vault for a particular use case. It doesn't all have to be the same. And then even on the consumption patterns: the good old cube, which is your star schema approach, your bus matrix type approach. I already mentioned flattening it out. Time-sequenced data sets for certain use cases, graph databases for different patterns, et cetera. Again, a lot of different choices, and I can't say that any of these are inherently good or bad, but map it out. Again, there's been some good chat: often data quality problems are caused by things that have nothing to do with how the data was stored, or by vague definitions; it could be how it's stored in separate places, et cetera. But just mapping it out and making sure you're doing it intentionally can solve a lot of these problems. We had one whiteboarding session with a customer with a great sense of humor. We were mapping out both the process and the data, and you start to map out your data, and we've all seen that: the lines and the lines, systems going back in and going out and around. She said, "Wow, when you map out what we're doing, we look ridiculous, don't we?" And again, I don't think anyone decided to have these lines going all over the place, but that's the problem.
No one really did decide. So even if it's a quick whiteboarding session on the back of an envelope, that's still better than not doing it at all. Really thinking this through is only a positive. So, I do want to leave it open for questions, because I know there are generally a lot and you guys are never a shy crowd. Just quickly in summary: analytics and reporting are growing and growing in importance, and that calls for a strong architecture, which is not one monolithic choice; it's a bunch of, dare I say, a fabric or mesh of different options. There are a lot of different choices you can make; just focus on those core fundamentals. I am as guilty as anyone else: there's a new cool technology and I want to use it, and I'll play around with it, and that's fine to play around with it. But just make sure the use case you're using it for in your enterprise is the right one, and choose wisely. Before we open it up for the chat and the Q&A, just a blatant plug for next month's session, which is on metadata management. Another blatant plug: we do this for a living, so if you need help, think of us at Global Data Strategy; our website is there. And without further ado, I will send it over to you, Shannon, to open it up for Q&A. Donna, thank you so much for another great presentation, and thanks to our attendees for being so engaged in everything we do here. If you have questions, feel free to submit them in the Q&A portion of your screen for either Donna or Abhi. And just to answer the most commonly asked questions, a reminder: I will send a follow-up email to all registrants by end of day Monday for this webinar, with links to the slides, the recording, and anything else requested throughout. So: is logical like data views? I'm sorry, I don't understand that question. What was that? Is logical like data views? Logical. Yeah. Yeah, yes and no. So, back when I had that data model slide:
Often "logical" is sort of used as a shorthand for the business view, so in that sense, yes, it can be a view on top of your data. But logical as I meant it was more: can a customer have a product, how do customers and products relate together? One of my buzzwords I think I had up there was that it's almost more your semantic layer, which is driven by that logical view. So if my logical view is "this is how customers and products and locations fit together," you might generate some business views that can help drive that semantic layer, which may drive some views. So yes and no. You have your physical layer, and then you have a layer on top of it that may be instantiated through a view. The view itself is at the physical layer, but the business logic is at the more logical layer, if that makes sense. Abhi, feel free to jump in on any of these questions if you want to, if you'd like to add anything. And are there a good set of question standards for evaluating what is the best solution? That's a great question. Not anything I can share offhand. But some would be, you know, the question I'd start with is: what are the questions you're trying to answer? And that's where, if the person is saying, the one I keep saying, "I want products by" something, and you keep hearing this word "by," that kind of leads towards a data warehouse. So one of them is the type of question that's being asked. The other one can be the "when." I always go back, with any question, to the who, what, where, why, when.
So the "what" might be "I want sales by customer." The "when" might be "I need stock prices by the second so I can make informed decisions." Well, that might be data streaming. Or maybe some of the best are user stories. "As a user, I want to understand my financials that I need to report to the street": well, that's probably a data warehouse. "As a user, I want to know instantaneously when there's a cab available, or an Uber available, at the airport": well, that's probably something much more real time. So there are a lot of tools in the toolkit, but it probably goes back to the who, what, where, when, and why. And some of the "what" might even be "what am I trying to get?" If it's in a relational database and I'm summarizing it, again, that might be a great warehouse. If it's sensor data from our machines, and I need it in real time, that's something else. No matter what question is asked, I like to go back to the who, what, why, when, and that's probably as simple a way as any to get there. Hope that helped. So, what is a typical startup time for beginning to use NoSQL options? Looking to augment relational databases, but the enterprise is raising concerns about time to value. Oh, that's a tough one. Often a good way to do it is to have an exploration phase and then try to scope into some sort of quick win. And you know, for your company, what "fast" looks like. That's a question I always ask. You may just have this inherently in you because you work for the company, but as a consultant I ask, "What is quick?" Some folks will say, "Oh, a year," and some folks say a couple of weeks. So it's about defining it. I'm that nerdy data modeler who always asks, "What do you mean by...?" And so, what do you mean by quick?
And then, you know, maybe set up a proof-of-concept sandbox, being very careful that the sandbox doesn't become production. So one thing to ask is: is this technology even suited for purpose? Start with that, to answer those questions. And then one way to make sure you can get faster time to market is scoping some sort of pilot that's small enough that people can get some value, by limiting use cases or limiting volume or something. It's a hard question to answer because it's different for every company, but I would say try to get something out in a few months that's at least a piece you can start to show people, and then build on it over time, and again, in an architected way. Make sure your sandbox doesn't become production, but try to limit the scope so you can show some value and even see if it is the right fit, in a phased, exploratory way. It's still a win if at the end of the exploration you're like, "Nope, that's not going to work." I hate the term "fail fast"; I'd rather succeed fast. But you're trying to scope your use cases small enough to test that out. Abhi, anything you want to add to that? No, I think that was pretty self-sufficient. I love it. I have a request here for a topic. We will be putting together topics for the 2023 season soon, but: a session on the differences between data mesh, lake, lakehouse, fabric, et cetera would be helpful, especially if real-life examples can be shown. Most articles are so confusing; everything looks the same. Anything you want to comment on that, Donna? Well, maybe it's a plug. We just did... what was it? You tell me, Shannon. It was DATAVERSITY's Data Architecture Online. We had a keynote panel just on that topic. So maybe we can send that link. I don't know if that's available to everybody.
But yeah, that's a good one. We are going to be putting together topics for next year. I both love and hate the buzzwords, but they are out there, so maybe some clarity would help. And that will be featured in this year's Data Architecture Online as well, pretty consistently throughout the day. And also: does a data lake provide value if you don't have a, quote-unquote, big data storage need, but have data scientists or analysts wanting to pump in external data and do ad hoc analysis? I think so. In fact, for a lot of folks the data lake is almost just your landing area, your, quote-unquote, landing area for the warehouse. It could be text-based data, it could be non-textual data. It's that need for "I don't know everything I want to model before I even start." The beauty of a warehouse is that I have it planned and organized ahead of time, but you don't necessarily know all those answers, so it's about having a low-cost way to store it. But it's like your closet: put too much in there and it becomes garbage. But yeah, that idea of a data warehouse, I'm sorry, a data lake architecture; especially the cloud systems have that idea of just that storage pattern. I would say yes, you don't even need to have, quote, big data to use a data lake architecture pattern. On your diagrams, where do you land information architecture? Where do I land information architecture? There's another word I could have added to that. I would say data architecture and information architecture are related. I guess information would be more broad; you could have documents in there. So you could say my diagrams are information architecture, but they probably aren't as broad, because I don't have things like documents. I didn't have everything in there.
I also would say information architecture should definitely look at the semantic layer. I see information as more holistic than data, which I tend to think of more as data in a database or data in streaming and things like that. So it looks like everyone's been... oh, we've got more questions coming in. But couldn't information architecture not influence master data management? Could it not? There was a double negative there that confused me. It could. It could. I guess my definition of information is that it's broader than data, so it could influence master data management. Or, if, say, your information is just documents, you may not have master data, but you would certainly have reference data. Your reference data is your hierarchies, your tags for documents and things like that, so I think they should still be related. I also think, you know, sometimes we talk in circles with information versus data versus knowledge. But yeah, again, mapping it out. I could be neutral on what you call it, but yeah, I would say data is a subset of information. That's all the questions we have coming in; I'll give everyone a quick moment. I was going to say, that'll be a first. I know. So, which technical certification do you consider, if any? Oh, there are a lot out there. I think it's the DAMA CDMP, which is your Certified Data Management Professional. I like that because that's your sort of broad one. TDWI has some great certifications, especially around data warehousing. And again, we both love and hate the vendors, but I do think a lot of the vendor certifications are valuable. Are you cloud certified? I guess I can say some of the names since they're big enough: AWS and Google and Azure. I think, and we do a lot of hiring:
The perfect architect in this world can do both, right? They know the fundamentals, or even just university courses in relational algebra and some of those theories. Here's my perfect candidate: I know what relational algebra is. I also know data science fundamentals and graph patterns and all of that, so I know the academic principles. I know enough of CDMP that I know the difference between data governance, master data, and reference data, and I know the dimensions of data. And I can open up something like Azure and understand all of the architectural patterns and know when to use a key-value store versus a document store versus... again, I'm talking about a unicorn. But I think some sort of mix of the theory and some of the tools, because the tools are hot and they have their own patterns. And I would almost question someone who said they're an architect but doesn't have any certification or has never used one of the platforms. You've got to roll up your sleeves and have done it, right? So hopefully that's a good mix. And again, a lot of it is out there, and a lot of it is very low cost or even free. So hopefully that helps. Very helpful. Thank you, and perfect timing, because that's going to bring us right to the top of the hour here. Thanks to everyone again for being so engaged in everything we do; love the chat that's been going on, as always. And just a reminder again, I will send a follow-up email to all registrants by end of day Monday with links to the slides and links to the recording. Thank you to KatanaGraph for sponsoring today's webinar. Thank you, Abhi, for joining us; a pleasure. Hope you all have a great day. Thanks, everyone. Thanks, everyone. Bye bye.