Hello and welcome. My name is Shannon Kemp and I'm the Executive Editor for DataVersity. Today Donna will discuss why a data model is an important part of your data strategy. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. We very much encourage you to chat with us and with each other throughout the webinar. To do so, click the chat icon in the top right corner of the screen to activate that feature. For questions, we'll be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share your questions via Twitter using hashtag lessons data modeling. As always, we will send a follow-up email within two business days containing links to the recording of this session and any additional information requested throughout the webinar. Now let me introduce to you our speaker for our new modeling series, Donna Burbank. She is a recognized industry expert in information management with over 20 years of experience in data management, metadata management, and enterprise architecture. She is currently managing director of Global Data Strategy, an international data management consulting company. Her background is multifaceted across consulting, product development, product management, brand strategy, marketing, and business leadership. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa, and speaks regularly at industry conferences. And she just released a learning plan course on metadata management with us at DataVersity, so we'll get more information to you on that. And today, Donna is joined by a guest speaker, Nigel Turner. Nigel is principal consultant in EMEA at Global Data Strategy. He specializes in information strategy, data governance, data quality, and master data management. 
With more than 20 years of experience in the information management industry, Nigel started his career working to improve data quality, data governance, and CRM within British Telecommunications, and has since used his experience to help over 150 other organizations do the same. And with that, let me turn it over to Donna to get us started. Hello and welcome.
Thank you, Shannon. Always a pleasure to work with DataVersity, and I'm really excited about this new data modeling series. I think Shannon already gave a great introduction. Some of you may know me already; I see some familiar names on the slides. And if you're in data modeling, you may have known me from a couple of places. I worked for many years doing product management for ER/Studio, and did their process modeling tool and their data modeling tool as well. I was in the OMG doing some of the BPMN standards there. I also spent many years at ERwin, so many of you who are ERwin folks might have known me there. I've probably been with most of the data modeling tools out there. I also had a stint with the CA and Platinum Technologies metadata repository. So as Shannon mentioned, metadata is near and dear to my heart. I've also coauthored two books you might be familiar with, with Steve Hoberman, another great data modeling expert. One is Data Modeling for the Business, which is a great kind of overview. It actually fits a lot of what we're talking about today: basically why a data model has business alignment and how it can help you on the business side and not so much on the technical side. If you are looking more for a how-to on data modeling, I've also coauthored a book, again with Steve Hoberman, Data Modeling Made Simple. It is specifically with CA ERwin, or formerly CA ERwin, but it's a good how-to on data modeling in general. I am on Twitter, at Donna Burbank. 
And there's also a hashtag for this, which is, correct me if I'm wrong, Shannon, lessons data modeling without the in in it. Shannon mentioned that Karen Lopez had previously done this. I am not as good as Karen at tweeting in six different places and doing a marathon and cooking eggs at the same time, but I am pretty good at trying to monitor it. So if you do have a question or a comment during the presentation, either give a shout out to me at Donna Burbank and/or use the hashtag lessons data modeling. I am joined today by my partner in crime, Nigel Turner, who is a partner at my company, Global Data Strategy. We do a lot of strategies. If you didn't get it from the name, that's the topic of this presentation. He's worked with me on several projects, doing data modeling and data definition and data quality for some of the efforts we'll talk about, so I thought it made sense for him to join today. And Nigel, I'll let you introduce yourself.
Yeah, thanks Donna. Good morning, good afternoon, or good evening, depending on where you're listening to this webinar. I think Shannon introduced me far better than I can introduce myself. I suppose the only other thing I'll add is that, like Donna, I'm very active in the Data Management Association, and I'm currently Vice-Chair of DAMA in the UK. But unlike Donna, as you can probably tell from my accent, I'm not based in the US but in the UK. And if you follow the news, I'm sure you'll be aware that the UK has been pretty quiet in recent weeks. We've just had the resignation of a prime minister. We've left the European Union. And the opposition Labour Party is at civil war. So it's pretty good today to focus on the relative certainties of data modeling and data management. It all seems so calm and sanguine compared to what we're putting up with in the outside world. So what I'll do now is calmly hand back to Donna, who'll put today's webinar in the wider context. Donna?
All right, thanks Nigel. 
And Shannon mentioned the data modeling series. I just want to give a quick overview of the lineup there and give a shout out. Many of you might be familiar with the data modeling series that Karen Lopez has run for many years. I've been a guest speaker on that and probably listened to a lot as well. I'm a big Karen Lopez fan. Karen Lopez has moved on, and I'm taking over her professional reins. So this is the lineup for this year. And we're always looking for new topics. So at the end you'll see, if there's something not on here that you're just dying to hear about in data modeling, we can cover it in a future series and/or blogs. There's a lot of content. Shannon always tells me that data modeling is one of the most popular topics at DataVersity, so I'm happy to give you more if you're looking for it. This month is why a data model is part of your data strategy. Shannon and I went back and forth on topics. The beauty of data modeling, and I'll talk more about that in the session, is that it does so much. We're going to talk a lot today about the business side of it and how it can really cover things comprehensively, from the business down to the technology. So in picking topics we're trying to do a mix. In some cases it makes sense to get very granular: how can we do data modeling for XML or JSON, some of the emerging technologies? Others are broader: how do we do data modeling with metadata management? So you'll see that from month to month we try to break it up. Some get down in the weeds, some get high level, and we just wanted to make sure we cover something for everybody, because data modeling is so broad. From time to time, like I'm doing today, we'll have guest speakers. On UML, I'll give a call out because I think they're both on the call: Norman Daoust and Michael Blaha, two excellent data modelers, especially when it comes to UML, will be joining me for the September one. 
So please join as many as you can and stay tuned, because I think there are some good topics coming up. Next month is big data, which should be fun. So what we'll cover today: what a data strategy is. We've got a lot of questions about that. I think probably second, or up there with data modeling in terms of popular topics, is data strategy, which I am pleased to see. I think it's a good thing, and as I mentioned, at our firm we do a lot of that. Companies are saying, how can I start with the overall picture before I start any particular implementation? And that's a great thing. We'll talk about how you can use data modeling for those top-down business requirements as well as the bottom-up technical landscape. And then, importantly, how data modeling fits with a lot of these other data management disciplines. So without further ado we'll jump in, if I can move my slide.
So this is our framework for a data strategy. Everyone has their own definition of a data strategy. This is ours. It really depends on what you're doing in an organization. I'm a big fan of starting with these top-down business priorities. For any company, before you start anything with data, it should be obvious, but why are we doing this? What is our business strategy? Are we trying to get into a new market? Are we realizing that data is our asset in the organization and we need to manage it better? Do we want to optimize our business processes through data? One of the reasons I'm still in data is that it is so exciting right now, with things like big data and IoT and all these new technologies, and I think business people more and more are getting that and are seeing data as a valuable asset. Where it breaks down, I think, is that a lot of business people have an idea of "these are some neat things I've heard about" with strategy, but get a little confused towards the bottom with how we actually implement it, and that's where IT can fit. 
So having IT and the business work together is a huge asset when it comes to a strategy. As for where the data model fits in, we'll go through each of these layers in the presentation and relate that to both data modeling and data strategy. I'm a big fan, when I do a data strategy, of always starting with why we are doing this. Again, that seems obvious, but it's amazing how sometimes people say, hey, I want to do a big data project. Why? Well, because I read about big data and it sounds neat. That's probably an extreme, but unfortunately sometimes it is not. The other side is the bottom up: what do we have today? As you know with data modeling, that's a great thing a data model can do. You know, SQL Server and Oracle and some DB2 database that Joe wrote 30 years ago and no one has any idea what's in there. So how do we get that inventory, not only of the legacy systems but of some of these new big data systems or unstructured data? How do we give that some structure in a data model? And then where data models apply is in all these layers we'll go through. So again, if we're going to integrate the data, you definitely need those data structures and the metadata around them. If you haven't got that already, I'm a big metadata fan, and that's one of the beauties of data modeling: that metadata, both technical and business, can be stored. So that's why we are doing this. Integration and data asset planning is the stuff in the middle that gets a little more interesting. To me this is where the business and technology start to mix. When you're doing things like master data management or data warehousing or BI, that's where you're trying to get those valuable nuggets of information. How do we get a single view of customer? How do we see trends in our customer buying patterns? It's kind of the fun stuff. And then, do we have the quality around it and the modeling to make that happen? Data governance: Nigel's been doing this for many years, and he'll probably talk about this later in the presentation. 
And a data model is a huge asset in data governance to actually make this actionable. But as you know, governance is a lot about the people and the process and the culture, and building a data model helps enforce that. So we'll go back to this at the end. This is our framework, which I like because it covers everything from what we are trying to do as a business to what our existing technical landscape is, and then how you manage and massage and make interesting all this stuff in the middle.
I'll just talk about this quickly, but I'm passionate about it so I need to cover it. One of the reasons I'm still in data is that my first degree in university was economics, and I started out as an economist, which I'm now finding is a data scientist. We did a lot of statistical analysis, but part of that was going through data and doing software analysis around data, and then I went to the data side. But I still have that business side of me that I enjoy, and I think in data now that ability to meld the two is bigger than ever. So I find there are two ways companies are looking to transform their business using data, and this is exciting for the business. I see it in many ways, but there are kind of two camps. One is more optimization. How do we become a data-driven company? How do we build better marketing campaigns by understanding our customer, getting a 360 view, getting competitive information, et cetera? How do we build better products by understanding patterns? Better customer support: we look through support logs and link that with customer data, et cetera, et cetera. That's taking what you do and doing it a lot better through data, which is exciting, and a lot of companies are getting that. 
I think what's even more exciting is this transformative nature. I've worked with several organizations across different industries where the sign on the wall, and I wish I had been there to come up with it, but they came up with it before they asked me, is "we're becoming a data company." I think a lot of people are realizing this, whether it's energy companies with IoT, or telco with all of the cell phone log data and usage data and footfall analytics, or insurance with all the data they collect on their customers. Can we monetize some of that? Is data now the product? Think of Google. Data is their product. And so I think more and more companies are realizing that data itself is a valuable asset. So how do we do something different that we hadn't before, or augment our existing business model? Think of telco. A lot of what they're selling is data. In a way, in telco, the network is a bit of a commodity now. A lot of people can do that. You want to optimize the network in terms of the optimization, but how can we use that data to do something different? That's where I think, if we get the data strategy right, IT can have a seat at the table and the business can do some interesting things. So that's where I find strategies exciting: can we mix that business and IT and really create something new that wasn't there before? So I'm going to pass it over to Nigel to talk a little bit about, you know, it might be new to you, what we are talking about when we talk about a strategy. So Nigel, I'll pass it to you.
Okay, thanks, Donna. Donna highlighted earlier, I think, what we regard as some of the key components of a data strategy. But I think it's important in the context of what we're talking about that everyone understands the relationship between a business strategy and a data strategy. And as Donna said, in any data-driven company, these two strategies now become increasingly interdependent. 
So what we've done here is just put up some pretty simple definitions of what we think a business strategy is and what we think a data strategy is. And if you look at those definitions, you'll see that they've got quite a lot in common. Both are fundamentally plans of action or roadmaps, and usually encompass a two-to-five-year timeframe, because these days that's about as far as a horizon can be realistically anticipated or planned for, given the pace of change in both the business world and, of course, in new technology. Basically, a business strategy lays out what the business is trying to achieve and how it intends to do it. And from this, there should be some very clear, actionable goals and objectives. A business strategy needs to take into account external trends which impact the organization: for example, what is the competition doing? How is the market changing? But also internal drivers: what new products do we want to put into the market? How are we going to make our processes more efficient? What new skills do we need, et cetera? So that's a business strategy. A data strategy, you know, is in many ways very similar. A data strategy should also be a plan, which means that it should contain clear and measurable goals and objectives. I've seen some data strategies that don't work simply because they are statements of intent without any clear milestones or clear activities contained within them. The difference is that the data strategy is about how the data used by an organization, whether it's internally generated or externally sourced, should be managed and enhanced to make it fit for the changing purposes of the business: how it should be controlled, how it should be made secure, and how both business and IT should be looking to maximize its value and exploit it as fully as possible. So there are clear links between these two things. 
And if you look at the relationship between the two, I think you could have argued a few years ago that data strategies were very much the subservient partner. The business would set its strategy. That strategy would inform and guide the IT people, who very often were responsible for the data strategy. They would in turn put some sort of data strategy in place, and that would basically be it. But I think in the day of the data-driven enterprise, as Donna said, these two strategies clearly must be joined at the hip. And in fact, rather than the data strategy simply being the passive recipient of a business strategy, it should be actively part of the business strategy itself in terms of evolving the business strategy. To give you an example, the business strategy might say we want to sell a new product to our top 100,000 customers. That's fine, and that's a clear business objective. But the data strategy must therefore demonstrate how you're going to identify who these top 100,000 customers are. How are you going to target them? And how are you going to effectively market to them? So it's very much become a two-way dependency and an integral dependency. That means as well, of course, that new technologies and new uses of data can actually suggest new things that the business can do and achieve: for example, selling more via social media channels, creating customer communities of interest, or simply using technology to better analyze customer behavior so that the organization can react. But they must be joined at the hip. So what's the relationship between these and data models? What I've tried to do here, and I would stress that this is very much a simplified view, is show that relationship. In any real organization, obviously, the derivation of a data strategy is usually a lot more complex than this. 
But I think what it does do is highlight some of the key activities that need to be carried out in order to ensure that you have a joined-up business strategy and a joined-up data strategy. The other thing I would stress as well is: don't read this from left to right as some sort of waterfall process, because it's clearly iterative, as I've mentioned earlier. So you arrive at an initial data strategy, you revisit the business strategy, which might in turn change the data strategy, which might in turn change the business strategy. So the two things definitely come together. But I think what this simple diagram does do is suggest that if you're going to have an effective data strategy, you need two primary starting points. First, as Donna has emphasized as well, the business strategy, the business goals, and from those the key data needs that the business has to achieve its goals, need to be known, understood, and clearly laid out. The second starting point, underneath, is that in a data strategy you need to know what data you're actually going to encompass within the strategy. I mean, every organization has more data than you can shake a stick at. So what's really important is that you focus on the data that really matters to support the business. You prioritize your efforts on that, and then, once you've done that prioritization, you make the relevant investment of time, resource, and money into enhancing and improving the data that really matters to the business. I think that's an absolutely critical thing to do in any data strategy, and sometimes where I've seen data strategies go wrong is when they fail to do that. If you try to have an all-encompassing data strategy that changes all data within the organization, in effect you end up changing nothing, because it simply gets spread too thinly. 
So the key point about all this is the usefulness of identifying clear data priorities; being very clear about what that data means, and we'll come back to that when we talk about data definitions; understanding how good that data is now, how it's held, how it's stored, how it's accessed; and then defining, to meet the business strategy, what that data needs to look like. So the data strategy in effect becomes a roadmap to link the to-be picture of the world with the current picture of the world, and the activities that involves. So what I will now do is go back to Donna, and Donna's going to talk a little bit more about that bottom bit, defining the business data model, and how that's done. Donna?
Great, thanks. So as I mentioned, the beauty of a data model is that it has both the top down and the bottom up. I'm going to go a little more deeply into what that means. If you're in data modeling, this type of pyramid probably makes sense to you, but I'll go through it quickly. Data modeling isn't just one thing. There are different types of data models and different audiences. People have different terms for all of these, actually. In the Data Modeling for the Business book we did, with Chris, we did a survey and there was no one consensus, but the general consensus was conceptual, logical, physical in terms of the names. As for what they do: at the conceptual level you're really defining the business concepts of an organization. The audience here is business stakeholders, and I listed that explicitly because the purpose is to communicate and define key business terms and rules. At the lower level you have a physical data model. Its purpose is to technically implement a physical database, and your audience is going to be DBAs, developers, technical folks. Where I've seen things go wrong: generally, unless you've got a very technical and very interested business user, don't show them the physical data model. I could have a whole presentation on this, catering the model to your stakeholder. 
But even at the logical level, so that's that sort of nice mix of understanding, a little deeper than at the conceptual level. At the conceptual level it really is, as Nigel mentioned, defining, because you can't define everything in the organization, what your key business concepts are, or the key data that drives the business, and what the definitions of that are. If you've gone through this, you'll realize there are some basic definitions that might need to be clarified. What do you mean by something as simple as month? Is it fiscal month? Is it calendar month? For a number of customers I've gone into with severe data problems, the cause was, you know, a lack of common definitions: each team, each group had different financial results because they were using different definitions of region, et cetera, et cetera. So this is very important, and those definitions are important. At the logical level you go one level deeper, so you are defining things like data entities and attributes and the business rules. That's sort of your first cut for the physical database. You are starting to think at the physical layer. It isn't a physical model, there are clear differences, but it's somewhere in between, and I would not necessarily show that to a business user. Generally the audience is more of a business analyst, who should be able to understand both the business and IT, or a data architect or that type of team. We'll show some examples, but you also see that this is a pyramid. In terms of volume, when you get down to the physical layer, yes, you may have thousands of tables out of necessity, because that's what's running the business. When you get up to the conceptual layer, simplicity is key. There are a few basic concepts that run the business, and it really should fit on one page. We've worked in some large international organizations. Maybe it is the top 20 entities that we're really using to run the business. It might be slightly bigger than that, but the goal is simplicity, not complexity, here. 
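To make the three levels of the pyramid a little more concrete, here is a minimal sketch in Python. It is only an illustration, not something shown in the webinar: the class names (`Concept`, `Attribute`, `Entity`), the attribute names, the `SQL_TYPES` mapping, and the customer definition are all assumptions chosen for the example. The idea is that a conceptual concept carries a business definition, a logical entity adds attributes and keys, and a physical table can be generated from that.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Conceptual level: a business term and its agreed definition."""
    name: str
    definition: str


@dataclass
class Attribute:
    """Logical level: an attribute with a platform-neutral type."""
    name: str
    logical_type: str
    is_key: bool = False


@dataclass
class Entity:
    """Logical level: a concept plus its attributes and keys."""
    concept: Concept
    attributes: list = field(default_factory=list)


# Assumed mapping from logical types to column types for one SQL dialect.
SQL_TYPES = {"identifier": "INTEGER", "text": "VARCHAR(100)", "date": "DATE"}


def to_ddl(entity):
    """Physical level: render a CREATE TABLE statement for the entity."""
    cols = [f"  {a.name} {SQL_TYPES[a.logical_type]}" for a in entity.attributes]
    keys = [a.name for a in entity.attributes if a.is_key]
    if keys:
        cols.append(f"  PRIMARY KEY ({', '.join(keys)})")
    table = entity.concept.name.lower()
    return f"CREATE TABLE {table} (\n" + ",\n".join(cols) + "\n);"


# Hypothetical example entity; the definition text is illustrative only.
customer = Entity(
    Concept("Customer", "A person who has rented a movie within the past year."),
    [
        Attribute("customer_id", "identifier", is_key=True),
        Attribute("full_name", "text"),
        Attribute("last_rental_date", "date"),
    ],
)

ddl = to_ddl(customer)
print(ddl)
```

In this sketch the definition lives only at the conceptual level, which matches the point about audiences: a business stakeholder would review the `Concept`, while a DBA would review the generated DDL.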
So a picture is worth a thousand words. Here are some examples. Again, I think, especially at the conceptual model level, and I'll talk more about this, keep it loose. It's about the definitions and getting those business rules, and I'll talk more about that. When you're talking about a customer here: a customer is a person or organization who's rented a movie within the past year. Well, we actually sell only to people; we don't sell to organizations, B2B. You called that out because you showed the definitions right here. There's some cardinality, there are some business rules, and you might show attributes. But again the goal is defining those key concepts that are important to the business and getting the definitions. Can the customer make more than one payment on a single credit line, et cetera? All of these rules that start to define the business start to be fleshed out here. The logical model level is where you are getting more technical. So yes, you do want to have your keys and your data types and your full cardinality, and you probably start normalizing. It's doing two things: it is defining business rules, often as a precursor to that physical design. We'll now come back to the physical model level. I did want to talk more about the business model. I am a big fan of this type of model, and the first time I did this I sort of even laughed at myself. I kind of call this a graphical data model. Was this too simple? That was my fear. I'm a big fan because it really helps define the business in the big picture. In some of the modeling tools you can put a picture here. I often literally do it in Visio. It's meant as a conversation piece. Again, this wouldn't be your final data model, but it's starting the conversation. I've heard terms like customer. I've heard terms like client. When I talk to support they talk about clients. When I talk to sales they talk about customers. Is that the same thing? 
Or is it that once you've gotten a maintenance contract you're now a client, and when you're still a prospect you're a customer? In this case maybe it is the same person. I literally had the same picture for every person. A salesperson talks to a customer; maybe you should call that a prospect. Are there things like the concept of householding? Would you want to capture that this customer is the father of this customer, and create those relationships? What are we selling as a product? I've actually been in the room when you start to show this to a business person and their eyes light up. I've had feedback, and again it might sound sort of fluffy, where someone said, actually, for a customer like this, could you make the customer a little older? You could have laughed, but not really, because they were actually doing a campaign to try to get older, retired, high net worth individuals, and that was the customer. That was a very different campaign from a follow-up campaign where they wanted a picture of someone like this with headphones, because it was more of a social media driven marketing campaign. So this is a data model, but it made it very real. Or you could say product, and they say, well, there's a box, but we don't sell products, we actually sell services. We call it a product, but it's actually a service. So a lot of this brings it home. I've also shown it to fairly senior business-level people who sort of understood the need for data but probably didn't understand the complexity. When you start showing these arrows, that helps when we're trying to say you need to build something like a warehouse, and it's the relationships that are complex. That really turned the light bulb on for them. 
So I've found this works. I've put it in presentations and wanted to take it out, and folks said no, no, keep it, that really explains our business. I often do it myself when I go into a customer. I might not show it to them initially, but it helps me understand their business. What is their business? I might start to draw out the entities and relationships, and often do a picture, because it helps me. This really should describe the business. So I'm a fan of this. This probably is not your typical third normal form database, so the data modeling vendors might cringe, but I've just seen it work, and I would think about doing this, whether you show it to your audience or just do it yourself. I think this really explains that a data model drives your business. So I spent a little bit of time on that, but I think it's a slightly different approach that I thought folks might want to try. So I've shared my secret with you now. Now we're going to talk a little bit about creative ways. This was one creative way to try to get the business involved, because when you talk about a high-level data model, it really is about whatever way you're going to get those definitions and get the business rules from the business. Another great way is whiteboarding, which Nigel and I have used, so I'm going to pass it over to Nigel to talk more about that.
Okay, thanks Donna. I guess this is about as low-tech an approach as you can possibly get to data modeling, but I personally know no better way of starting to generate the sort of business concept models that Donna's been talking about. Basically the only equipment you need, as you can see there, is a whiteboard, Post-it notes, and a few pens, preferably erasable ones, so that you can move the Post-it notes around and connect them and join them in the different ways that Donna mentioned earlier. 
And of course the big advantage of this low-tech approach is that you can gather a load of business people and a load of IT people around a single whiteboard and get them to collaborate and work together to begin to flesh out the data model. I've found in particular that the business people appreciate this. They begin to get modeling, and they begin to get what data modeling is all about and why it's important. Of course, the other advantage of doing modeling this way is that, as Donna mentioned earlier, and I've seen this happen as well in companies I've worked in, people start to question what these things actually mean. So somebody will say, well, what do you mean by a policy? Well, I mean something different. You define a policy as a sort of business rule for how to manage certain data sets; I regard a policy as a written document we must adhere to. So already you come up with the differences in the definitions, which is very useful for reasons that we'll come to later. But the point of all this is that it gets people to work together collaboratively, and at the end of the exercise, which can be completed pretty quickly, I've seen a whole business data model fleshed out in a couple of hours using this, you've got a clear starting point and a baseline from which you can then start to build more detailed conceptual data models. Once you've done that, then what do you do next? 
Well, the first thing that you need to do, which Donna's already alluded to, is, in any data model and in any data strategy, as we've said earlier, you have to start to identify the data that the business really depends on to succeed: what data is critical now and critical for future success as well, what data is known to be deficient and needs attention, and what new data the company needs to acquire or create. Here's a very simple example of that. There's a new product launch that the company wants to do. It needs to run a marketing campaign, and it knows it needs better customer information. So in order to make that marketing campaign a success, what it needs to do is filter and focus on those data areas, those data objects, that are really important, and there are five that we've listed there. And how do you identify what those five are? Well, there are a number of different ways of doing it. Some of the techniques we've already talked about are one way of doing it. Another way is that you simply go around and talk to people and ask them what data they really depend on and what really matters to them. So stakeholder interviews and stakeholder workshops are always a good way to flesh these out. Of course, another way of doing it is to start analysing business processes and start to understand how key data interacts with the processes that the company is running: what data is important to get right, and what data perhaps doesn't matter too much if it's not right. You know, if for example you hold customers' telephone numbers, and you know those telephone numbers aren't very good, but you stopped telephoning customers two years ago anyway because now you contact them by social media or email, then it doesn't really matter. So that's the way that you can start to focus and prioritise. And just to give you an example of this, Donna and I worked with a major UK energy company last year, and using an approach like this, understanding what was really key to their future 
business, they distilled all their key data down to around 120 objects and attributes. So that isn't just data entities or data objects; that includes the attributes as well. And they came to the conclusion that if they got those 120 entities and attributes right, then the rest could really manage itself, and that the whole business, and the success of the business and its data, depended upon those 120 items. So all their improvement and governance efforts have been placed on that ever since, and it's showing rewards for them.
And then the next thing, once you've identified what these key data objects and attributes are, and Donna has already mentioned this as well: it's really, really important that you get some very clear definitions of what that data means. My definition of a data definition is simply a unique way of identifying and describing a key data element or a group of data elements. Without clear, agreed definitions of key data, organizations are a bit like Bruegel's famous painting here of the Tower of Babel: they all think they're speaking the same language, but actually they're not. And it's that confusion of language and cultural difference that can often cause a lot of problems when you're trying to build data strategies.
And just to move on to the next slide, here are some examples of things that Donna and I have come across in some of the work that we've done, and some of the questions people keep being asked all the time. What is a household? How do we define a monthly calendar? Donna has already mentioned that one. What the hell is a PEG ratio? Does that mean the same thing to everybody? Excuse me. And in the energy company that we worked with last year, we had the classic example of what do we mean by a customer. Somebody from marketing, who was in a workshop that I was in, said, well, it's someone who we target in our campaigns. And then somebody in sales said, no, actually we define it as somebody who actually buys a product or
service from us. And then finance said, well, for us it's somebody who pays the invoices and pays the bills, and of course, if you're dealing with business customers, then those people are often different from the people who actually buy the service. And then somebody from engineering piped up and said, well, for me it's somebody who requires some service and support from our operations team. So you ended up very quickly in that workshop understanding that there was a significant, fundamental misunderstanding or difference of opinion as to what a customer was. Resolving those sorts of issues, and being aware of them, is absolutely critical when you're actually building a data strategy.
And there are a lot of other benefits, I think, that come from this as well. I've listed some of them there, and I won't read through them all. I think we've already hammered the point home that a good data strategy just focuses on the data that really matters, now and in the future, to the business. Of course, once you've got clear definitions, that helps you to build business rules and therefore enforce data standards on the key data that really matters. You can then also decide what data you need to focus on in terms of monitoring and improving it. And obviously, if some of those data items that we talked about are subject to legal or regulatory control, then that's a good way as well of proving the provenance of the data and showing you're being compliant with law and regulation. The other thing we've found very useful as you do this is that you can actually publish, and make other people across the whole organization aware of, what data really matters to the company, and actually encourage them to focus on getting that data right. So what I'll do now is pass back to Donna again, who will explore the more technical, bottom-up side of how a data model can help develop these strategies.
Thanks. And so, yeah, I mentioned at the beginning that they are both equally
important and inextricably linked, the top-down and the bottom-up. So I think we gave a good explanation of the top-down and how a data model can really help prioritize, understand, describe, get buy-in from the business, etc. But that doesn't help unless you really understand your technical environment, because at the end of the day that's what's actually running the business. So one of the great things about a data model, with the majority of data modeling tools out there (unless you're using something that's just a drawing tool), is that it can help create what I like to call an active inventory of data assets.
But let's just start with the inventory part. At many companies I work with, there are hundreds or thousands of databases, and it is often the case, I sort of joked about it earlier, that Bob built this database 20 years ago and nobody has a clue what's in it. Or think of some of the packaged applications, like an SAP, where there may have been modifications, and I really don't know what that data model looks like and what the business definitions are. So most data modeling tools can reverse engineer and create a physical, and sometimes even a logical, model to understand what those data structures are. So it's very basic: know what data you have, what is an inventory of those systems.
A lot of the data modeling tools are very advanced now, and you can start to do some rationalization as well, because often, and we sort of alluded to this already, you might have 200 customer databases and 170 of them define customer in a different way. Even with something as simple as how I have my name field or my surname field: are they done different ways? That leads to data quality and integration issues, and if you're on this call you probably understand a lot of that already, but that is the source of a lot of the pain points in these data quality issues. Again, so many of these problems seem so simple at the outset, you
know, why do we all have a different definition of customer on the business side? But that's even exponentially more true on the physical side. And by the way, not only do we have a different definition of customer, but the way we store customer names is different in 17 different systems. So just trying to get that to make sense, and creating the standards around that, can be a challenge. So knowing what data you have is creating that inventory and knowing what those structures look like.
Then, know what your data means. The beauty of these model layers is that they capture a lot of this hard work we did. So just to be clear, we're big fans of sticky notes and pictures and all that stuff to get the requirements, but we're not saying that's your final model. I don't want anyone to go out there tweeting, "Donna says I should build my enterprise data model on a sticky note." No, that's just a way to get the conversation going, and then yes, you put that in the data model. There's metadata in those models that has not only the physical but the business as well, and it supports that data consistency. So, A, can you identify problems, that there are 17 different ways of storing first name? And then, and Nigel will talk a little bit more about this in the governance piece, can we create domains and data standards so that doesn't happen again?
So there are kind of two ways to use a data model. There's the bottom-up of "I want to understand what my data is and create a model from it." And there's also the top-down of, say for new development or data management changes, can I define that first name should be done this way? Or if I want to add a certain field, I'm going to do it from the data model. So that's why I call it an active inventory. It's not just a passive picture of your database environment; it's an active, living, breathing document that can actually be used in real time. And as I mentioned before, metadata is key to adding the context and definition around these, and we'll
beat this to death, this idea that we've already talked about of the definitions. Is it last name, surname, family name? In China, is it maybe different, where they have the last name first, unlike the American way? Is my listed city where the customer lives or where the store is located? Etc. So I think we've beaten that one to death, so I won't continue, although it is important.
But the beauty of a data model is you can mix that with the technical as well. What format is this in? Is it character 30? What is the standard abbreviation on the physical database? Any of you who have reverse engineered a physical model will have seen that it could be table X3 with columns C1, C2, C3. Well, that's not very helpful. So if we're going to abbreviate name, can we all abbreviate it the same way? Is this a required field? Is it nullable? All that technical metadata is important. So there's both business and technical metadata. Your technical metadata is your DDL: whether this is nullable, what the data types are, etc. And then the business metadata is these terms and definitions. This is a subset; there is a lot more than this. And, as you know, the data is the actual data: the fact that John Smith is a customer and that he has an employee ID. So, just a little bit on metadata, because I'm a fan.
The other thing that data models can help with is data lineage. So there's the defined metadata, which was defined in the structure of a database and which you're either defining in your model or reverse engineering and discovering. And then there are a lot of things behind the scenes. Again, the data modeling tools either integrate with other tools or are themselves getting much more savvy in how to do some of this in their own tool, because a lot of the folks who are doing data modeling are doing it for data warehousing. So, this idea of lineage: for example, I have this term, sales amount, on my BI report on my data warehouse. You know, it
started out on three (this is definitely a subset of a real environment) different databases: Oracle, SQL Server, DB2. We kind of transformed it through ETL, put it in a staging area, and created a warehouse. What are the rules around that? What was the initial field, and how did that get transformed? A lot of data modeling tools or metadata repositories can track that lineage as the metadata, so that when we're in a business meeting and someone says, "That's not how I define sales amount," well, we can actually see how that was calculated. Or if there's a discrepancy: "Well, we took it from this Oracle database." "Well, that was the support database. You shouldn't be calculating sales from that; that is totally different metadata." So it helps you understand where that data came from when you're doing things like auditing or understanding a data warehouse.
Or these data model design relationships. One of the nice things about defining all of these terms in a data model is that you can do what I like to call a semantic linkage. So say, for instance, in the conceptual model we decided that it was called client, and that's what business people use. Well, in all the databases it might be called customer. For now we're just going to keep it; maybe you want to rationalize these two, maybe you don't. But at least you can link the fact that client is customer at a logical level. But probably more commonly, you have the situation where on Oracle it's "cust", on Teradata it's "customer", and on DB2 it's "C-Table 16". So you create standards, or at a minimum you can link, so that when we talk about customer, it's linked to all of these different physical systems, and when you do want to make a change, you can see that linkage and how it maps to the business terms. And that's one of the beauties of doing both the business and the technical in a data model, because by definition the metadata in this lineage is stored in the model. You don't have to go back, if you take
some effort to link these. I'm not saying it's just magic, press the button; you have to have a little bit of rigor in how you build these models. But especially at the logical and physical level, it's very nice to have a map of how the term customer is actually implemented technically on these different platforms. So that's a little bit of the technical, and that is the beauty, again, of the top-down and the bottom-up: they meld nicely together. Hopefully that gave you some examples of that.
Why do we care? I thought it would be helpful, especially when we're talking about a strategy, because usually you do a strategy for something: as a result of that strategy, we need to clean up our data quality, or we want to build a new warehouse to report on customer metrics. So how does the data model fit in that? I think that's where the rubber hits the road, and we want to give some examples of things we use a data model for.
So I already talked about a data model for data warehousing and business intelligence, probably a fairly common use case, but just to kind of fill out that story: data modeling fits in in a lot of places, and unfortunately we data modeling people don't always get front and center stage. So a frustration could be, not that I'm ever bitter, that the BI tool is the big flashy thing. That's what people see: they see this flashy report and some pretty colors, and these are the answers they want, all my customers by region. Well, there was a lot of work to get that nice report, and that's the data model. That's why I like this term here: we're kind of the intelligence behind business intelligence. All those rules, all those structures, all that hard work that made your nice pretty report? Hey, that's us, we're here. What's that movie, Horton Hears a Who? "We're here!" We're behind your model. So the good news is we're necessary. The bad news is, if you want to be front and center, you're probably not, in terms of what the business sees. But you can see the frustration from a business
person: "Could you just show me all my customers by region? How hard can that be? I want it by this afternoon." They probably don't understand that that customer data is in 17,000 source systems that you need to rationalize and get right before you put it in a warehouse, and then build cubes and build your report. So a data model can help at many stages. As we've beaten to death at this point: what is the definition of customer? Is it prospective customers, or people who have already bought? Where is the data stored? Yes, we can get that information, but it's in 17 sources. How is it structured? It's all structured differently; we need to have common rules. Who uses the data? Who owns the data? If I have a question, who do I go to? Is it private, is this PII? All the stuff around the information. You'd probably build your conceptual, logical, and physical relational models to define that. Then, when you want to get to the data warehouse level, that's the good old dimensional model, your kind of star schema. Whether you're Inmon or Kimball, whichever one you wish, you could still build that in a data model, and that helps with different questions. What are the definitions of these key business terms? Do they match, or are they slightly different? What do I want to report on? How do I optimize the database to start building the reports? The good news is that a data model really can and should be used at all of these layers to help answer that question, "Can you show me all my customers by region?" So it's very, very relevant in that kind of environment.
Data modeling for enterprise architecture: I want to spend a little time on this. This is my rant, and Shannon can hit me with a verbal cue if I go too far, because I tend to rant about certain things. Enterprise architecture often gets a bad rap, sort of like, I guess, even data warehousing does: does it take too long? Well, I think all of us in the industry need to be more agile when we talk
about things like whiteboarding. Don't go back to the business and say, "Yes, we can give you a conceptual model in six months, after we've had stakeholder interviews and design sessions." That's not going to fly, if it ever flew. I think people want to see faster results, and we have to think about being more agile. So building a big enterprise architecture just for the academic sake of it, and taking three years to get there, I think is not in vogue. I think the underpinnings of enterprise architecture very much are. So, like data warehousing, enterprise architecture can often be seen as a little too academic in this rapid "we need to build something in big data quickly" world. But I think there are some key things we need to do that shouldn't be left out at any phase. We might do them in a more agile way, we might do it on a whiteboard, but I think we still need to think about all these process linkages; Nigel mentioned it earlier.
So this is what we've done, and I think the rigor behind this depends on the industry. We've built this soup to nuts, probably using every one of these artifacts at the bottom, for a pharmaceutical company and for a water treatment plant, which, as you'd expect of engineering, were very process driven. The water treatment plant actually had a great success story: there was an issue with some contamination in one of their systems, and because they had detailed process models and data models, they were able to pinpoint the issue and solve it, because they had everything well documented. You very rarely get great data model and process model success stories like that, and unfortunately it's because they had a problem, but they actually got a whole lot more funding to build process and data models and link them together, because they saw the value. Similarly with the pharmaceutical company: they were actually able to optimize their research and development process by taking a look at the processes and the data. So I think going to that level
of detail, if you're a sales organization or maybe a nonprofit or something, you may not want to go as deeply, but you should still do this, and I'll walk through what this is.
So, the business view, we've sort of talked about that. We have some artifacts in our practice, things like a motivation model: what are you trying to do, what are the motivations of the company, what are your business capabilities, what are your business drivers? This is basically what we are trying to do and what my business looks like.
The process view is sort of self-describing, but that's the actual process: how do we build a product, how do we sell to a customer on the website, where they then enter information? This is key, especially when we're talking about data quality. Often the data quality issue is process. Did the sales rep enter the right information? Can we pre-fill that information from external data so they won't make mistakes? Can we use the data model to drive drop-down fields, so that if the state code is a US state, we give them the proper list, and if it's yes/no, we build a domain with those values? I won't start the rant, but I was actually on a data quality webinar where I went to register and it had free-form text fields for everything, even your US state, and I tweeted about that one, because that stuff just shouldn't happen anymore. You're just causing issues, and that can all be solved through a data model.
And then the data view, which I think we've talked about: the conceptual, the logical. The business glossary we didn't talk about specifically, but that's often built from a data model. A lot of the data modeling tools out there now can take your conceptual or logical models, think of your entity and attribute definitions, and just publish them so business people can see them. So you already have this beauty of data models: a lot of the information other processes need is in the model, so just publish it.
And then linking it all together, this mapping of data to process. I've used CRUD
analysis, where the data is created, read, updated, and deleted. It's very valuable, and again, whether you do this in detail or just at the high level: who's using the data, when is it updated, when is it read, how is it used? So seeing data and data models in the context of the bigger picture is very valuable. I spent a lot of time on this, but I'm a big fan, and we've seen success in a lot of industries. So I'm going to pass it over to Nigel for some of the other business areas that can use the data model.
Yeah, thanks, Donna. I'm conscious of time, so I'll work through these quite quickly. One of the areas where data models certainly have a big part to play is if your data strategy contains any cloud ambitions: are you moving data and/or applications to the cloud? There are some cloud vendors out there, and some exponents of cloud, who will say to you that you don't need to worry about data models in the world of cloud, because you just pass all your data to a cloud provider and they'll sort all that out for you. That's a dangerous fallacy. Data models are really important here because, remember, wherever your data is physically stored and wherever your apps are physically run from, it's still your data and you're still responsible for it. So if you put personally sensitive data or secure data into the cloud and it's compromised, for example, ultimately the buck will stop with you and not with the cloud provider. And there are lots of issues as well if you put your data out to cloud providers who might, for example, move that data around, because, as you know, there is the idea of safe harbours, and certain personally identifiable information, for instance in Europe, can only be held within safe harbour companies and countries, I should have said. So data models are really important in terms of identifying which data it is okay to put out to the cloud and which data we should keep and store in house. That's one of the key benefits of a data
model in your cloud strategies.
Similarly for application development, particularly with the advent of agile techniques. The key mantra of agile, of course, is reuse, and if you've got data models and associated metadata, and the sort of business glossaries and definitions that Donna and I have talked about, then these are key tools in helping to ensure that the data that you actually hold is reused and not recreated in the various agile projects that you might be running. The danger is that if you let them run in an anarchic way, then you've got uncontrolled data duplication, data proliferation, and general data anarchy, and of course all these things can undermine any well-intentioned data strategy. And now, with the growing development and adoption of DevOps, which is an approach that stresses the need for close collaboration between application development, testing, and operations, the ability to reuse existing data sources is critical if DevOps is to succeed, because it's trying to establish this culture where building, testing, and releasing software is all done very quickly, very frequently, and very reliably. So having a good picture of your data is absolutely key to making that happen.
And then, moving on to the importance of modeling in master data management. I'm sure most of you on this call are familiar with what MDM is: it's about identifying and creating single data sources where data is, in principle, held once and reused many times by many different applications. Clearly, and I think this one's more self-evident, you can't achieve master data management unless you're clear about what data is relevant and in scope for what you're trying to do, and what you define as master data: what it means, its definitions, and its qualities. So no MDM initiative can succeed without clear definitions of data, what your objects are, and what the attributes are, so data modeling has an important part to play here. And also, the people side of MDM is data governance,
and you need clear business ownership and stewardship of master data in order to help you define what your master data needs to look like.
So the final link, if you like, with other data disciplines is a key one, between data modeling and data governance. Data governance is founded on some pretty key principles, some of which you see on this slide: the idea that data as an asset needs to be managed accordingly, and that it needs to be subject to the same disciplines as finance and HR and other activities within an organization. Data models can help greatly to underpin a successful data governance program, because first of all they help you to define the standards and domains that you apply the rules to. They help you to identify where your data are, so, for example, should we have one data steward responsible for customer data, or should we have several stewards who deal with that data in different countries, if you're a global company? They also help with things Donna mentioned earlier, like the importance of lineage and data updates. So ultimately, data modeling is key to a successful data governance program as well, and I will hand back to Donna.
So, kind of where we are: what a strategy is, why it's valuable, and where a data model can fit. At the business level, it's defining the business strategy and then using that conceptual data model to prioritize and understand the main business concepts. At the physical data model level, it's really understanding your technical environment. And then, well, maybe it's not too academic, but there's everything in the middle, right? This whole data modeling ecosystem, from the metadata management for your data integration and lineage to what we just talked about, things like master data management and data warehousing, really depends on a model to make things valuable. And governance, to me, is probably the most closely linked: when we're thinking of having domains and business rules and data standards, that's what data modeling is all about, and it's been all about that for many, many years,
so not only on the technical side but also on the business side, as Nigel mentioned, with things like stewardship. So that's the beauty of a data model, again: modeling your organization and making it actionable. We talked about that a bit, the top-down and bottom-up and how it fits.
A little bit, quickly, about us: we do this for a living, so here's my sales pitch. If you need help, let us know. But that really is our passion: how you take data and transform your business through data, which is kind of fun. My other shameless plug, and here's just our contact information, I guess is a shameless plug for our respective data management chapters. If you're not involved, DAMA is a great nonprofit organization that's free to join, so there's the DAMA Rocky Mountain Chapter, DAMA UK, or dama.org, which is the global chapter.
And my shameless plug for our metadata course: DataVersity actually has a full course load and is building more, and we just launched one on metadata management. If you're on this webinar, you get to use our discount code, which is 20% off, and you get a set of Ginsu knives if you register now. That sounds too salesy, but seriously, it is a helpful course on metadata management. There are also two others, one on data governance and one on data quality, and you can use that discount code for those as well, so take a look at training.dataversity.net. Kind of fun: I've actually taken both of the other courses, and they're very good.
Just another plug for the rest of the series. I saw a couple of questions coming in on data modeling for big data, which is next month; we'll talk more about that then, so please join the rest of the series if you can. It looks like we've got a few minutes left for questions, and I'm going to actually request questions before we answer the others. I saw some other topics, whether we answer them today or next week: specifically, if there's something you want to hear about big data and data modeling for next month, let us know. And we've got a lot of topics for next year's
lineup, or other suggestions, but if there's something you're just dying to hear about data modeling that you want us to talk about or write about in blogs, just drop that in the notes. So without further ado, I will open it up to questions. I think, Shannon, you are monitoring those.
Yes, and what a great way to kick off this new data modeling series. This was a great presentation from both of you; thank you so much. We've got a lot of great questions coming in, and I will get to those. And of course the most popular question: the most common question is people asking about copies of the slides and the recording. I will send a follow-up email within two business days, so by end of day Monday for this webinar, with links to both of those, along with additional information and some of the information that Donna has provided here in the presentation. So, first question coming in, for either or both of you to answer: how do the logical and conceptual models align with NoSQL and big data database technologies? And again, as you mentioned, we are going to be covering that in a bit more depth next month, but is there a quick answer that you want to provide?
I will, because that's a common question, and actually they fit very nicely, especially at the conceptual model level. That should be completely technology neutral, so you shouldn't be thinking about technology at all at that point. So that would cover everything, right? We are looking to get customer information. And then when we get down to the logical level, that gets a little more detailed. Nigel touched on this as well: you don't manage everything in the same way in the organization. So when you get down to that logical level, that's where you are saying things like, if we are going to build a warehouse and we need to have customer name and address and, I don't know, social security number in the US, we need to get that right. If we are just going to have some kind of understanding, and maybe some trending analysis, on big data, there's very different governance around that, and
that's really at those levels, the conceptual and logical levels, where you can start to define the different governance structures. So conceptual: even if you never built a database, that could still be a helpful effort to talk about your data. And then at the logical level, I think that's where you can capture as much as possible as well. I don't want to get too much into next month, but there are layers, again, and that's part of that logical and conceptual model. Once you have everything out on something like Hadoop, there are certain things you do want modeled more closely. Think of it as a discovery platform: this is everything, and then what is it that we really want to model more closely and put in the warehouse?
I agree with that. I've seen lots of examples where NoSQL databases and the so-called data lakes in Hadoop are rapidly becoming data swamps, because that analysis hasn't been done up front and there are no clear definitions of some of the terms and some of the data held within there. So when people go in and try to do some analysis of what's actually within the data lake, they really don't know what they're finding. The management of these things, and governance as well, is actually critically important.
I'll chime in again, because you've made a good point. I've been working with a lot of big data lakes and data scientists, and when we do these stakeholder interviews that Nigel mentioned (actually, Nigel had some great statistics in another presentation), the number one complaint from these data scientists is "I don't have documentation." It's great that you can munch all the data and do statistical analysis, but when you look at the customer, what do you mean by state? Is it the state of affairs, the mental state, the state they live in? All of that needs to be defined, and I think even more so now that we're doing this with larger volumes of data. It doesn't go away; you still need to do the modeling, or at least the metadata definitions.
All right, well, we have less than a
minute here, but I do want to just get in at least one more question: you mentioned [the ROI of] a data strategy compared to projects that occur without one. Do you want to take that, statistics man? Nigel, do you want to take that one?
Yeah, I mean, to answer your question, personally I don't see that a data model in itself, or a data strategy in itself, has a great ROI. I think where you measure the ROI is this: if you have a data strategy that has at its core, for example, a cost reduction strategy, maybe putting the stuff that you can in the cloud, reducing operational costs, or reducing costs of failure in processes by improving data, then the ROI tends to come from that. Obviously some of the benefits I listed earlier of having a data strategy and having data models are there. You can do things more efficiently: as Donna said earlier, if you have data models, you can design data warehouses much more quickly, and of course the whole thing about reuse that I mentioned earlier has benefits as well. Whether you can get return on investment depends on what those benefits are, because they will vary from organization to organization. But it's like everything else: if you don't have a plan, and you don't have a strategy, and you don't have models of what you're trying to do, then you can't possibly measure anything. At least doing it this way, you'd have a chance of actually finding some real, tangible benefits when you start to improve the data as a result of the strategies that you put in place.
Great, I think that's a good summary we had there. We are right at time, so unfortunately we are out of time, but I'll get these questions over to you, Donna, and, if I may put you on the spot here a little bit, maybe we can get a couple of answers out in the follow-up email.
And then, if folks want, they can also email us directly. If you have any questions, we're happy to follow up.
Yeah, awesome. All right, well, as mentioned, I will send a follow-up email within two business days, so
by end of day Monday, with links to the slides. And again, thank you so much for kicking off this webinar series with us. What a great way to start. We really look forward to the future webinars, and as Donna mentioned, make sure you let us know if there are additional topics you want to see us do next year, and if there's anything targeted you want to have addressed in the data modeling for big data webinar next month. I hope everyone has a great day. Thanks, everybody, and we will see you next month.
Thank you. Thanks, bye.