Hello and welcome. My name is Shannon Kemp and I'm the Executive Editor for DATAVERSITY. We would like to thank you for joining today's DATAVERSITY webinar, Data Warehouse Strategies, the latest installment in a monthly series called Data-Ed Online with Dr. Peter Aiken, brought to you in partnership with Data Blueprint. Now let me give the floor to Britt Hafner, the webinar organizer from Data Blueprint, to introduce today's speaker and webinar topic. Britt, hello and welcome. Hey, thanks Shannon. Hello everyone and welcome. Thank you for finding the time in your busy schedules to join us for today's webinar, Data Warehouse Strategies. As always, a big thank you to Shannon and DATAVERSITY for hosting us. We will get started in just a few minutes, after I let you know about some housekeeping items and introduce you to your presenter. We have a one-hour presentation followed by a 30-minute Q&A. We will try to answer as many questions as time allows, so feel free to submit questions as they come up through the session. To answer the two most commonly asked questions: yes, you will indeed receive an email with links to download today's materials and the webinar recording so you can view it afterwards. These materials will be sent out within the next two days. You can find us on Twitter, Facebook, and LinkedIn. We use the hashtag #DataEd on Twitter, so if you're logged on, feel free to use it in your tweets and submit your questions and comments that way. We will keep an eye on the Twitter feed and include answers to those questions in our post-session email. Now, let me introduce you to our presenter, Peter Aiken. Unfortunately, Steven is unable to join us today, as he's up in the air. Peter Aiken is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide. He has more than 30 years of experience and has received many awards for outstanding contributions to the profession.
Peter is also a founding director of Data Blueprint. He has written dozens of articles and eight books, the most recent of which is Monetizing Data Management. Peter has experience with more than 500 data management practices in 20 countries and is consistently ranked among the top data management experts. Some of the most important and largest organizations in the world have sought out his and Data Blueprint's expertise, and Peter has spent multi-year immersions with groups as diverse as the U.S. Department of Defense, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia, and Walmart. He often appears at conferences and is constantly traveling. Now let me transition over to him with a bit of a loaded question: Peter, where are you today? This is kind of unusual in that we're actually both here at the Data Governance Conference. We've got about 200 people downstairs who are really passionate about data governance, and we've been going full steam for a couple of days at this point. We're getting to the point where we're going to be winding it down in just a little bit. But our topic today is warehousing strategies. What we're going to cover is, first, getting the context correct: what is data warehousing in the context of data management? Then we're going to move into a section where we talk about the motivation for integration strategies in general. Warehousing can be an integration strategy; it can also be a BI strategy. We'll look at both of those, and at how those technologies lead you to this type of solution. And then we'll look at three different warehousing architectures, the three typical ones that are out there. It's important to understand the distinctions between them, because we got this question at the session yesterday. Someone asked, so why do you build one? And the answer is: you're solving a problem. You don't build a data warehouse for the purpose of building a data warehouse. You build it to solve a problem.
And each of these three different architectures gives you a different approach to solving problems. Then we'll look at meta-models, because meta-models are fairly important in the sense that if we understand them, we don't have to start this stuff off from ground zero; we can start from a point that gives us a good running start on moving forward. So let's get down to it. We'll finish up with some guiding principles, of course, and then we'll head to your questions and answers and the discussion, which is the part that we of course like a lot. So, as data managers, we believe that data is the most powerful, underutilized, and poorly managed organizational asset. We like to say it's our sole non-depletable, non-degrading, durable strategic asset. When you talk about it from that perspective, it really does get people's attention, because nothing else is. We treat our durable assets differently than we treat our non-durable assets or expenses, and if we see data as an expense, it becomes a problem when it really should be an asset. You also see a lot of this: if you Google the phrase "data is the new oil," you will see it all over the place. Literally, Britt and I are looking out the window onto the Atlantic Ocean and there's an oil tanker right out there. It's a bad analogy. Please don't use it, and if you see people doing it, tell them to stop. The reason is that if you think of data as oil, it's stuff in the pipes, and we consume stuff in the pipes. The difference between oil and data is that we don't think about oil after it's gone into our cars or machinery, et cetera; we use it up. Data is the one asset that we can't use up. The better we treat it, the better it becomes for us as a performing asset. So a better way to think of it is, really, that data is the new soil.
Soil is something that you can plant things in, and the investment you make in the soil will eventually turn into something you can use a long time from now. We see this on t-shirts all the time: data is the new bacon. Okay, if you want to put it that way, that's fine. But it's not going to help you if you're thinking about data as a consumable, because the mission here collectively for us as data managers is to strengthen our organizational data management capabilities, to provide tailored solutions that are appropriate to the organization, and to build lasting partnerships. Because if we don't have lasting partnerships with the business, there is no ability to gain a seat at the table when important decisions are being made. Now, we move from here to Maslow's hierarchy of needs. Most of you remember this from high school. The idea is, of course, that if our food, clothing, and shelter needs at the bottom of the pyramid are unmet, it's very unlikely that we're going to go home in the evening and self-actualize. For me, that's riding my horse on the weekend and playing a little bit of music, and then I tell people I'm safe to be let back out on the streets for another year. Each of you may have different ideas about what self-actualization is, but from a data perspective, we tend to think of that self-actualization piece as somewhat analogous to the golden triangle that I'm showing now. The idea here is that most organizations really do have a technological focus on the things in the golden triangle, and we've been using these charts for 30 years. At this point, the only things that change in them are the words and the different practices that are out there. But a technology-focused approach is only one part of this problem, or, another way to think about it, it's really just the tip of the iceberg.
If you don't understand that the foundational data management practices need to be put in place in order to properly utilize these advanced data management practices, there is a big disconnect. The foundational practices are all about organizational capabilities. The better, the stronger, the more solid the foundation you have down there, the better the organization will be able to, in fact, make use of its data assets. Something else that most people are unaware of at the foundational level is that these data management practices are only as strong as the weakest link. In the diagram here, I'm showing you that most of these practices are strong links; however, the practice between data platform and architecture is a weak link. It's actually held together with a cable tie. In this organization, were we advising them, we would be saying: yes, the entire architecture practice needs to be strengthened before any investment in governance, quality, management, and operations is going to produce additional results. In other words, if you don't fix the problems in architecture, you can put all the money you want into these other areas; it's not going to help the overall picture, because the foundation of the data management practices is only as strong as the weakest link in the chain. We get questions at Data Blueprint from time to time where people will say, great, Peter, I understand that, but I have to have this stuff done by Friday, so can I skip the foundational data management practices? And the answer is: absolutely you can. However, whatever you're doing will take you longer, cost your organization more, deliver less business value, and present greater risk to the organization. Better if, instead, you learn to crawl, walk, and run your way to the top by putting in place foundational capabilities that can then be used to implement advanced technologies.
So, from a data governance perspective, we like to talk about data as a reusable resource that we are building. It is the focus of everything that occurs between when data is sourced, where we acquire it, through the important parts of the data lifecycle, to where it is used, regardless of what type of use it is. It's unlikely in your organization that your knowledge workers are not using data. Let me say it the opposite way: that's the definition of a knowledge worker, and most organizations that are paying attention to data should have lots of knowledge workers. We need to understand how they use this data, to get insight into this process, and to understand that overall data governance and data management issues are paramount for organizations in order to do this. The foundation layers that I mentioned a few minutes ago, strategy, quality, operations, architecture, and governance, mean that we are now managing data coherently. Typically, data is managed in most organizations as a work-group-level activity; imagine the power of taking all those work groups and pulling them together toward a data strategy. There is now a class of professionals who are managing data professionally, and a profession around governance, so that we can say which data is in fact correct. We can't possibly make all the data perfect, but we can get data good enough to use under most circumstances. And the appropriate architecture and implementation pieces need to be in place, as well as the supporting processes, in order to actually make this work. You may remember a couple of months back, Melanie and I gave a fairly deep talk on how to use these techniques to do this. If you're not familiar with that, it's on the DATAVERSITY website; you can go back and look it up, and I'm sure we'll come back around and do an updated version sometime in the next year.
So what you're seeing here now is the DAMA DMBOK wheel, the body of knowledge, and this body of knowledge has as one of its segments the data warehousing and business intelligence slice. That's what we're going to be talking about, in the context of these other practice areas. And just as a by the way, we're hoping that everybody who works in this area gets the DMBOK working for them and starts becoming certified with the CDMP. Now, this chart that I'm showing is the DMBOK version of the IPO chart, the overview of what it means to be doing data warehousing and business intelligence. The idea here is that the inputs are on the left-hand side, the activities are in the teal box in the middle, and the outputs are on the right-hand side. So this is an IPO diagram: input, process, output. And of course we have to add participants and tools, but what we're really trying to do, if you look at the goals, is to support and enable effective business analysis and decision-making by knowledge workers, and to build the infrastructure to support that, so that people get the idea of looking to this type of technology as a way of supporting decision-making in general, as opposed to one-off support. And the definition here is: planning, implementation, and control processes to provide decision support data and support knowledge workers engaged in reporting, et cetera. Now, this context again assumes that you have an overarching data strategy and that your strategy is now looking to start becoming familiar with things like big data technologies. If you don't have a strategy, of course, it becomes very difficult; any road will take you there. So the key is to focus on combining these foundational practices with the DMBOK wheel and asking how we can use them, pulled together, to support these goals.
Now let me just stop here for a minute and give a little bit of explanation, because one of the things that happens in many organizations is that they look at data warehousing as a goal unto itself. Well, that is certainly an interesting and useful perspective. It's also interesting to take a look at it and say, wow, data warehousing really represents a failure to understand the design of the organization in its larger context. If we had properly done this, the applications we put in place would give us the capability to make the decisions that we need to make. But we didn't see that far ahead when we were building them, and we typically have only gotten a little bit better at that process. So when we think about it, data warehousing really starts from the idea that most of the data occurring in your organizations is what we call transactional data. Transactions represent a purchase, a telephone call, a communications connection, a hiring, anything along these lines, something that can be traced back to a specific event at a point in time. What management needs, though, is much more trend analysis, and for that we can't look at the transactions. It's literally a case of looking at the forest and seeing only trees; we aren't able to, in fact, step back. So the motivation here is really to ask: how do we change the way in which we've been thinking about these things? The way we've described the evolution of systems in the past is largely a process of paving the cow paths. People came up with a payroll system, and the payroll system had payroll data with it; the marketing system had its marketing data; then finance, et cetera, et cetera.
This worked really well as long as we stayed within the silos, but as businesses realized that they really wanted to connect information from system A to system B and integrate it, the first thing they had to recognize was that if these systems were not explicitly designed to work together, the chances that they would happen to work together are absolutely zero. Consequently, organizations now have the challenge of how to get out of this gnarly, knotted legacy environment that they are faced with. And by the way, our definition of a legacy system is anything that's already in production. So the overall approach that organizations would love to take looks something like this diagram here, which is that we're going to start to put data in one place. The idea is that a single source of truth should be a way in which we can go about this, and that we'd like to start collecting this data, little bits and pieces over time, and eventually re-architect our systems so that we end up with a system that is much easier to use, so that the marketing data can be easily integrated with payroll, finance, manufacturing, whatever the key pieces are that you're trying to put in place. Now, I say this because this is really where we recognize that with hindsight, and hindsight is of course always perfect, this is how you would have designed your systems in the first place. Unfortunately, they are very rarely designed in this fashion, and so the warehouse, and warehousing technology, has become the alternative place to move the data. You're taking large amounts of data from payroll, marketing, finance, whatever your functional pieces are, and putting them into a place where they can be architected in a way that the organization can now start to make use of.
Similarly, many organizations buy software packages instead of building application systems, and this data architecture again also represents the glue, the consistency that we need to have, so that we can build the warehousing on top of it in order to make it work properly. So let's get to a couple of definitions from our body of knowledge. Warehousing is a technology solution supporting business capabilities such as query, analysis, and reporting, and the development of those capabilities. The key is that this gives us the ability to analyze information that had not previously been integrated; as I showed in that other slide, if we didn't design systems to be integrated, the chances that they will happen to be integratable out of the box are zero. So this represents a fairly big investment for organizations, and often a new set of organizational capabilities that the organization itself has to sit down and learn. Now, the concept of business intelligence was actually created by a paper that was written the year before I was born, back in 1958. Business intelligence isn't necessarily a technology capability; it was the idea that if we understood more holistically how the organization worked, we would be able to support better decision-making. Since then we've been making technologies, applications, and practices that allow us to collect and present this information to decision-makers, so that we can better make use of the things that have happened in the past to try to improve future performance. Again, we say: understanding historical patterns. Many people have simply called this the use of mathematics in business. I think there's a little more detail to it than that, but that is the way it's taught in many curricula. So when we look at analytics, we might say, okay, analytics is credit risk, it's fraud, it's a lot of different things.
But what really happens is that we need to take a step back. The challenge for all of us collectively is that the vendors are very good at providing technologies to actually build these systems, the data warehousing technology. So most organizations ask, what kind of data warehouse should I build, and they think of Oracle or Teradata, or they think of one of the numerous big data technology solutions that we have. But we really need to reframe the question and ask: how can warehouse-based integration strategies address the business challenges that we have? Now, let me give you an example that we're working on at Data Blueprint right now. We're doing an awful lot of work in the area of what's called palliative care. Palliative care concerns end-of-life decisions. In most cases, when you have a rational conversation with somebody about what's happening to them at the end of their life, the conversation kind of goes like this: okay, you have a disease; you're going to die. I'm sorry to be a downer here, but that's sort of one of the things that happens. And you're probably going to spend the last 10 days of your life using up all of your family's savings to extend your life by about 10 days, and it'll cost you a million dollars. So if you had a choice, rather than making heroic efforts to keep you alive, you may want to think about, let's call it, going gracefully. In going gracefully, you will not only save your family money, you will also be more comfortable as a result. Again, I'm certainly not going to present myself as an expert on palliative care, but we are doing a lot of work with palliative care analytics here. And what they're trying to figure out, first of all, is what happened.
In other words, we look at the structured data and we can see profiles, by charge narratives and things like this, that tell us that the vast majority of medical expenditures happen in the last 10 days of somebody's life. Now, I don't have kids, so I can't say I'm going to leave anything to my kids, but certainly some families with children might want to think: I could spend a million dollars on the last 10 days of my life, or I could leave that million dollars to the kids so they could actually afford college. That's a decision some families ought to consider. Then we get into palliative care management. We talk about the health delivery organizations and the insurance organizations; they want to ask, what's going to happen as people start to do this? We're now finding less of what we call structured, or tabular, data, and we're moving more into non-tabular data. We're looking at risk profiles. We're looking at pros and cons, and giving people all kinds of ideas about how they can start to think about these end-of-life decisions without making it a threatening type of decision, so that when somebody does come to an end-of-life decision and has a good discussion about it, they can ask, what should I do, which gets us into prescriptive analytics, again focusing more on the unstructured types of data. Now, the reason this is becoming important is that it's very likely California will pass a law this year mandating that those discussions take place and that those medical directives be placed in a central repository. This is actually a good move for the overall country. It's a tough decision; it's not a conversation some people like to have, but it is something people should think about. So I've given you a downer here with palliative care, but what I'm really trying to do is illustrate the different types of analytics and how we use them in this context.
Descriptive, predictive, and prescriptive analytics, in this case. Some of you, of course, have also heard of a company called Target, a very fine company, and one of the things Target likes to do is integrate information in order to get a very good idea of what its customers are doing. In this way, Target can present more goods and services to the customers. It's, again, the same type of thing: the original systems they built all over the company have bits of information, but no way of looking at it holistically. With this type of approach, they can now look at the customer more holistically and figure out how they can be a better partner to the customer. The basics of this are done by putting the data in a warehouse, and I'm showing a relational data warehouse, a very typical approach. Users, to use the terminology, can drill; they're sort of mining for data. The idea is that we can make cubes of data, conceptual cubes, and these cubes can be accessed with some general tools, so that business people as well as IT people can go in and take this transactional data and look at how it changes across time, and, more importantly, not just across time, but in relation to other things that happen. For example, we are down here in Florida. The weather is not good, so I'm going to bet that the rental car agencies aren't renting as many convertibles, so when they look at a weather forecast, they could say, hey, maybe we should discount convertibles during times when the weather is gray, because otherwise they will sit on the lot and not make any money for us at all. Again, different types of questions can come about on the cube: we can ask, are revenues the same for all facilities? We can do this whether we are a car dealership, whether we're Target, or whether we're looking at palliative-care types of decisions.
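To make the cube idea concrete, here is a minimal sketch in Python with pandas. The facilities, months, and revenue figures are invented for illustration; the point is how raw transactions get pivoted into a facility-by-month cube and then sliced.

```python
import pandas as pd

# Toy transactional data -- in a real warehouse these rows would come
# out of the fact table after ETL. All names and figures are made up.
sales = pd.DataFrame({
    "facility": ["North", "North", "South", "South", "North", "South"],
    "month":    ["Jan",   "Feb",   "Jan",   "Feb",   "Feb",   "Feb"],
    "revenue":  [100,     120,      90,      95,      30,      25],
})

# "Cube" the transactions: one dimension per axis, revenue as the measure.
cube = sales.pivot_table(index="facility", columns="month",
                         values="revenue", aggfunc="sum")

# Slice: are revenues the same for all facilities in February?
feb = cube["Feb"]
print(cube)
print(feb)
```

The same pivot generalizes to any dimensions a business user picks, which is exactly why they don't need to state their questions in advance.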
We can ask, for the revenues on these diseases, what types of costs are associated with them, and how are we doing, on the cube. The cube, even though we talk about a cube as being three dimensions, really has n dimensions, any number of dimensions. And because the users can slice and dice it, they don't have to tell us in advance what questions they're trying to answer; instead, we can give them the ability to explore. It's the idea that we're teaching them to fish rather than handing them a really nice fish dinner. Let's give a specific example here. We could be looking for customers that fall within the parameters of income less than $100,000, or customers that are younger than 30 years old. So we grab that particular set; we may use an online analytical processing filter to filter out all the rest of the customers from the warehouse. We then take the list of customers who have less than $100,000 in income or are younger than 30 years old, and we want only those who live in New York. So we've moved from 30,000 customers down to about 6,000 customers. Then we use a relational online analytical processing filter to look at the ones that have purchased something within the last 30 days. So we've now gone from 6,000 customers down to about 800 customers. And then, in our final piece, where we can drill across, we take those customers and we look at the suppliers who provided the goods and services that these customers, with incomes of less than $100,000 or younger than 30 years old, living in New York, purchased within the last 30 days. We're now down to a list of just 40 suppliers. So again, you can see the transactional data would not have permitted you to come up with this kind of analysis; only this combination, driven by a new business problem, gives you the ability to do it.
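The successive filters just described can be sketched as ordinary dataframe operations. The schema, names, and counts below are invented stand-ins for the 30,000-to-40 drill-down in the talk; the structure of the steps is what matters.

```python
import pandas as pd

# Illustrative customer and purchase tables; column names and values
# are assumptions, not the warehouse schema from the presentation.
customers = pd.DataFrame({
    "id":     [1, 2, 3, 4, 5],
    "income": [50_000, 150_000, 80_000, 200_000, 40_000],
    "age":    [25, 45, 52, 28, 33],
    "city":   ["New York", "New York", "Boston", "New York", "Chicago"],
})
purchases = pd.DataFrame({
    "customer_id": [1, 2, 4, 4, 5],
    "supplier":    ["Acme", "Globex", "Acme", "Initech", "Globex"],
    "days_ago":    [5, 40, 12, 3, 60],
})

# Step 1: income < $100,000 OR younger than 30.
step1 = customers[(customers["income"] < 100_000) | (customers["age"] < 30)]
# Step 2: of those, only the ones living in New York.
step2 = step1[step1["city"] == "New York"]
# Step 3: drill across to purchases in the last 30 days,
# then up to the distinct suppliers who served those customers.
recent = purchases[purchases["days_ago"] <= 30]
joined = recent.merge(step2, left_on="customer_id", right_on="id")
suppliers = sorted(joined["supplier"].unique())
print(suppliers)
```

Each step narrows the set exactly the way the OLAP filters do, ending at the supplier list that no single transaction system could have produced.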
Another example, I think the final example for this section, is the idea that banks give loans. Typically, if a bank is going to give a loan, it has a choice. It can make a loan to a person who has a lot of assets, and that person is not going to pay a high interest rate, because with those assets the loan is low-risk; banks consider it a good loan, and they will be competing for that business, for people who will pay their loans back. On the other hand, we could give loans to people who are less likely to pay us back, where we can charge higher interest rates, but where the risk of default becomes much higher. So we can take the bank's customers, and again, you're not going to look at this in a transaction system, but we can put them in the cube and look at them by social status, geographic location, net worth, whatever it is we're looking at, and keep an appropriate balance in our overall portfolio, which means, again, we can lend to some customers who are wealthy and some who are poor, and balance the portfolio overall. So these are questions that can't be addressed by the systems we built to handle the transactions of the business of the enterprise. Instead, what we want to do is look at them at a higher level of integration. Now, the technologies that I've listed on this chart are really just a basic set of tools that you're going to be using. ETL stands for Extract, Transform, Load, which is our shorthand way of saying: this stuff wasn't designed to work together in the first place, so we have to make these changes so that we can actually make it work together.
In addition to ETL, though, we also need to have some configuration management and change management tools, and in the process of building the system it helps to do some modeling to answer specific questions, and some profiling of the data, so we in fact know whether it is what it says it is according to the metadata. We may look at some integration-type technologies; ETL handles much of that in some cases. We may want to incorporate reference data management applications, master data management applications, process modeling, business metadata, et cetera. Each of these is going to be driven by the specific integration goals that you have, so it's important that you take a strategic approach to this, particularly if it's your first time through. What we tend to find is that organizations build these systems and then they evolve over time, so we oftentimes need to rebuild them. Literally, one company that we work with has over 2,000 data warehouses, and they are in the process of rethinking their data warehousing strategy because their business strategy has evolved over time. Now, look at a couple of different definitions here. Our friend and colleague Bill Inmon says that a warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of summary and detailed historical data used to support the strategic decision-making processes of the organization. His colleague Ralph Kimball says it's a copy of transaction data that's specifically structured for query and analysis. Again, you see here we're getting at the idea that the original transaction data isn't in the shape we need to analyze it, to do the descriptive, predictive, and prescriptive analytics that we do. The focus then is going to be on subject areas, transformations, and transactions, making sure that we're looking at data that isn't volatile, but we do have to have that transformation of the data.
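The Extract, Transform, Load step that keeps coming up can be sketched very minimally. This is an assumed toy example, not any vendor's ETL tool: two feeds that were never designed to work together (one in cents keyed by name, one in dollars keyed by email) are normalized into a single queryable table.

```python
import csv
import io
import sqlite3

# Two "source systems" that were never designed to work together.
# The feeds, column names, and values are all made up for illustration.
payroll_csv = io.StringIO("emp_name,salary_cents\nJane Doe,9000000\n")
marketing_csv = io.StringIO("email,campaign_spend\njane@example.com,1200.50\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse (source TEXT, person TEXT, amount_usd REAL)")

# Extract + Transform: normalize each feed to one common schema
# (cents become dollars; each row remembers which system it came from).
for row in csv.DictReader(payroll_csv):
    conn.execute("INSERT INTO warehouse VALUES ('payroll', ?, ?)",
                 (row["emp_name"], int(row["salary_cents"]) / 100))
for row in csv.DictReader(marketing_csv):
    conn.execute("INSERT INTO warehouse VALUES ('marketing', ?, ?)",
                 (row["email"], float(row["campaign_spend"])))

# Load is done: previously disparate data is now queryable in one place.
rows = conn.execute(
    "SELECT source, amount_usd FROM warehouse ORDER BY source").fetchall()
print(rows)
```

Real ETL adds scheduling, error handling, and metadata, but the shape is the same: reconcile schemas on the way in so the analysis doesn't have to.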
I'm showing that on this chart here, which we borrowed from the Infosys Group. It shows the systems for determining eligibility, claims, medical records, all sorts of different things up there, and an analysis layer where we transform the data from its original shape into something that gives us dashboards, portals, and workflows that can be very helpful to the organization overall. Here's another one, from Oracle, where the data sources come in from the left-hand side, operational systems, including perhaps some flat files. The warehouse has the summary data as well as the raw data inside it, along with an important component, the metadata. So the different users can analyze the data, report on it, and mine it from a number of different perspectives. One of my favorite people in the world is Claudia Imhoff. We were both at a conference a couple of weeks ago, and she has an awful lot of work that she's done here on the Corporate Information Factory. We've actually included a little reference section at the end of the presentation for you if you want to dive into this a little more. It's a very good analogy that allows us to think of this as a production function. Now, again, the difference here, and I want to stress this, is not that data becomes something you consume, but that the raw, detailed transaction data enters through the left-hand side of the diagram and becomes reference data that is then integrated and transformed into a series of data warehouses and data marts. Now, this is the first time we're introducing data marts here: the idea is that a data mart is a subsection of the data warehouse that's been optimized for a specific group.
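One common way to realize a data mart is as a database view over the warehouse's physical tables. This is a minimal sketch with SQLite from Python; the patient/intake schema is a hypothetical hospital example, not taken from the presentation's slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Physical base tables, standing in for the warehouse itself.
# Schema and rows are invented for illustration.
conn.executescript("""
CREATE TABLE patient (id INTEGER, name TEXT, birth_year INTEGER);
CREATE TABLE intake  (patient_id INTEGER, ward TEXT);
INSERT INTO patient VALUES (1, 'Ada', 1980), (2, 'Ben', 1995);
INSERT INTO intake  VALUES (1, 'Oncology'), (2, 'Cardiology');

-- The "mart": a view that pre-joins two previously disparate chunks
-- of data for one analysis group. Nothing is copied; the join is
-- computed on demand.
CREATE VIEW patient_demographics AS
SELECT p.name, p.birth_year, i.ward
FROM patient p JOIN intake i ON i.patient_id = p.id;
""")

rows = conn.execute(
    "SELECT name, ward FROM patient_demographics ORDER BY name").fetchall()
print(rows)
```

In practice a mart is often materialized (copied and indexed) for performance rather than left as a view, but the design idea, one pre-shaped subset per analysis group, is the same.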
For example, again, in the hospital context, we may have intake data and patient data, but we may want to analyze patient demographics, so we would combine those two previously disparate chunks of data and move them into something that is more easily able to be analyzed, which is the data mart. So a data mart is built specifically for the analysis that needs to happen. Again, one way of thinking about it: the proper way to integrate this is going to depend on the actual organizational strategic goals that you have, which is why we're emphasizing this as a strategies talk. Another interesting piece, and I think this company has actually morphed into something else at this point, but the idea is still very powerful, is a virtual data warehouse. So when you look at this, the orange pieces of this diagram are the only physical tables, I'm sorry, it's the other way around: the teal pieces are the only physical tables, and the orange pieces are actually virtual tables. So if we start on the right-hand side of this diagram, we've taken two tables, and I don't expect you to be able to read them in there, contact and customer. We've combined customer and contact into a table that can then be combined with another set of information. The next teal piece up allows us to go in and look at specific chart information that once again is combined into another orange piece, which has been combined with three or four other tables to come up with our final analysis. The idea is that virtually we can prototype the solutions, we can develop solutions more or less on the fly, but again, remember our definitions of warehousing: these are things that were previously not able to be combined. So again, virtual warehousing is another strategy that you may want to consider. This chart tends to scare people a lot of the time.
It's actually not very scary; it's a very powerful concept called linked data, and the idea behind linked data is a much stronger use of standards. So if you're storing data, and these are just the names of some of the data collections that are out there under linked data, and if you want to look at it properly, go to linkeddata.org, they maintain a much more comprehensive set of data that's out there that is linkable. Now what they're doing here is they're keeping this data in a very standard format, so the actual linking of the data becomes much easier. Again, you can imagine, go back a couple of slides and look at the different warehouses with disparate systems where the data was not linked and consequently very difficult to integrate. Here, the architecture is very, very different. They've made data accessible in this linked data format, as what are called triples. And then again, we won't get into it here, but the idea is that this data is much more easily linkable, and virtually anybody with a tiny bit of programming and understanding of the architecture of the system can go in and collect data from Magnatune with Flickr and again Pisa. I'm just picking three bubbles on the chart. You could pick any of these bubbles, because all of the bubbles here represent information that has been stored in a very standard fashion. Now, those are great strategies to approach. However, there's another challenge with warehousing. And this is a warehouse that we looked at for a state in the Midwest of the country, where they had a healthcare provider data warehouse with 1.8 million members in it. And when we did our analysis, we found that they had 1.4 million providers. Wow, sounds like a great organization. That's almost a physician for every member that's in the warehouse. Of course, you might suspect something was wrong here.
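The triple idea can be sketched without any RDF tooling at all. The dataset below is a toy, not real linkeddata.org content, but it shows why shared, standard identifiers make linking nearly free:

```python
# Linked data boils down to triples: (subject, predicate, object).
# The data below is invented for illustration only.
triples = {
    ("Magnatune", "publishes", "album:4021"),
    ("album:4021", "createdBy", "artist:77"),
    ("Flickr", "hasPhotoOf", "artist:77"),
}

def match(s=None, p=None, o=None):
    """Return every triple matching the given pattern (None = wildcard)."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Because both sources use the same identifier for the artist,
# linking them takes one pattern query, not a custom integration project:
print(match(o="artist:77"))
```

Contrast that with the disparate-warehouse picture from a few slides back, where the same artist might be keyed three different ways in three systems.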
And sure enough, of the 1.4 million providers, 800,000 of them had no key, which meant that while we had checked the data in, we could not get it back out of the warehouse. It was absolutely impossible. In fact, 29% of the provider Social Security numbers didn't even have nine digits, which is an easy data quality check that should have been caught by the ETL function on the way in. But again, we can see here that data quality was not important to these guys, given the situation. In fact, only 2.2% actually had the required provider number. But most important of all, the entire data warehouse had one user. And at a build cost of $30 million, this gets to be absolutely crazy. An executive that was asking us to evaluate this made the comment that a handful of MBAs could have accomplished this analysis faster. You never want to be caught in that situation, because that gives people the ability to say, my goodness, we spent $30 million to give one user this information, it clearly has some data quality challenges associated with it, and really, what strategic purpose were we, in fact, serving by building such a technology-heavy solution? Another common perception with all of this is that warehousing becomes that scene at the end of Raiders of the Lost Ark. It's a very iconic scene: some government bureaucrat taking the Ark of the Covenant and putting it in a government warehouse somewhere. And of course, the idea is this is where data goes to die. Again, it's not the typical thing, but it is a common misperception. And without a strategy, we have very much work to do in order to make sure this doesn't happen to the various projects that we have. So, just some common causes of warehousing failure: the projects are over budget and behind schedule. Lots and lots of functionality is not built into the system, or it's planned for a future release.
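The nine-digit Social Security number check described above is exactly the kind of gate an ETL step can apply before anything gets checked in. This is a sketch with invented field names, not the state's actual pipeline:

```python
import re

# An ETL quality gate: reject (or quarantine) provider rows before
# they reach the warehouse. Field names are hypothetical.
SSN_RE = re.compile(r"^\d{9}$")

def load_providers(rows):
    loaded, rejected = [], []
    for row in rows:
        ssn = (row.get("ssn") or "").replace("-", "")
        if SSN_RE.fullmatch(ssn) and row.get("provider_key"):
            loaded.append(row)
        else:
            rejected.append(row)  # never silently checked in
    return loaded, rejected

rows = [
    {"provider_key": "P1", "ssn": "123-45-6789"},
    {"provider_key": "P2", "ssn": "12345"},      # too short
    {"provider_key": None, "ssn": "987654321"},  # no key: unretrievable later
]
loaded, rejected = load_providers(rows)
print(len(loaded), len(rejected))  # 1 2
```

A gate like this on the way in would have caught both failure modes in the story: the malformed SSNs and the 800,000 keyless rows.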
The users are unhappy. The performance doesn't give us the answers in time, or the availability is not good. It is inflexible, so that when we get a new or different business requirement, we don't have the ability to change it. The data that comes out of it is of poor quality, or it is much too complicated for the users. We have, for example, too many values of gender code. By the way, male and female is good, but you should know that if you're working with the FBI, for example, you need to maintain nine gender codes in order to be compliant with what they're doing. And if you're going to comply with Facebook, I think Facebook has upwards of 65 different gender identities that people can adopt. These lead to incorrectly structured data, and you provide a correct answer, but not to the right question. Again, the design is complex. So what we really want you to do is to go back and think about the question in a slightly different fashion. Not from how will we build this data warehouse, or, worse still, what data should go into the data warehouse, but rather how can these warehousing capabilities solve this particular business challenge? Because as you can see, we have an awful lot of money and resources being wasted doing this. And better still, rather than a warehouse solving one specific business challenge, we'd like to ask how warehousing capabilities can solve this class of business challenges. So there's a whole set of additional questions that we'd like to get into, which is: are you ready for one? There are some foundational practices that you need to have. Will you get it right the first time? Maybe you should build it in a small series of stages so that you get experience with it, instead of trying to immediately jump in. The analogy here that I like to use is that you would never hand a 16-year-old inexperienced driver the keys to your Tesla.
Those of you that are Tesla owners out there, I'm not, but I've had the pleasure of driving in them. This is a phenomenal car, but it is not one that you want to hand to a 16-year-old. Is your warehouse intended to be the enterprise auditable system of record? That's a very different requirement than we're-going-to-start-exploring-this-technology-and-seeing-what-we-can-do. And finally, how fast do we need the results? Are we going to try to look at these results and come up with something that's actually useful? What timeframe are we looking at? So I mentioned before there were three different foci in this area. The idea here is that each of these should give you a different approach to doing this. And if you do nothing else, if you take nothing else away from this, just remember that at the very beginning of your warehousing exercise, this is one of the more fundamental decisions to make. Which one are we going to go after? Because it will let you solve more of this class of business problems than you would have in the past. The first one we call the Inmon approach. And most people think of this also as third normal form, which is the idea that we're going to get some data sources on the left-hand side here, and we're going to do some sort of transformation of the data. That's what you're seeing in the staging area. The warehouse will then be built with the idea that it will be a source of record, but the users don't interact directly with the warehouse. The warehouse is a very big and complex piece of engineering. But out of the warehouse, we can derive these smaller components called data marts that are more manageable and focused in on some very specific areas: purchasing, sales, and inventory, shown in this particular example. And the different users can all be accessing them. The analysis function can access the purchasing, sales, and inventory cubes, warehouses, excuse me, marts, that are in there.
And the reporting function can, and the mining function can also. So the idea is that we're taking this third normal form, and each attribute in the warehouse is a fact about a key. It's a highly normalized structure. And it gives us the ability to serve a lot of different use cases, ranging from what happened in a specific transaction, in terms of a historical record, to what types of things can happen as we move into different forms of analysis. So again, all of these are very well implemented models, and there's a lot of good literature out there that you can look at. Pros and cons: the Inmon approach is very easily understood by business and end users. It has the least data redundancy out there; there's not too much. We enforce things like referential integrity, so that the users are able to chomp on the data and have a good, solid foundational basis for making the decisions that they have. And it relies on key index attributes that give us very flexible queries. However, if we want to do complicated joins, those joins can be expensive in terms of time and processing resources. And it's very difficult for it to scale beyond its original vision. Now we move on to the Kimball approach, number two, which you can see is somewhat similar, in the sense that we have the data sources and the staging. But instead of going directly to a big data warehouse, we go to smaller data marts directly. And the idea here is that we have these marts that are produced with the same functionality, but they are somewhat connected by a warehouse bus that you see in that third column that's there. And the idea is that we're building things in terms of a star schema. We're going more directly to a reporting type of interface. We tend to call these fact tables.
So we're looking, for example, at sales here on this picture, and we have a date dimension, a store dimension, and a product dimension, which are the three things that are set up to implement a sale. It's very difficult to describe a sale without having good information about what store it came from in this organization, what product it was, and some sort of date on which it was reported. Notice the dimensions are linked by the ID of the date, but that the date can then be expanded to an actual date, a day of the week. So we can look at sales on Mondays, Tuesdays, or Wednesdays. We can look at sales by the first month, the second month, or the name of the month, January. We can look at it by quarter, the first quarter, or by year. So these are predefined dimensions that make it very easy to implement this. This gives us the ability, again, to do this online processing in a way that allows us to answer some very standard business intelligence questions without the overhead of the warehouse system that I showed you first. The design is relatively simple. Once you understand how to develop star schemas, it can be relatively easy to implement them, and it produces very quick results. The querying in this star schema is optimized for this specific type of reporting, so we have a very, very good ability to come along and get results out of it very, very quickly. And most of the major database management systems have, in fact, query optimizers tuned for star schema designs. So they understand this particular model, and you can predict, likely, what your processing power needs are going to be and how quickly people are going to be able to come up with the results. A couple of negatives here, if you will: those questions must be built into the design.
So if somebody comes up with a question that wasn't part of the original design, you do have to create another mart and another schema, or add dimensions onto an existing one, in order to answer it. And most data marts often center on just one fact table, which means we're really going to be focused in on a specific aspect of this. Now, the newest of the three of these is the Data Vault implementation. And if you will, the Data Vault gives you that same thing we saw on the left-hand side, where we grab the information from the sales, finance, and contracts systems and go to a staging area. But you'll notice there's another component here, which is that we can take the Enterprise Data Warehouse and instantiate it, make it fixed for a certain amount of time, and keep in place a series of different predefined rules. Some of the uses of Data Vaults here are the idea that you're going to be able to contain some things that are not quite as homogeneous as you'd like. So for example, if you want to track something conceptually, like sales, and you happen to be in a country that has changed currency types a number of times, it would be very difficult to compare the sales as we've changed currencies, unless we have a way of normalizing that process. And so the Data Vault allows you to store a series of more or less complex business rules at the same time as the data, which gives you the ability then to come along and see how this information can be compared apples to apples, even though your original inputs may be apples and oranges and other things.
Again, the key here is that this gives you the ability to have permanent, long-term historical storage, but with ease of retrieval, because it retains the data lineage, so we can come back and say, at any one point in time, were those sales results reported in euros or pesos or some other currency that we may have had in there. So I put out there that this is a hybrid approach of Inmon and Kimball, but I think the better way to think about it, instead of as a hybrid, because it's really not a best-of-both, is that it's really for when your basic rules have changed. So think of it as a series of hubs that have business keys, plus links and satellites, and each of these is going to give you the ability to store a certain type of data that was processed with a certain type of business rules. So you're not storing just the data, but the data and the processing rules, which gives you the ability to, in fact, get these things back and forth in a way that you haven't been able to before; you would have had to go back and have different types of queries, different types of warehousing capabilities, all along the way. From a pros-and-cons approach, it's a relatively simple amount of integration; you can combine really immense amounts of information in here, but you capture it with the lineage in a way that will give you very, very good retrieval performance. So when your environment is absolutely homogeneous, you've got no problem there. But if your environment is relatively changeable over time, and you can segment it into a series of these, this is a really good approach that you should investigate. The complication, if you will, is pushed to the back end of the system, so that whoever is doing the processing has to make sure that the processing and the rule sets are stored in there as well.
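The hub, link, and satellite idea can be sketched very compactly. The structures below are illustrative only, not the exact Data Vault standard, but they show how the same sale reported in two currencies survives with its lineage intact:

```python
import hashlib
from datetime import datetime, timezone

# A sketch of the hub / satellite split: the hub holds just the business
# key; satellites hold descriptive attributes plus load time and source,
# so lineage (e.g. which currency a figure was reported in) is preserved.
# Structure and field names here are invented for illustration.
def hash_key(business_key: str) -> str:
    return hashlib.md5(business_key.encode()).hexdigest()

hub_sale = {}   # hash key -> business key (one row per real-world sale)
sat_sale = []   # append-only history rows, never overwritten

def load_sale(business_key, attributes, source):
    hk = hash_key(business_key)
    hub_sale.setdefault(hk, business_key)
    sat_sale.append({"hub_key": hk, "source": source,
                     "load_ts": datetime.now(timezone.utc), **attributes})

# Same sale reported twice, in two currencies -- both versions survive,
# each tagged with where and when it was loaded.
load_sale("SALE-001", {"amount": 120_000, "currency": "ITL"}, "legacy")
load_sale("SALE-001", {"amount": 62.0, "currency": "EUR"}, "erp")
print(len(hub_sale), len(sat_sale))  # 1 2
```

The currency-normalization rules would then live alongside these rows, so any downstream comparison can be made apples to apples on retrieval.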
Not as many people are familiar with it as are familiar with Inmon or Kimball implementations, and we don't have terrific support from the ETL community just yet, so if those are big challenges for you, it may not be something that you want to commit to, but it is something that you should at least explore. So when we're comparing these three strategies, what we're looking at is, first, scalability: third normal form, dimensional, and Data Vault all tend to do fairly well. But on flexibility, the Data Vault outperforms them, from a structural perspective and in the ability to be re-engineered. It also excels in terms of its auditability, a very, very good ability to do that. Businesses, however, tend not to understand vaults the way they understand dimensional or third normal form modeling. The presentation layer of the dimensional approach is very, very good, and again, very comprehensible to the users. Performance-wise, the dimensional and the Data Vault approaches are going to outperform the third normal form on a typical design, but there is much more support for third normal form and dimensional versus the others. So this is a little bit of a challenge, and what we're asking you all to do, as the data management professionals that you are, is to think about these capabilities as you're thinking about the design. Now, each of these communities of practice has a tremendous amount of reference material that you can look into, so you don't have to wade through a billion dense books in order to come up with this, and hopefully this will give you a good start. If nothing else, you may be saying to yourself, well, clearly the organization has massive auditing requirements, in which case the Data Vault is the best approach given this high-level piece.
Again, one of your business requirements may dictate that the other two are not suitable, or you may say, in my part of the country, I have nobody that knows anything about Kimball or Data Vaulting, so in that case, going with an Inmon-based approach may be the best one. These are strategies that you need to take into account as you're designing these. Now, we're getting back towards the top of the hour here. Again, just to make sure: what we want you to do is not start off from the very beginning, because the key to this is that so much work has already been done in this area. Starting from absolutely zero, or worse still, hiring a consultant that says we've got to start off at ground zero, you just don't even want to have any conversations with them. Here are four terrific books. David Marco and Mike Jennings put together a terrific book here called Universal Meta Data Models, and Len Silverston put together a book called The Data Model Resource Book, which has universal patterns; this is a three-volume set. David Hay put together a book early on, probably the first one that was out there, about the data model patterns that are there. And I have a book out called XML in Data Management, which also talks about these predefined models. Any of these books is a good place to start, although from a very practical perspective, both the David Marco and Mike Jennings book and the Len Silverston and Paul Agnew book come with discs that have Erwin models. So when you buy the book for about 40 bucks on Amazon, you get a copy of an Erwin disc that has models already in place. These models are tremendous places to start from. And again, David Hay's insight into this many years ago, when he wrote the original data model patterns book, was that, you know what? Accounting is a lot like accounting is a lot like accounting. And payroll is a lot like payroll is a lot like payroll. Let's not reinvent the wheel each time.
And what he did is he went to the trouble of deriving a normative model. It is not the model that you're going to put in production, but even if it only partly works for you, you're still way, way down the road. Each of the other books has built on these ideas, that we don't want to start off at absolute ground zero. For example, the model from my book addresses the situation of mapping some legacy systems into a third normal form model. This is a great metadata model, where we're looking at how we model screens, interfaces, screen elements, data elements, inputs, outputs, model views, locations. Again, everything is in here, so you don't have to go back and think about this. Where do I go next? Another group that has done a terrific job on this is what we call the Common Warehouse Metamodel. This is put out by the OMG, and it's just terrific work. I'm trying to get the slide out there. There we go. We can go in and look at the warehouses, the processes. And the OMG has done a great job of keeping this up to date, so that we now have the ability to look at a later version of this same model and say, look, if we're going to build this, a lot of work has been done in the area. Read this book. Read this pamphlet. Read these specifications. And tell me how yours is the same as and different from it, because it's a whole lot easier to edit than it is to create from an absolute blank. I'll just put the screen on blank there for a quick second to drive the point home. You'd much rather start off from a place where somebody has already done that sort of work. And again, finally, we do get back into David Marco's book, and just a slide from one of his pieces, David Marco and Mike Jennings, where you can see the actual key. So here's a candidate application here where you can start from.
And again, it's not going to be the answer to your problem, but it is such an absolutely ridiculously low-priced investment that you are absolutely silly if you don't start off by looking at this. So use these metamodels to jumpstart your warehousing ideas, and see how you're the same as or different from how everybody else has gone before in this area. Let me finish up here with a couple of guiding principles and practices. Again, the idea is it helps if you have executive commitment. I know that's not the same for all of you, but some people get sold a bill of goods. We've got to make management understand that if this is new to you, you need to grow into it, and if you're inexperienced at it, you're probably paying a lot more than you need to. And you should really look at this from a value proposition. You've got to be able to access the business SMEs; it is a business-driven application. If you fill a really well-performing data warehouse with absolutely bad quality data, the warehouse will get blamed, not the data. The idea is to build a little bit, then deliver some value, then build a little bit more and deliver some value. And if you can move to a self-service type of model, that will be great. However, realize that one size doesn't fit all. What works for a very large company may not work for a small company, or what works for this industry will not work for that industry. But the better you're able to understand what your overarching organizational goals are, the better you'll be able to architect around them. Look at the other things that are happening in your organization. Where I see these things go the most wrong is when they set up a separate data warehousing group that is not integrated with the existing things that are being done in the organization. And mine, then summarize and optimize, last, not first, because that's really where you're going to find out how the performance of the thing works.
Now let me just finish this one section here, we're just at the top of the hour, with a little proverbial story that we see happen, and it really goes to that last piece that I was talking about. Most people separate out their data warehousing groups, and that is, in my mind, just a very unproductive idea. Most people think of their transaction processing as happening in this sort of black box that we've got there. Again, there are the inputs, the black box, and the outputs; those are our transactions that are going on down there. And we take that information and we duplicate much of the data through ETL into the staging areas that I showed you before, and then you end up with this warehouse data, regardless of whether you pick Data Vault, or Inmon, or Kimball, or anybody else. And then this produces data marts and different types of applications that pop out of it, so dashboards and other things on the right-hand side of the diagram there. Again, it's a very, very typical approach, but we're humans. We're not going to get it right the first time. We tend not to ever get it right the first time. So we need to have an iterative response, and the learning and feedback tend to go back into the duplicated ETL data that's there. And that's really what I want you to think about: if your feedback loop works that way, you have no ability to influence what's happening in your transaction-based systems back at the beginning, in the black box. So from a strategy perspective, think of these things as a more integrated whole. Eliminate that orange line that's on that diagram, and take your learning and feedback and put it back into your data management practices, so that you can cut out the process of fixing your warehouse data and fix your production data instead, and really come back with a better approach to this. So again, we're at the top of the hour here.
We've done warehousing in the context of data management. We've understood these various implementations. Warehousing is done to solve a business problem, and the more focused you are on that business problem, the more likely you will achieve success with it. We've understood at least three different warehouse implementations, warehouse integration technologies, and foci. And then finally, we started with the idea that there are some specific metamodels that you can use to get started: you do not need to start from a blank sheet of paper. And we're at the top of the hour, and while Steven isn't here, Britt and I are here, and we're happy to take your questions now. So I'll turn it back over to Britt now. Thank you, Peter. That was great. Now it's time for the Q&A session. To ask your questions, if you just click on the Q&A window feature at the top of your screen, you should be able to submit your questions through that window. I've been looking at the Twitter feed, and I don't see anything that has come in with the hashtag #DataEd. So if you'd like to go ahead and submit your questions, now would be the time. Here's one, Britt. We've got a question here that is asking about the warehousing method of Data Vault. They're saying they've never heard of this particular Data Vault, and they wonder how pervasive it is. It is certainly the newest of the three types that are out there. Our colleague Dan Linstedt has done a great job of promoting this, and the idea was that neither of the other two really solved all of the problems in the organization. So, consequently, what we've got is the ability now to solve those. Is this going to be the last one that gets put out there? No, I don't think so. I really think that this area will continue to evolve, because it's really hard to predict the future. And if we don't have that idea of what we're going to need as our businesses go down the road, we can't possibly anticipate those questions.
So hopefully that answers the question about Data Vault. It's relatively new, but there is some certification; if you Google Dan's name, you'll certainly see it out there. And there are lots of other ways of accessing this information on that. There's one that just came in, actually. Two now. I don't know how you can read that stuff. I've got really good vision, even though my eyes have suffered the consequences. Jeff asks, would the Kimball approach, utilizing a data lake, provide similar scalability as the typical Data Vault model with hubs, links, and satellites? Great question, Jeff. Thank you for that. The idea is, how can these be combined with more modern technologies? And again, what we're seeing here in the data lake is that people are putting a lot of big data technologies in place. The idea is that we could take much of this data and put it into something like a Hadoop cluster, and I should say that there are many alternatives to Hadoop that are out there. But at the same time, it's the idea that we're using nontraditional technologies to do this. And if we do understand how that works, it becomes relatively easy to come back out and, in fact, create these little data marts. So you're really looking at replacing the staging area with a data lake. Now, what you have to get good at organizationally is, first of all, understanding how to run a data lake, which is not that hard, but if you're new to it, you do have a learning curve associated with it; your data would then move from your source systems into a data lake-type capability, and you need to know how to go quickly and efficiently from the data lake into the star schemas that you need to have. So I have seen several organizations do that. I won't say it's a trivial process, but it certainly is a promising approach, and we're seeing a lot of interest in it.
Again, I would imagine that some of the publications would have some additional case studies on that to look at. So thanks, Jeff. Great question. Next question, then. Paul asks, how does Agile figure into data warehouse analysis and design? I recently wrote the foreword to a colleague's book on Agile data warehousing, which is the title. The idea here is to avoid the build-it-and-they-will-come syndrome. So I'm going to flip to a different slide here, not to pick on anybody in particular from a slide perspective, but Agile software development is a very good method for developing software, and we're seeing tremendously successful results. Let's just pretend, with this picture that I'm painting for you here of the traditional data warehouse and the marts, that the various bits and pieces on the various screens that come up on the side are what everybody says they want. And they say, oh my goodness, I know exactly what the requirements are. You've got to build me each one of those one, two, three, four, five, six, seven that I've got on the right-hand side here. Okay? That's a non-trivial exercise. And the chances of you getting it right, with our IT statistics being that two out of three projects are challenged on the scope, schedule, functionality, and budget dimensions, are very low. So the Agile approach says, look, let's not, in fact, take the big bang approach, and let's develop a way of implementing these warehousing technologies using something very similar to what they've learned how to do in software. If I have the requirements correct, I can build a series of small sprints, and these sprints can be tested out with the users in real time, so that we are more likely to end up with better results faster as we go through it. Again, I'll refer you to the book on Agile warehousing methods. It's a wonderful approach.
And again, Paul, a great suggestion, and we probably should put that in as a strategy as well. If you're really good at this, traditional methods will work, but if you're not good at it, you should maybe think about prototyping, which is related to this. And if you're getting into it and you have a very quick deliverable you need to produce, an Agile approach is definitely a way to work on it. Since we're talking strategies, we can call that our Agile warehousing strategy; I'll write it up and we'll put it in the list for you all. All right. Do we have any more questions in there? Yeah? Okay. Sorry, I should put my phone on do not disturb. Next question. Well, let's see. Eric asks... sorry, my eyes are failing me. Alicia asks, can you explain the difference between a data warehouse and an operational data store? Great question. Sorry we didn't do a good job of that before, so let me go back and get an illustration here for it. Many people consider a warehouse to be an over-engineered operational data store. We don't really have absolute standard definitions for these things, but if I were building the type of system I'm showing you here for an organization that did not have the multiple dimensions we were looking at, again, sales and marketing and manufacturing and others, and it was not trying to be nearly as comprehensive, the resulting data collection would probably be referred to as more of an operational data store. One of the banks we work with, for example, takes the loan data they originate and moves that loan data into an operational data store that is separate from their warehouse, which contains much more of their banking and financial information, but the operational data store serves just that one focused function. Many people think of data marts and operational data stores as being somewhat analogous.
Again, a mart is something that's linked directly to the data warehouse, so an operational data store gives you kind of that warehouse-and-mart combination for a narrower set of requirements, with an easier-to-build path to get there. I hope that helps, Alicia. If not, give us a shout back and we'll try to elaborate further. Okay, and then let's see. Tim asks, with cloud-based solutions coming into play, one, what are your thoughts on moving data to the cloud, and two, how would your vault design change to support this new shift? Good question. Let's take it in the two parts. What's the first one again? It was, what are your thoughts on moving data to the cloud? Okay. First of all, I like to say that data in the cloud should have three attributes that data outside the cloud does not have. Data in the cloud should, by definition, be cleaner; it should be of higher quality. After all, if the opposite were true, why would you do it? So we must be able to demonstrate that data in the cloud is of higher quality. Similarly, data in the cloud should also, by definition, be more shareable than data outside the cloud. Again, why are you putting it in the cloud? It's generally for accessibility. You've all seen those Microsoft commercials where they're supporting the Special Olympics, which is a great cause, and it's wonderful that Microsoft is helping them out. You can be sure the Special Olympics wouldn't want their data in the Microsoft cloud if it wasn't accessible to them everywhere in the world. And shareable, by the way, is an engineering term: it means you re-engineered the data to be, by definition, useful to more business processes than it would have been otherwise. And finally, following from those first two attributes, better quality and more shareable, your data in the cloud should be of considerably less volume than your data outside the cloud.
Now, let me give you some numbers on that. We talk about data being largely redundant, obsolete, or trivial in the transaction world. Lewis Broome, whom some of you have heard on these webinars before, is our CEO at Data Blueprint, and he and I argue about the number: I say that 80% of the data out there is redundant, obsolete, or trivial, and Lewis says it's 95%. So we're not too far apart, but it is kind of problematic. Now, back to the question. If your warehouse, instead of being outside the cloud, is now inside the cloud, I still hold that my contention is true, and it should also be true for the warehouse. The warehouse data should be cleaner than the data outside the warehouse. The data warehouse data should be, by definition, more shareable than the data outside the warehouse. And finally, the reason we're consolidating data into the warehouse is to have less volume. So now you get to the answer to your question, finally. Data warehousing in the cloud is a perfectly reasonable alternative, assuming your organization can figure out how to deal with the issues of data being in the cloud. And if some of the secure agencies we work with in the government have figured out how to make an Amazon cloud secure for them, I'm sure your organization can come up with a similar type of cloud operation as well. So that was the first part of the question, data in the cloud versus in the warehouse. You can have your warehouse in the cloud, that's fine, and it may make things easier, but the fundamental engineering parts don't need to change at all; they remain unchanged. Then there was the second part of the question: how would your vault design change to support this new shift? I don't think it actually changes much.
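The three attributes above, cleaner, more shareable, less volume, can be made concrete with a small sketch of ROT (redundant, obsolete, trivial) filtering applied before data is loaded to the cloud. This is a hedged illustration, not a prescribed method: the field names, the staleness cutoff, and the three tests are assumptions for the example.

```python
# Illustrative ROT filter: drop redundant (duplicate), obsolete
# (stale), and trivial (empty-payload) records before a cloud load.
# Field names and the cutoff date are assumptions for this sketch.
from datetime import date

def filter_rot(records, cutoff):
    """Keep only records that are unique, fresh, and non-trivial."""
    seen, keep = set(), []
    for rec in records:
        ident = (rec["id"], rec["payload"])
        if ident in seen:              # redundant: exact repeat
            continue
        if rec["updated"] < cutoff:    # obsolete: too stale to keep
            continue
        if not rec["payload"]:         # trivial: nothing of value
            continue
        seen.add(ident)
        keep.append(rec)
    return keep

rows = [
    {"id": 1, "payload": "a", "updated": date(2015, 6, 1)},
    {"id": 1, "payload": "a", "updated": date(2015, 6, 1)},  # duplicate
    {"id": 2, "payload": "",  "updated": date(2015, 6, 1)},  # trivial
    {"id": 3, "payload": "b", "updated": date(2010, 1, 1)},  # obsolete
]
kept = filter_rot(rows, cutoff=date(2014, 1, 1))
```

If the 80-to-95% ROT estimates quoted above are even roughly right, a pass like this is where most of the "less volume in the cloud" attribute comes from.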
You may have some delivery considerations, in which case, rather than the warehouse design changing per se, it's the marts that would change. In the Inmon space we're looking at here, and in the Kimball space, I think you would actually put your data marts in the cloud, and that would not really change the design. You may want to put your staging areas in the cloud, because oftentimes that's a cheaper way of implementing it. I don't think I have seen the data vault in the cloud yet; I'm sure Dan, if he's listening, can chime in and tell us something, but there's no reason why you couldn't put it out there. So while there may be some delivery changes, my general concept is that the design is supposed to support a business purpose, and whether the data is in the cloud or on a disk in your data center doesn't actually make a whole lot of difference. I would very definitely argue that the design of the warehouse should be set up to focus on the business needs, not on the technology. Sorry, I'm trying to read this next one seriously; it wouldn't help if I could spell. He's actually using four-point type. When I was much younger, my eyes were much better. There is another question there, actually a couple. Are there starting points, data models, star schemas, or patterns for warehouse design, i.e., employee, star, et cetera? Exactly, and again, that's what these meta-models we're describing here are: a series of them, in these books, that can give you the basics for starting in a given context. And finally, take a look at what the OMG has done with the Common Warehouse Metamodel. So absolutely, and that's precisely why we want to highlight those, to make sure that you don't start off from absolute ground zero. Starting from scratch is a crazy idea, but we see it happen; we'll go into companies and look, and that's exactly what they're doing. Okay. Very cool.
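Since the Data Vault's hubs, links, and satellites come up in this discussion, here is a minimal sketch of those three building blocks as plain data structures. This is only an illustration of the pattern's shape; the field names and example keys are assumptions, and the full method (hash keys, load patterns, and so on) is covered in Dan Linstedt's Data Vault material, not here.

```python
# Illustrative shapes of the three Data Vault constructs:
# hubs (business keys), links (relationships between hubs),
# and satellites (descriptive, time-variant attributes).
# Field names and example values are assumptions for this sketch.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass(frozen=True)
class Hub:                 # one row per unique business key
    business_key: str
    load_ts: datetime
    record_source: str

@dataclass(frozen=True)
class Link:                # relates two or more hubs
    hub_keys: tuple
    load_ts: datetime
    record_source: str

@dataclass
class Satellite:           # descriptive context, versioned over time
    parent_key: str
    load_ts: datetime
    attributes: dict = field(default_factory=dict)

ts = datetime(2016, 1, 1)
customer = Hub("CUST-42", ts, "crm")
order = Hub("ORD-7", ts, "erp")
placed = Link(("CUST-42", "ORD-7"), ts, "erp")
detail = Satellite("ORD-7", ts, {"status": "shipped", "total": 19.99})
```

The point of the separation is that history and relationships load independently of the business keys, which is what gives the vault its scalability, and nothing in that separation depends on whether the tables live in a data center or in the cloud.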
We were thinking of putting data cubes in the cloud and not the entire EDW. Okay. So the architecture diagram here is the frame of reference for that question. The idea, then, where you're seeing the split here between the warehouse and the data marts, is that they would be making a decision to put the marts in the cloud because the marts are closer to delivery. They would be taking advantage of the cloud facilities to deliver the data marts while keeping the warehouse in-house. Again, that may be good, but really the key, and I think that's what you're focusing on, is what business drivers you're trying to satisfy, and whether they're satisfied that way. It seems like a perfectly reasonable approach. It also may be that you try the marts for a little bit, and if they work well, you find it's actually cheaper to move your warehouse into the cloud as well. Great. On the other hand, your cloud facilities are going to have to have some of those really wonderful warehouse capabilities, and the cloud is getting better, but it certainly is newer, and consequently unlikely to have everything you need out there yet. So it seems like a very good balance between the two. I'm going to take another pass here and see if there's anything more in my queue. I think that's it; nothing further is coming in. Yep, I think we're good. So unless there are any more questions, and I'll give you a couple more seconds to type some in, I think that's it. We've got a couple more events coming up, and Shannon will tell you about those. All right. I'll finish up with my little spiel here and then kick it back over to Shannon. Thanks, everyone, for participating in today's event. We hope you have enjoyed it.
Thanks again to DATAVERSITY and Shannon for hosting us. Once again, you will receive today's materials within the next two business days. Our webinar next month will be Data-Centric Strategy and Roadmap on January 12th. Hopefully you will be able to join us for that as well. As always, feel free to contact us if you have any questions, and thanks, everyone; have an awesome day. Thank you, Peter. Thank you, Britt, for this great presentation and Q&A session. Just a reminder to everyone, I will be sending out a follow-up email within two business days, as has been mentioned, with links to the slides, links to the recording, and additional information requested throughout the webinar. So we'll be sure to get all that information to you there. And you can meet Peter in person at Enterprise Data World 2016 in San Diego, April 17th through the 22nd. Very excited about this year; we have a great agenda coming up. So thanks, everyone. Thanks for being so involved in everything we do and for the great questions and Q&A. Hope everyone has a great day. Thank you. Bye-bye. Bye.