Hello, I'm Shannon Kempe and I'm the Executive Editor for DATAVERSITY. We would like to thank you for joining today's DATAVERSITY webinar, Demystifying Big Data. This is the January edition of our monthly series called Data Ed Online with Dr. Peter Aiken, brought to you in partnership with Data Blueprint. Now I'm going to turn the floor over to Megan Jacobs, the webinar organizer from Data Blueprint, to introduce the speaker and today's webinar. Megan? Thank you. Hi, and welcome everyone. My name is Megan Jacobs and I'm the webinar coordinator here at Data Blueprint. We're glad you found the time to join us for today's webinar on Demystifying Big Data. Thanks go out to Shannon and DATAVERSITY for hosting us. We'll get started in just a few moments, after I let you know about some housekeeping items and introduce your presenter. We have one hour for the presentation followed by 30 minutes of Q&A. We'll try to answer as many questions as time allows, but feel free to submit questions as they come up throughout the session. A commonly asked question: yes, you will receive an email with links to download today's materials and any other information you request during the session within the next two business days. You can find us on Twitter, Facebook, and LinkedIn. We've set up the hashtag #DataEd, so feel free to use it in your tweets and send your questions and comments that way. We'll keep an eye on the Twitter feed and include answers to those questions in our post-session email. Now, on to your presenter. Peter is an internationally recognized data management thought leader. Many of you already know him or have seen him at conferences worldwide. He has many years of experience and has received many awards for his outstanding contributions to the profession. Peter is also the founding director of Data Blueprint. He has written many articles and eight books, the most recent of which is Monetizing Data Management.
Peter is one of the most sought-after data management practitioners in the country and is consistently named a top data management expert. Some of the most experienced organizations in the world have sought out his and Data Blueprint's expertise. Peter has done engagements with groups as diverse as the U.S. Department of Defense, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia, and Walmart. He speaks at conferences and is constantly traveling. Peter, what do you have going on this week? So this week we're just getting the first week of the semester started, and I've got three classes of wonderful students who are really eager to learn about some of these topics. So let's just jump right in. What we're going to talk about today, and we forgot to mention this as we were putting it all together, is version 2.0 of this presentation. So we've already evolved past the first one. And what we found is that there are a lot of people asking, how do we go about applying this? Not just understanding it, but really trying to put it to work for the organization. So that's what we're going to talk about today: what's the deal with big data. We'll give you some historical perspective around it, because we are looking at a continuum; nothing is brand new under the sun. We'll look at specific big data challenges in today's environment for most organizations. And we'll talk about design principles that really revolve around the theme of crawl, walk, and then run. Because this is new, we really do need to iterate and refine our approaches, as opposed to expecting to get it right the first time. We'll finish up with some specific design principles that are grounded in the basics of our practice, which are foundational and technical in nature. And we'll finish up at the top of the hour with the Q&A. So let's dive in and talk about where we are, and where we are is at an inflection point, in the sense that there's a wonderful McKinsey report that you can find on the Internet.
We've got a link, excuse me, a reference to it at the bottom of the page; we didn't actually make it a link down there. The point is that we've tipped over to where more data is being produced and stored by devices than by people. And that represents automation at work, which means the growth is just going crazy. In all this, we still want the same types of design principles to apply, which is to say: what are the problems that you're trying to solve for your business? Because simply putting a Hadoop cluster into your business, if it doesn't fit your problem, is going to create more work instead of less work for your organization. You'll have fewer insights instead of more. You'll have less productivity instead of increased productivity. So big data for big data's sake is just not going to solve the problem. You see, there are a lot of companies saying, well, we'll just buy some stuff, put it in, and get started with it. And that's really not the best way to do it. It helps a whole lot more if you keep in mind the end goal of what business problem we are trying to solve. You may find out that big data doesn't, in fact, solve it, but at least you'll understand what it can do and what it can't do. Finally, there's a risk aspect to this, too, which is to say that there's a lot of money being spent. We're going to show you something called the hype cycle in just a little bit, and what we're seeing in particular is that spending a lot of money chasing sort of ethereal big data goals doesn't pay off at this stage of the hype cycle. Now, if you're not familiar with the hype cycle, bear with us for just a few minutes and we'll get to it. We've put this presentation together around eight myths. And the first one is a very common myth: that everybody should be investing in big data. It turns out that not every company is going to benefit from big data.
In fact, there are a lot of medium and small size companies, and even a couple of large companies that we've worked with, where big data is not currently helpful. Because it does depend, first of all, on your size. And when I say size, we're talking about the data footprint that you have, as well as your internal company abilities. Throwing hardware and software at a problem does not tend to produce a solution unless it's really a focused effort. So the analogy, which is kind of easy for everybody to get, is that a local pizza shop is going to get much less value from a big data initiative than a statewide, regional, or national chain that can actually apply this to some of their problems. So companies have found that the advantages of big data fall into a couple of categories. And of course, one that's been on everybody's mind for a while is healthcare. We're looking at $300 billion a year in potential savings in this area; it's just a phenomenal area, and it means there are going to be some things happening both in the processing of healthcare data and in healthcare research. Both of those areas are very, very powerful to take a look at. It turns out the EU, and the public sector in general, have provided us some very nice models. So those of you that are in state and local governments, as well as the federal government, take note, because we're seeing some very good models coming out of Europe, and some of this stuff is really making a difference out there. There's also personal location data, where the devices know where we are; we're seeing, again, enormous expenditures around that. And of course, retail is simply a no-brainer. Retail can benefit tremendously from this, as can manufacturing, particularly when you look at it from a logistics perspective of how we are able to engineer or re-engineer things when we understand the data. Logistics problems in particular benefit tremendously from this.
So big data creates value around these things because the information can become more transparent and more usable more quickly. It's another way to expose variability and boost the performance of the business processes that are consuming this new category of data, although, as you'll hear in a few minutes, we'd really prefer to talk not so much about big data but about big data technologies. It gives us the ability to narrowly segment customers and more precisely tailor products and services so they're focused on specific customer needs. It gives us the ability to increase the sophistication of our analytics and improve our decision-making processes, which means we can develop the next generation of products and services to incorporate the things that we've learned from this improved analysis. Now, those of you that are longtime participants in this webinar series know we like to do polling questions. So we'd love for you to respond to this one. We'll take about 30 seconds here, and Megan, if you'll tell us when we're done, we'd like you to tell us what your opinions are. Which of these is a big data myth? A, big data is not new; B, big data is an IT-focused project; C, big data is not always better; or D, big data does not have a real clear definition. We'll give you a little bit of time on that. This is interesting, Shannon; I get to vote this time. All right, I'm going to come over here, and then we'll close it out and see what everyone thinks. Megan: Here are the results. Actually, while we're waiting for the results, we have a few people asking for your definition of big data so they can better understand the presentation and the foundation of where you're coming from. Terrific. We will actually get to that in two slides, so thanks for asking that particular question. So it looks like most of you chose B, that big data is an IT-focused project, as a myth.
We certainly consider that to be a reasonable one. Actually, where we'd like to go first is D, the lack of a clear definition. And that's exactly the question your participants were asking, Shannon. The term is used so often, and in so many contexts, that the meaning of big data has become vague and ambiguous, and the good industry efforts often disagree on it. So let's take a look at what they're giving us in the way of definitions. Gartner calls big data high-velocity, high-volume, high-variety information assets that require new forms of processing to enable decision-making. IBM, on the other hand, has a definition: datasets whose size is beyond the ability of typical tools to capture, store, manage, and analyze. The New York Times calls it shorthand for advancing trends in technology that open the door to a new approach to understanding. And McKinsey comes along and adds: large pools of data that can be brought together and analyzed to make better decisions. This is our colleague Doug Laney, who came up with the characterization of these in terms of three Vs: volume, the amount of data; velocity, the speed of the data; and variety, in terms of the range of types and sources the data is coming from. Many people are adding a fourth V, variability. Now, if we wanted to get real technical, I've got another four Vs we could attribute and add to this, but let's stick with just these four for this particular presentation. If you're interested in the others, let us know and we'll certainly include them in the Q&A as we go out of here. So, our question is, wouldn't it be more useful to refer not to big data, which we can't objectively determine something to be or not, but instead to big data techniques or big data technologies?
We like to do that, and I would encourage those of you listening who are having conversations around this to alter your conversation, particularly if you're discussing it with a vendor or somebody who is perhaps coming at this from another perspective, and say, can we agree for the purposes of this conversation to refer to this as big data techniques or big data technologies? You'll be amazed at how the conversation changes when you use that phrase instead of just big data. Again, if you have questions about that, we'll come back to it in the Q&A. But let's see one of the reasons why. Now, those of you that have done these webinars with us before know that we're fond of the Gartner hype cycles. If you're not familiar with them, what Gartner says is that you start out with a really cool technology trigger, and then you go up to this thing called the peak of inflated expectations, which is the top of the roller coaster ride, and you're getting ready to plunge down into the trough of disillusionment. And after you've done that, again, we went too high in the hype and we go too low in the trough of disillusionment, we finally come to the slope of enlightenment and get better as we change from adoption to exploitation, until we arrive at the plateau of productivity. Now, the reason we're showing this particular chart is because last year, in the 1.0 version of this presentation, big data was at approximately the same place on the hype cycle. Big data is approaching what Gartner calls peak hype. However, they have an interesting thing for us. Last year, on their chart, big data was two to five years away from mainstream adoption. Gartner has now said it's at the same place in the hype cycle, at the peak of inflated expectations, but is now five to ten years away from mainstream adoption.
So the idea is that we're going to see more and more hype for a longer time than they originally thought, and that time is approximately double the original estimate. So we're not yet looking at going into that trough of disillusionment. People are really excited about this. There are some very interesting things happening, but at the same time, we have to be careful, because we have not gone through the reset that occurs when somebody says, hey, it didn't do everything I wanted it to do, as it will not, and that will become a problem. So just as of 2013, in one year, Gartner has determined that the window has moved from two to five years out to five to ten years out. And when you look at things like predictive analytics and prescriptive analytics, you can see that they are almost diametrically opposed: prescriptive analytics is coming up the innovation-trigger side of the slope, while predictive analytics is a relatively mature technology. So let's look at some limitations that big data techniques have. This comes from a column that David Brooks wrote in The New York Times last fall. Basically, big data struggles with social cognition. Big data has no ability to tell you whether you are attached to someone, whether your heart goes fluttering when you have a particular association with somebody, for example, somebody you haven't seen since high school, or a relative that you haven't been able to be in contact with, because data concentrates more on counting than it does on feelings. Big data struggles with context. Data taken out of context is particularly problematic, and big data doesn't have the ability to tell stories by itself, so it relies on us to tell the stories about it.
Big data creates bigger haystacks, where the haystacks get very, very large, and the problem with big haystacks is that we get more false positives, because we find more things that look like needles that aren't necessarily needles. Big data has trouble with big problems. If you're on a polarized issue, with, if you will, a side A and a side B, big data doesn't tend to make you switch from side B to side A. That is something that requires a different flavor of analysis. Big data favors memes over masterpieces, and you all know that, because when you look at somebody watching something on YouTube or My Damn Channel or one of the other time-waster sites, it's cat videos as opposed to something that's truly magnificent. And finally, data can obscure, because it's presented as if it were without a context, when there is still a context. Now, these products cost organizations money. We've managed to come up with a set of figures showing the average enterprise, and even small and medium-sized businesses, spending hundreds of thousands or millions of dollars a year on this. In fact, when you look at the projections for how big data spending is coming on, we're seeing wild spending: an 83% increase over the next year, jumping from $28 billion toward $200-plus billion. So we want you all to be careful and not fall victim to something we call the shiny object syndrome. Again, the only silver bullet is knowing that there are no silver bullets. That is a direct quote from Clive Finkelstein, and it's one of the things he taught me early on in the 90s. There's no one technology that's going to solve all your problems. There's a lot of money being invested in this, but is it generating the expected return, either in industry-wide figures or in your own organization? The hype cycle suggests that the initial results are going to continue to be disappointing in this area. On to myth number three: big data is just another IT project. In fact, big data is not a typical IT project. It doesn't answer the typical IT questions.
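As an aside, the bigger-haystacks point above lends itself to a quick numeric sketch. The rates below are invented purely for illustration: with any fixed false-positive rate, scanning a bigger haystack multiplies the absolute number of false alarms around the few real needles.

```python
# Illustration (invented numbers): a detector that is wrong 1% of the
# time, looking for needles that occur once per million straws.

def expected_hits(haystack_size, needle_rate=1e-6, false_positive_rate=0.01):
    """Return (expected true needles, expected false alarms)."""
    needles = haystack_size * needle_rate
    false_alarms = haystack_size * (1 - needle_rate) * false_positive_rate
    return needles, false_alarms

for size in (10**6, 10**9):
    needles, noise = expected_hits(size)
    print(f"haystack {size:>13,}: ~{needles:,.0f} needles, ~{noise:,.0f} false alarms")
```

At a million straws the detector drowns a single real needle in roughly ten thousand false alarms; at a billion, a thousand needles sit among roughly ten million false alarms, which is exactly the "things that look like needles but aren't" problem.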
It doesn't work within a typical IT context, because IT projects tend to work on very predictable outcomes: buy this system or develop this software, and it will produce these results. Big data is not necessarily about production. It's more about trend analysis. It's more about your methodology, and trying to become more agile and more actionable than that. It is a fundamentally different approach. So these big data projects should be exploratory. They should be put into your organization with the idea of finding new capabilities and exploring what those capabilities can do for your organization. It can be a disruptive technology. It sounds kind of simple, but that doesn't mean it's actually easy. And again, beware of the shiny object syndrome. So let's take two specific examples: healthcare here, and we'll do retail in just a minute. When we're looking at clinical data, the type of data that comes in from a diagnosis, in the past we have relied heavily on the physician for this. Now we can augment the physician with additional data. There's a very interesting commercial out there by IBM called the IBM Data Baby; you can Google it and it'll pop right up on YouTube for you, and it'll show you what people are thinking of in terms of continuous monitoring. Data Blueprint is working on a project right now for a local group that has to do with wearable patches, so that they can determine whether or not a patient is in fact taking their medicine. We're looking at patient demographic data. We're looking at insurance data. And more importantly, we're allowing these big data technologies to integrate information across previously siloed areas: prescriptions, pharmacy information, physical activity data, how much exercise somebody is getting, health history, medical research data. These are all really good examples of a change in focus from what we call population-based healthcare treatment to patient-based healthcare treatment.
It means we can now, for $5,000, sequence your genome. And even that number of $5,000 may or may not be the right number; we'll get to that as well and see what goes on there. Again, the idea here is that we've had some very good population-based data, and now we can get patient-specific. Switching to retail, we can start to look at changing customers from not just being a customer of a store to being a repeat customer, and in fact an advocate for the organization. Amazon.com early on started trying to cultivate people who were Amazon loyalists and who would go out and tell people that Amazon was the place you should be shopping, which is great advertising for Amazon. It, again, plays into customer loyalty programs, retention strategies, and looking all over the place for how the history of these things influences the types of product placements that you do in a retail context. So let's summarize our takeaways from this first little section. The technology continues to evolve at increasing speeds. Big data techniques are here, and they are being used to create insights. Approaching them wisely allows you to look at your existing information architecture and figure out how big data can complement what you already have. It's not going to solve all your problems. It's not for everybody. It lacks a clear definition, and there's a lot of hype behind it. But at the same time, there is some really useful stuff there. So your job, as IT people working together with the business people, is to figure out where in your organization these techniques can be used to help augment the things that you already have going. Be aware, of course, that we are headed for the peak of inflated expectations, and know that there's a trough of disillusionment that follows inevitably. Let's look at some historical perspective around this, because many people think big data is new. So that's our fourth myth: big data is new.
And the fact is, the term might have originated in Silicon Valley in the 1990s, but the concept has been around previously. We can talk about data sets that are hundreds of years old being used in these types of contexts. Any data set is big when you lack the appropriate techniques to handle it, and that's what we've been developing these techniques for. So this collection, something called the Bills of Mortality, was put together by John Graunt in the 1660s, when half of the population of Europe was at risk of dying of the plague, and one in three actually did. Coding the data allowed them to go in and ask: where is it occurring? This is a map of the city of London, and you can see the darker areas are areas people should stay out of, whereas the lighter areas were safer. When was it occurring? They were actually able to determine, in near real time, that they had reached a peak in the number of deaths, which turned out to be the peak of the plague. And of course then, what was the mechanism behind it? How do we in fact figure out what was behind all of this? By examining different clues, they were able to apply a quasi-scientific approach and determine that they should watch the rats. So when we look at this, we can go back and see volume: there were all kinds of different collection points going on. Again, remember, one in three people died as a result of this, in a very short amount of time, so they had all kinds of data collection points set up. The velocity: the data was collected and published weekly. The book that I showed a few slides back was the fifth edition. It's not exactly a page-turner; it's kind of like reading the obits in speed-reading mode. The variety: who was collecting the data from the points, and what were they getting it from? Different people, politicians, and others who got it from burials. Lots of variety in what was going on there. And variability.
Social media didn't exist, but gossip certainly did, and it was one of the things they had to use, and combat, at the same time in order to figure out what was going on with the plague. And when, 200 years later, they had the cholera outbreak in London, they were able to apply these techniques very successfully, although it still took a fair amount of persuasion. So this is not new. It gives us the same basic foundational data management challenges that we've had all along. And you can see by the examples we've provided here, the first true health data set from John Graunt and the pattern analysis of that data gave us the foundation for probability, statistics, insurance, et cetera, et cetera. Our next topic, then, is what are we looking at today? And our fifth myth is that big data is innovative. The fact is the big data techniques are innovative, but big data itself is not. I had a fellow who works down the street here in Richmond tell us just today, I've been working with two trillion rows for the past 10 years; are you telling me I'm not working with big data? And I said, you're right. You have a lot of data, but you're not using big data techniques, and it may be that those would be helpful for you. That was actually the basis for the discussion. Because the ROI and the insights depend on the size of the business, the amount of the data being produced, et cetera, et cetera. Again, the local pizza shop versus a national chain like Papa John's, retail, et cetera, et cetera. And let's look at some of the volume figures here. Here are a couple of volume figures from some organizations: 47 quintillion bytes, 34 billion records, 2 billion queries a day, 65 million tables, 117 billion records, 29 billion queries. These are enormous volumes. And when we look back on this and consider the volume of the data, the velocity of the data, the variety of the data, and the variability of it, it starts to make sense to ask: can we apply these big data techniques?
For example, for the summer Olympics, which concluded in London, you can see some of the numbers here: gigabytes of data per second, hours of media coverage per day, almost a year's worth of viewing per day, billions of people watching, and devices that are connected. Huge, huge volumes. And by the way, the summer Olympics handled this stuff pretty darn well. The second one is the velocity component. Here is a visualization from YouTube; we gave the company credit there, you can see it. They were looking at Johnson & Johnson trading. Now, for a trade to exist in the European Union, it has to exist for at least half a second, so this is half a second's worth of data. This half second of Johnson & Johnson stock trading less than a year ago, on May 2nd, 2013, represented 1,200 orders and 215 actual trades. That gives you an idea of the enormous amount of complexity going through this. Variety: I mentioned already the wearable devices. As we start to have these devices to monitor what we're doing, they put out lots and lots of data. In healthcare, it's an easy thing to see whether you're taking your medicine. But what can we do in terms of predicting? Eventually your smart device may say to you, it's not a good idea to get in your car and drive 300 miles when you haven't had anything to eat for the last eight hours. I'm using some extreme examples here that hopefully everybody will appreciate. Finally, variability: this is the history flow for the Wikipedia entry on the word Islam. As you can imagine, a lot of different people have edited the entry on Islam for a lot of different reasons, and this is just one representation of how that analysis looks. The big data techniques are innovative, but big data in itself is not. The challenges we're facing are collectively foundational and technical in nature, and no different from the same challenges we were facing in the 1600s.
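The volume, velocity, variety, and variability examples above can be condensed into a rough screening sketch. To be clear, this is our illustration rather than a tool from the presentation: the profile keys and the two-V threshold are invented assumptions, not the presenters' rule.

```python
# Rough sketch (invented rule of thumb): treat a business opportunity as
# a candidate for big data techniques when two or more of the Vs exceed
# what the current environment handles comfortably.

BIG_DATA_VS = ("volume", "velocity", "variety", "variability")

def candidate_for_big_data(profile, threshold=2):
    """profile maps a V name to True when that V strains current tooling."""
    exceeded = sum(1 for v in BIG_DATA_VS if profile.get(v))
    return exceeded >= threshold

print(candidate_for_big_data({"volume": True, "velocity": True}))  # True
print(candidate_for_big_data({"variety": True}))                   # False
```

The point of the sketch is the shape of the question, not the threshold: score the opportunity against the Vs before reaching for new technology.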
The technology is continuing to advance rapidly, and we're getting a lot of good things happening out there. But the challenges with big data are not really that new. We have a lot of known foundational issues. We've got a continuous need to align the data and the business with a rapidly changing environment, which means we need flexible and adaptable models in place. We also have, of course, the standard concerns of duplication, accessibility, availability, et cetera, et cetera. And, of course, we have the question: does the business know what problem they are trying to solve? So, as we go forward, let's talk specifically about how we should approach these big data problems. And, again, our advice here is to crawl, walk, and then run. Myth number six is that big data provides all the answers. In fact, it doesn't, and it doesn't mean the end of scientific theory. There have been several articles, including one in Wired magazine, that suggested this. We really don't see that, because when you're coming up with more correlations, you need to have a better understanding of statistics. Just because you have a correlation does not mean you have identified causation. Don't go fishing for correlations and hope that they will be able to explain the world to you. To get an idea of why things are, the motivation, you need hypotheses, theories, and, of course, stories as well. There is no substitute for good, careful analysis that can recognize an anomaly and explore the deeper truth. You need to develop the right approach for your organization. So we'd like your response to this question now: have you seen big data provide valuable insights in your organization? A, yes; B, no; C, my organization has yet to implement big data; or D, not applicable. Again, we'll give you 30 seconds. Shannon, did you get any more comments in terms of questions as we were going through?
We'll have some more questions, of course, for the Q&A. Lots of good questions coming in for you for then, but we'll hold them until a little bit later. Of course, one of the most popular questions, as always, is whether people are going to get the recording, and yes, that is coming along. Some good comments coming in, Peter. Thank you. Okay, we never know in advance what you guys are going to say. A lot of not applicable, but most of you chose my organization has yet to implement big data. We've got a good 14% of you that say, yes, we have gotten something out of it, and only a small percentage that say no, not just yet. So that's good news; we could even prove the Gartner hype cycle wrong here. That would be lovely. As we move into this next section, what we're really talking about is the sort of approach you should take to look at how these things work, and how big data can be applied in your organization, and it starts out always with a business opportunity. You have a business opportunity that you're sort of exploring, and you should really treat it as an exploratory piece. What we're talking about here is that somebody has said, I think if we did this, or had this type of ability to discern, then this would give us a better business opportunity. We want to look at that business opportunity and ask how leveraging data in the context of that business opportunity can address the need that's there. Sometimes it's an external marketplace view, where we're looking at opportunities and threats. Sometimes it's looking internally at efficiencies, where we're analyzing strengths and weaknesses. In the meeting I had just before this webinar, we were looking at an organization that had a tenuous connection between orders that were input to the system and how the orders were treated: were they special orders or were they normal orders?
And if that link wasn't there, it was going to be very, very difficult for the organization to provide the excellent service they were able to provide and had promised people. Now, a quick note here: we went to six Vs on this one. There are really only four Vs, but like I said, I have as many as eight if you'd like. What we want you to do is come back against that business opportunity and ask what it means in terms of volume. So here we're looking at, for example, users averaging 15 terabytes a day; I think that's a lot of volume. Velocity: 60 gigabytes of data. Again, a variety of devices, with a total of 8.5 billion connected. And we're looking at sponsor data, athlete data, weather condition data. There's a wonderful data art project, and you can Google it, that shows customer sentiment that was occurring during the Olympics; a very, very interesting project. And virality, in this case largely focused on social media. After that, we ask: is this a candidate for big data? Based on my analysis, do I need a big data solution, or does my current BI solution address my business opportunity? Do the four Vs, or however many we're going to use, indicate big characteristics? What are the limitations of my current environment? What are my budgetary restrictions? And what's my current capability with respect to big data? It does make a lot of sense to participate in activities like this one and others, so that you can find out whether these techniques are likely to yield good business results. Now, once you get the yes determination, we then need to recognize that both technical and foundational expertise are required in order to do this. You can't do it with just one or the other; you need both in this case. So let's take a look at the foundational practices here.
And we want governance that makes sure we have implemented the foundational practices as well as the technical practices correctly. So from a foundational perspective, we're asking: do you have a data strategy? Do you have good data governance? Now, the fact that you don't have a data governance organization in your company may not be a problem. You can still implement governance without necessarily having the formality. In fact, a good way of demonstrating the value of data governance is to show how, by employing self-governance, you have actually helped the organization do something faster, better, or cheaper. Architecture is another foundational practice you need to have in place, because if you can't tell where your big data techniques will fit in your environment, they just become clunky additions that don't fit and complement it. And finally, of course, you need some education around this. Again, these are the foundational practices that we speak about. We also have the technical practices, which is to say we want to look specifically at things like data quality, data integration technologies, the different data platforms, and what types of analytical capabilities you have in the organization. As you put both of these sets of practices in place, we then have the ability to get some feedback and figure out whether or not we can get the direction or the insight that we're trying to get as a result. The insight needs to be actionable and well understood by the business, and we need to document what we've learned about how this big data technology environment is in fact extending our ability to do good data management as we work through the specific problem. Again, we don't have the answer. We have some indicators, and we now need to start the exploratory process. That means iterating.
We're not going to promise results, and the results we do get are initially going to be what we call soft. They're not going to be perfect, and it's not necessary that they be. Instead, we iterate and refine. This iterative process will eventually get us to a decision point, and we can use that feedback to inform the next exploration. This is our framework for implementing big data. I want to talk for just a minute about one of the groups we're working with right now in the transportation business. One of the things they've done recently is capture telematics data from their fleet, and they're using this data to go after the thing most organizations go after, which is mileage. They are trying to improve the fuel mileage of their fleet. Small changes in their operation are resulting in big savings in the fuel area. But interestingly enough, as they harvested that low-hanging fruit, they also asked: could we use big data techniques to start to segment our trucking fleet into a number of different categories? It turns out that the different truck types actually yield more valuable information than looking at all the trucks as a whole. Again, this has been an insight for the company, and they are now going back in for their next phase, iterating on this approach and trying to find out not just what they can do to improve the overall fleet mileage, but mileage within each truck type. Different things are occurring there, and they're getting very, very good results. So the takeaway from this is very much: crawl, by identifying the business opportunity and determining whether a big data technology solution can be helpful for it.
Walking is going further and applying this combination of foundational and technical data management practices, which allows us to document insights and make sure those insights are actionable. And finally, running is recycling and exploring, staying agile so that you can go even further. We may find with this particular analysis that we're able to drive it down not just to the truck type but, in fact, to the driver type. By doing that, we can make the search more productive, which makes everybody happier in the long run. So let's talk about some specific design principles now. As we look at this, most people don't understand that these things are foundational by design. We survey this every year and find that virtually no organizations operate against a data strategy. If you have a documented, board-approved data strategy in your organization, you're in the top 10 percent of organizations with respect to best practices. Now, the reason a data strategy is important, of course, is that if you're just doing data for data's sake, it's an interesting exercise and something might come out of it; but if we incorporate usefulness into the equation, we are more likely to have multiple people working toward a common objective. In the example I gave you earlier, that would be fleet fuel efficiency. The marketplace is continuing to become more data-focused, and having that data strategy is absolutely critical. Your organization has a business strategy, and you probably also have an IT strategy that shows how IT is going to complement the business strategy. It makes sense to take your sole durable, non-degradable strategic asset, your data, and have a strategy for managing it as well. It's an imperative, in fact, that you do this. And as I say, we've actually turned down some engagements, because trying to do any sort of good data work absent a strategy is a big problem. You must have a data strategy before you have a big data strategy.
One of the organizations we worked with is shown in this particular diagram. I'm not going to read each and every line that's out there, but this is a representative version of the document they put together, of course. They had over a billion in revenue, and what they were trying to do was move forward with data as part of their strategy. When we came back and helped them with that strategy, they were able to see how big data techniques complemented their existing data management technologies. They were able to move, in this case, from descriptive to predictive to prescriptive analytics, taking their capabilities from reactive to repeatable and finally up to an optimized set of technologies. The question, of course, is: what can't you answer today? Is your management constantly asking questions to which nobody has the answer? Those are a good place to start, although you do want to find out whether it's a strategic question versus just a curiosity-type question. Similarly, is there a direct reliance on understanding the customer behavior driving revenue? If we felt for sure that people were getting really good with data and didn't need the expertise Data Blueprint has, we'd fold up shop and go home, or move into another line of business. We're not seeing that, and most of you probably aren't either. Do you have information overload? Are you just trying to find the signal amongst a bunch of noise? The example I use here: you're listening to the radio late at night and they offer a prize, maybe tickets to a show of some sort, and you're trying to find the radio station's number, which is buried on your desk, and you're flinging piles of paper around trying to find that one scrap of paper with the number on it. It's like an organization not being able to respond to a business opportunity, and we find that most organizations suffer from what we call ROT.
Now, ROT is data that is Redundant, Obsolete, or Trivial, and if we can eliminate it (in some organizations we've seen it run as high as 80%), it means you have much less to move around on your desk in order to find the telephone number so you can call in for the tickets. Ask yourself which is more important: establishing value from your current data assets and reporting, or exploring big data opportunities? If it's clearly more important to establish value from your existing data, then exploring big data should temporarily be given a lower priority, until you're ready to move forward in a mature fashion. The next myth here: you need big data for insights. The idea is that there is a distinction between big data and doing analytics. Analytics is the process of trying to find insight in your data; big data is the technology stack it may run on. Big data by itself tends not to produce analytics directly; it's a combination of your smart people working with the data. So big data is the technology stack, but it can be used to inform predictive and prescriptive analytics. It's an indirect process, and most organizations make the mistake of trying to make big data the sole input to their predictive or prescriptive analytics engines. We would go back and say no, you want to complement that: use the existing data for reporting, figure out where your bottlenecks are, and optimize your existing model. This gives you the ability to understand how your data is structured, architected, and stored. If you don't know where your data is stored or how it's structured and architected, you have very little ability to make any meaningful use of it. That brings us to our third and final polling question. We want to know which method your organization has used to gain insights from its existing data: A, through modeling and architecture; B, through mining techniques; C, through big data techniques or technologies; or D,
not applicable for you. So again, we'll open it up for just a few minutes here, see what you guys think, and report the results right back to you. It should be interesting to see the answers compared to the last one; some really insightful stuff is going on. There's a website that has a whole series of papers and results from these types of investigations. There's just a wealth of information out there, as well as a number of other presenters heading toward the big conference we collectively go to, called Enterprise Data World, which will be held at the end of April, where a lot more of this information will be presented as part of the program. It's a terrific opportunity to learn about the basics as well as the advanced technologies that are coming on. It is the only vendor-neutral event of this type that is focused on data, and it's being held in Austin, Texas, which is a wonderful place to go for a conference. So, Megan's got some results up there. It looks like 1% through modeling and architecture, concentrating on the basics; 17% through data mining techniques; and 4% using big data technologies. So we're not seeing much penetration of this technology with this particular audience, which hopefully means you're getting something out of this session. We'll find out when we get to the Q&A section in a few minutes. On to foundational practices: data architecture. I think the last polling result showed this very clearly. If you don't have a common vocabulary, anything you try to do around this will take longer, cost more, deliver less, and present greater risk to the organization. Most organizations have data assets that are not supportive of strategy, either because the strategy has changed or because they don't know how to align the two.
So the big question becomes: how can organizations more actively use their information architectures to support strategy implementation? That is exactly what we want you all to do: take the material in this and other seminars, apply it in your own organization, and ask, how can I use this information to support the organizational strategy more effectively? From a series of considerations, what we want to ask is: does your current architecture support big data? If it's BI and analytics, it might. Are you getting enough value out of it? Can you integrate and share information across your entire organization, or are there challenges you're facing around that? Is it hard to extract value because your data is cumbersome to navigate and access? Are you confident that your data is organized to meet the needs of your business? And we would call out specifically, as a foundational piece, the idea of the data scientist. If you hand a data scientist a data mess, they're going to become a very expensive SQL coder, and that's not really a value proposition for most organizations. So let's flip over now to the technical aspects, and we'll look at a couple of them. Data integration, first of all: this is the idea that we need unified access to data and unifying standards that allow us to create different ways of integrating data. The idea here is that integrating data across organizational silos creates insights. Many of you are familiar with the UPS oopsie: when Christmas volume of package orders was going up and companies were extending their two-day shipping and making promises, UPS wasn't aware that their actual volume was 30% greater than what they had projected, and as a result a few kids were disappointed at the holidays, not getting things in time. This is our biggest challenge, because big data can be used to help diagnose and integrate these efforts.
We can use big data to find out about these types of things and give us early warning, rather than reacting after the fact. The point is that if UPS and Amazon had come up with a big data solution, they might have had advance warning and been able to tell people on the website that items were not, in fact, going to be delivered on time. Here's something we did for one of our customers. Many of you are familiar with a technique called the data vault. If we take the data in our traditional environments, the relational database technologies at the top of this diagram, and bring in some big data technology such as NoSQL, we can actually join back to values that are in the relational database environment. In other words, this is an additional layer, if you will, that allows us to go out and look at things, such as invoices and shelf problems, that we haven't been able to look at before. So the advance data comes along and we can say, oh, we don't have enough space on our shelves for this. When we're looking at data integration, the questions are: what is the complexity of your data, what are your requirements for integration, and how does big data fit into that? For example, you might say we need to be a little more permissive of fuzziness; "soft" is one of the words used to describe this, and eventual consistency is another. The integration then becomes domain-based, on something like time, a customer concept, or geographic distribution: things that are important for your business, because these requirements have to come from your strategy. Our last technical piece revolves around quality. Quality, of course, is defined as fitness for purpose: if data is not fit for purpose, it is not of sufficient quality. Big data is a little different. We're really looking for some basics in this area. We're looking at availability; again, I already mentioned soft state and eventual consistency here.
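The data-vault-style layering described a moment ago, where schemaless records join back to keys in the relational environment, can be sketched as a toy example. Every name, record, and value here is invented purely for illustration.

```python
# Toy sketch of a data-vault-style layer: NoSQL-ish documents carry a key
# that joins back to rows in the relational environment. All names invented.

relational_customers = {          # stand-in for a relational table keyed by id
    101: {"name": "Acme Corp", "region": "East"},
    102: {"name": "Globex",    "region": "West"},
}

nosql_events = [                  # stand-in for schemaless documents
    {"customer_id": 101, "kind": "invoice_scan", "note": "shelf shortage flagged"},
    {"customer_id": 102, "kind": "sensor_feed",  "note": "restock predicted"},
    {"customer_id": 999, "kind": "orphan",       "note": "no relational match"},
]

def join_layer(events, customers):
    """Enrich each document with its relational row, where a match exists."""
    return [
        {**e, "customer": customers[e["customer_id"]]}
        for e in events
        if e["customer_id"] in customers   # soft data: tolerate orphans by dropping them
    ]

enriched = join_layer(nosql_events, relational_customers)
print([e["customer"]["name"] for e in enriched])
```

In a real implementation the document side would live in a NoSQL store, and the join would honor the soft, eventually consistent expectations just mentioned rather than assuming every key resolves.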
We're not going to use big data techniques to balance our checking account, but we might be able to use big data techniques to determine whether we're going to have a cash-flow shortage based on activity in the marketplace. Directional accuracy is the actual goal: are we headed in the right direction? What is trending in this case? For your important data assets, you should be going after the root cause of quality problems. Is your data correct when it's first created? Our experience shows that organizations have a lot of trouble getting in front of data quality if they only use a find-and-fix approach. Our advice is to get behind it and make it work right from the start instead of trying to correct it afterward. I can't tell you how many times we've collectively shuddered at the words "let's just forklift it into the data warehouse." Let's not do that. Remember, big data is trying to be predictive. Nobody can accurately predict the future, but we can at least try for directional accuracy. You always have to go back and ask what kind of questions you are answering, because that tells you what level of accuracy you need. Is the trend going up or down? I think most of us agree it's up at this point. Are we looking for confidence levels? Do I need to know exactly what they're going to buy, or just that they're going to move from this department into that department? The last myth for the session: bigger data is better. The fact is, no: we would rather have less data of good quality than more poor-quality big data. That allows us to reduce the variables and increase manageability, because otherwise big data becomes an equation that favors quantity over quality, and we know that is not necessarily the best way to work. We've told you before about the shiny object syndrome: don't buy it because it's new; buy it because it solves a problem you're trying to solve.
We want to make sure the solution in fact fits the form of the problem. Big data may not be your answer; it may in fact be your problem. Investments in both the foundational and technical aspects will result in better outcomes. Finally, let's get to data platforms. Do you want to measure a critical operational performance process? There's no one platform that does everything. In fact, our criticism of most of the environments we've seen is that people have tried to cram too much into one very large thing instead of putting several smaller things together that deliver the same or better performance, and we want to avoid duplicative, ineffective platforms. If we understand these questions, we can start to look specifically at the capabilities of each of the various platforms, and there are a lot of good vendors out there that do a wonderful job with most of this stuff; it works very, very well. So when we're considering these platforms, what we're really trying to ask is: what is it? Is it columnar storage? Is it a different query engine? The stacks look the same until you get into the appliances that have the algorithms built in (Teradata, something like that). And ask the questions: what are the insights we're trying to turn into information? Is it historical data that's going to be useful? And where do we go to find the single version of the truth? We're approaching the top of the hour now, and we're ready to turn it over to you all for some questions and answers. But just in summary: big data techniques are innovative, but big data in and of itself is not. Big data is characterized by a combination of four or six or eight V's; however many you work with, you get the general flavor.
The approach we want everyone to consider is a crawl, walk, and then run scenario; the big data challenges are really focused around foundational and technical data management skills; and of course, beware of the shiny object syndrome. So with that, we'll turn it back over to Megan. As usual, we've given you some references, so take a look at these. Most importantly, the big data McKinsey report is a really excellent one, as is the Gartner hype cycle. And we'll move on to our Q&A section. That was a great presentation. Now it's time for the Q&A, time for you all to ask your questions. Just click on the Q&A window feature at the top of your screen and you should be able to submit your questions through that window. We can jump right to it, because we've had a lot of questions coming in from you, great questions. So let's go ahead and start. The first one is: we hear about SQL, NoSQL, and Hadoop. Is there a single integrated repository for both structured and unstructured data? If you listen to the vendors, the answer is yes. But just because a tool will allow you to put together a repository that handles your structured and unstructured (or what we like to call tabular and non-tabular) data doesn't mean it's going to solve your business needs. So yes, there are techniques and technologies that will do this, but the question is whether they will work for your business application, and that's a very different question that needs more in-depth exploration before we can come up with a definitive answer. So again, look into your organization, see what you're trying to get, and then look at what the vendors are offering. Next: can you provide examples of big data in healthcare that have achieved results in improving healthcare outcomes or reducing healthcare costs? Costs in particular.
One of the things we're looking at in a series of projects we're investigating right now: about half of the cancer treatments that actually occur, in the U.S. much less worldwide, are happening outside of hospital situations; they're being handled as outpatient treatments. So one of the things we've been doing is using big data technologies to go through and mine the billing information that comes from various hospital-based ERPs, Cerner and the other one, whose name escapes me right off the top of my head. We're using these technologies to go through and find more instances of cancers, and when we find those instances we can include them in our studies in a way we weren't able to before. Imagine trying to study something like cancer and having only half the data accessible to you. Once we've got these in place, we can actually go through and try to figure out what's occurring with the various drugs. When treatment occurs, instead of having to rely on a specific piece, we can now look at generalized trends with much more data. Again, it's that same thing I said earlier: more haystacks with more needles in them. But with quality statisticians and data scientists on the other side of it, we can quickly go through these and find the actual needles in the haystacks. So the theme in healthcare is moving from population-based treatment, where you say take this drug and it'll help about 40% of the patients, to exploiting somebody's knowledge of their genome and saying this drug won't help patient X, but it will help patient Y. Very powerful stuff. There's some work going on here at Virginia Commonwealth University; I'd be glad to point you to it, and you can navigate over to the Massey Cancer Center website and see some of the things we're talking about. Okay. Let's see here.
Lessons learned from big data implementation and exploration. That's a hard one. If you Google that exact phrase, you will find a lot of responses. So I'll talk from our perspective at least, which is that we see a lot of organizations who come to us and say, look, I've got an extra million dollars at the end of the year; should I invest in big data or should I invest in something else? The answer, of course, is that it depends on what your strategy is. What we really see is a lot of investment in these big data technologies that is absent a strategy. That's a very expensive way to explore. Imagine taking the car and going for a drive with no particular destination. You'll use up gasoline, you'll put wear and tear on the car, and you might encounter hazardous weather conditions. It could be a problem. Whereas if you say, I'm going to drive to the store to get some milk and return, you have a much more well-defined mission. We're seeing exactly the same thing in this big data space, where companies are buying these technologies without an idea of what to do with them. One of the conferences I was at this fall was IBM's Information on Demand conference. It's a ginormous conference: 16,000 people in Las Vegas. It's very, very hard to make sense of it, but one thing that was very clear is that IBM is very concerned that their customer base doesn't really have a good appreciation for how to use their techniques. In fact, some leaders I spoke with at that conference were saying, yes, we're going to be coming to your event in Austin, because we need to be able to talk to people, and you can't talk to 16,000 people at once, but you can certainly get some good groups together at an event of around a thousand. They were coming to us to find out what was actually happening in their trend analysis. Next question: how does an organization get started with big data, particularly when the business wants to get going and not be left behind?
I'll toss one more cost in there too that you didn't mention in your very, very good list, and that's the opportunity cost, because you don't want people doing things in big data at the expense of not doing things in another direction. Let me back up to our framework slide, which I think will be helpful; it'll take just a second to move. There we go. The idea is that if we're able to use this type of framework, we need both the foundational practices and the technical practices, whether for big data technologies or, I guess, for what the rest of us have been doing all along, which has been little or maybe medium-sized data. It's sort of hard to figure out exactly where that line is. But if we look at getting started on this, what we're really trying to do is the same cycle we've always done, which is to identify the business opportunity and see what technologies we can use to address it. We've worked with a number of organizations that have gone off and said, we really need to get into big data. We ask, why do you want that? They say, we have all these business problems that we're trying to address. When we do an examination of those business problems, what we often find is that we can solve them by improving capabilities around the foundational and technical practices. When they have a good idea of where they are weak and where they are strong with respect to those practices, instead of buying a big big-data implementation, they can go in with a small big-data implementation. There are cloud services out there, for example, where you can rent some of these components and try them out for a very, very low cost. There are places where you can run experiments that don't distract your people with another development effort and another set of technologies to learn.
Let them concentrate on what they're really interested in, which is the results. Again, what we're advocating here is a crawl, walk, and run approach that says big data techniques may or may not be part of your solution, but the question is: what is the business problem you're trying to solve, and will these things work in that area? It's also very important to have your own staff much better educated about these things, because outside of groups like Dataversity, where are they going to get it? We have very few schools, for example, that teach big data techniques outside of a computer science course that says, hey, here's how you stand up a Hadoop cluster, or here's where you should put MongoDB. The question is not do you need Hadoop or MongoDB; the question is what business problem are you trying to solve, and will Hadoop get you to a faster, better solution in that area? A very, very important distinction. All right: is cloud technology considered big data? In the sense that if you look at the technology stacks, we would like to keep them separate, but there are certainly high degrees of complementarity between the two. Amazon Web Services, the pioneering group in that area, started these things out with the express idea of being able to dynamically expand and contract: variable amounts of service and things like that. Cloud has tended to be focused on infrastructure, moving into the cloud things that are better handled by experts than by your own individual staff. So you can use cloud in conjunction with big data techniques, and you can use big data techniques without it. The one thing we can objectively define is what we mean by a big data technique. That goes back to our 1.0 version of this presentation; remember, this is the 2.0 version. You can still get that one on the Web in our archives and download it.
There we talk about the difference between von Neumann and non-von Neumann processing. The idea is that our old way of processing, the von Neumann model, was that you took the data to a CPU. Non-von Neumann processing exploits our ability to take the processing to the data, using parallelism and other non-von Neumann architectures to come up with these different results. As I said before, you're not going to want to balance your checking account with this kind of an approach, but you certainly may be able to forecast cash-flow needs for an organization on an ongoing basis, which may or may not be important depending on the business context. Next question. Where did it go? There it is: is there any language for working with big data, the way SQL works with relational databases? There are developments in that area. As you know, SQL is very popular because we've taught it to high school and university students for years and years, and there is an established base of SQL knowledge; because it is a quasi-standard, organizations can hire somebody with knowledge of SQL and get very good results. We would like there to be an equivalent, and there are some approaches starting to move in that direction, but right now SQL does not work with big data directly. You can, however, do some hybrid processing so that it helps move your piles into different areas, using SQL to put things into categories. Remember the example I gave about the transportation company about half an hour ago. Their initial pile of log data coming out of their in-cab transponders was definitely a big pile, and they looked at it and said, what can we do to approach improving mileage with this? That was the biggest problem and challenge facing them, and it was part of their data strategy to address this mileage piece. As they went through it and did the analysis, they found out that they could do it by fleet.
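The kind of SQL categorization just mentioned, grouping telematics log data by truck type to compare fuel mileage within categories, can be sketched with a tiny in-memory example. The table, column names, and figures are all hypothetical; a real fleet feed would be far larger and live in a production store.

```python
import sqlite3

# Toy example: categorizing telematics log records by truck type with plain SQL.
# Table, columns, and numbers are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE telematics (truck_type TEXT, miles REAL, gallons REAL)")
conn.executemany(
    "INSERT INTO telematics VALUES (?, ?, ?)",
    [("long_haul", 520.0, 80.0), ("long_haul", 480.0, 75.0),
     ("regional", 210.0, 30.0), ("regional", 190.0, 29.0)],
)

# Average fuel mileage per truck type: the within-category view that proved
# more informative than a single fleet-wide average.
rows = conn.execute(
    "SELECT truck_type, ROUND(SUM(miles) / SUM(gallons), 2) AS mpg "
    "FROM telematics GROUP BY truck_type ORDER BY truck_type"
).fetchall()
print(rows)
```

The same GROUP BY idea extends naturally to the next iterations described below, such as grouping by driver type or geography.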
Now they are actually going down further and finding out by driver type, and we are also seeing some geographic iterations. Travel in the northeast corridor has very different characteristics from travel from one end of Kansas to the other. These are all important variables, and it is a good example of an organization going through the crawl, walk, and run stages. They have certainly gotten beyond the crawl stage and are now learning to walk and getting ready to start running. Next is a big data question about context: how does an organization evaluate the originality and validity of the data available to it? Good question, and the idea here is that with big data it is even harder to pin down: where did this come from? Some of you may have seen that Stephen Colbert talked about a Twitter bot that responded to somebody else's tweets, so every time somebody tweeted, a machine went back and tweeted in response. Actually, I think he just reported on it; I don't think he actually made one. Anyway, the point is that all of a sudden the quality of the tweets out there is suspect. You have to go back in and ask: are we actually, in fact, grabbing legitimate data, or is somebody out there spoofing the system? We can make machines, bots, that will produce this stuff very fast and inexpensively, and in fact kids are doing it as part of their high school programming assignments these days. So consider the sources. I think the message to take away is that big data can help with evaluating the sources, and at the same time it can be problematic. All right, let's proceed. Do we know which types of devices are being used, and which devices are being used most? Outside of a specific context, what you're seeing at the moment is that smartphones are producing the really interesting pieces. As you've been reading in the newspapers with the revelations around Mr.
Snowden, for example: many of you probably aren't aware that your smartphone actually has enough information in it to uniquely identify your individual gait. So, Megan, the folks watching over our shoulders can tell whether you are carrying your smartphone, and you may or may not be comfortable with that, but it leads to some good things and some bad things. For one, we can tell whether Megan is carrying her phone, and if she is, whether we need to authenticate further than that. So one of the things we are likely to see in the near future is additional authentication mechanisms that can produce very specific information on who is carrying a device and, if somebody is carrying it, how they're using that device to access systems. Let's take a couple more questions. One is: what is machine learning, and how do we use it in big data analysis? That question sounds like one of the things we ask PhD students to work on; I have a colleague working on exactly that right now as part of his PhD thesis. Machine learning is the idea that algorithms can learn: we start off with a certain set of assumptions, and then we go forward and see if we can improve the accuracy of those functions. It's typically done via a data mining start, then coupled with something else, such as additional sensor data or other inputs that come in, and then we make it better. The algorithm eventually learns to predict, for example, milk prices depending on temperature, or the number of chickens on a farm based on the weather, et cetera. These become more and more useful as we move forward, because the system gets smarter in the process. Google "machine learning" and you'll see a couple of good textbooks written by our colleagues here at VCU that dive into this topic in a lot more detail than we can get into here.
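The "start with assumptions, then refine as observations arrive" idea can be sketched with a tiny online learner. The milk-price-versus-temperature relationship below is entirely made up for illustration (as is the synthetic data generator); the sketch just shows the mechanism by which the model gets smarter with each new observation:

```python
import random

random.seed(0)

# Purely illustrative synthetic data: pretend milk price rises with temperature.
# True (hidden) relationship: price = 2.0 + 0.05 * temp, plus a little noise.
def observe():
    temp = random.uniform(0.0, 30.0)
    price = 2.0 + 0.05 * temp + random.gauss(0.0, 0.05)
    return temp, price

# A tiny online learner: start with an assumed model (here, all zeros),
# then nudge it toward each new observation -- the "system gets smarter" idea.
a, b = 0.0, 0.0             # initial assumptions: slope and intercept
lr = 0.001                  # learning rate (size of each nudge)
for _ in range(20000):
    x, y = observe()
    err = (a * x + b) - y   # prediction error on this observation
    a -= lr * err * x       # gradient step for the slope
    b -= lr * err           # gradient step for the intercept

prediction = a * 20.0 + b   # predicted price at 20 degrees (true value ~3.0)
```

This is stochastic gradient descent on a one-variable linear model, about the simplest member of the machine learning family; real systems differ in scale and model complexity, not in this basic learn-from-each-observation loop.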
Do you have any recommended readings for big data techniques? There are a number of references here, but the book I would start with is the first one on this list, The Human Face of Big Data. It's a big book, physically a large book, but it gives a lot of really good examples. Then I would also go to the McKinsey reports, because they're available online; McKinsey is very happy for you to read them. Those would be excellent starting places. After that it becomes a little sketchier, in that the articles tend to focus on specific niches. One of the things we're waiting on is that the folks who did the Summer Olympics are supposedly compiling a guide to be used to prepare for the Winter Olympics, which, you know, is getting ready to start, and it would be really interesting to have a copy of that. I have a former student who's working on that project, and has for the last five Winter Olympics, and I'm hoping he's going to slip us a copy so we can learn what they learned by using big data in the Olympics. Next question: do you think we'll replace SQL tools and languages? SQL is, of course, a unified way of accessing relational databases. One of the reasons it is as popular as it is is that it's relatively easy to teach, because it is built on relational calculus and set theory. Until we have some similar constructs around big data, I think it's going to be difficult to come up with a similar universal approach. I guarantee you there are a lot of smart scientists working on it right now; we have a need for it, and it's something that all businesses need. I think there will be some approaches, and I've seen some preliminary reports indicating that some of the results are in fact promising but aren't yet on the market. Is anybody taking advantage of big data? Well, judging from our little survey here, most are not just yet.
But of those that are exploring it, there have been some very expensive lessons: organizations that over-invested in it and discovered that the sort of soft data they got out of it was not terribly helpful. On the other hand, there are a lot of companies who are using it to tremendous advantage. Again, I'm going to go back to the five industries I referred you to before. If you're in retail and you're not looking at big data, shame on you at this point. If you want to explore big data internally, look around at your partners, at people who are doing it. You'll probably find somebody who's exploring it; offer to help them out, and maybe swap them some data in exchange for some results. The organizations that are getting the best results are not getting them purely from big data, but are in fact using big data to complement their existing BI and analytics initiatives. Again, we're seeing a lot of activity in that area. One example we can talk about is Google Flu; I think most people are familiar with it. Google had the idea that they could out-predict the Centers for Disease Control in forecasting flu outbreaks, based on people searching for things like Kleenex, sneezing, and cold symptoms. A very nice effort, and it came along very well. But then they over-corrected: they predicted a much higher volume of flu because they had tweaked the algorithm in anticipation, and when they compared the results of the algorithm to reality, it didn't work out nearly as well for them. So what we're looking at, again, is this blended approach: take the traditional ways and see how big data techniques can augment the things that we're already doing well.
Answer some questions that we're not already addressing, instead of putting a bunch of people in a room and saying, go off and figure out what you can figure out, because they'll probably spend half their time duplicating your existing efforts, and that's not a very good value proposition for most organizations. Next question: how do you build business requirements for big data solutions? The same place you build the requirements for any solution: from your IT strategy, which comes from your organizational strategy. Now, that said, to give you some more specificity around that, what you want to ask is: what are our current capabilities? What strengths and weaknesses do we have with respect to our existing data? Are we answering the questions about our customers that we want to answer? Are we looking at our internal operations and seeing whether we can reduce costs, streamline efforts, or smooth things out in one way or another? Then take those questions in context and ask: can big data techniques help to address some of them? I doubt seriously that any big data technique will address them by itself, but in conjunction with what you already have, these things can become very, very useful. The thing to think about is a symbiotic relationship; you may be familiar with the concept, where two things working in concert feed on each other. It's a virtuous cycle, if you will: more of this gives you more of that, and more of that gives you the ability to go back and do more of this. Again, the cycle I gave you relative to that transportation company is a good one. Okay, let's use big data techniques to try to reduce the fleet's mileage.
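That crawl, walk, run cycle, where each analysis pass suggests a finer segmentation for the next, can be sketched with a few hypothetical trip records (all the fleet classes, regions, drivers, and numbers here are invented for illustration):

```python
from collections import defaultdict

# Hypothetical trip records: (fleet_class, region, driver, mpg).
trips = [
    ("tractor", "east", "driver_1", 6.0),
    ("tractor", "east", "driver_2", 5.4),
    ("tractor", "west", "driver_1", 6.8),
    ("van",     "east", "driver_3", 11.0),
    ("van",     "west", "driver_3", 12.4),
]

def average_mpg(records, key):
    """Group records by `key` and average the mileage within each group."""
    groups = defaultdict(list)
    for trip in records:
        groups[key(trip)].append(trip[3])
    return {k: sum(v) / len(v) for k, v in groups.items()}

# Crawl: one number for the whole fleet.
overall = average_mpg(trips, key=lambda t: "all")
# Walk: break it down by fleet class.
by_class = average_mpg(trips, key=lambda t: t[0])
# Run: finer still -- class, region, and driver together.
by_driver = average_mpg(trips, key=lambda t: (t[0], t[1], t[2]))
```

Each pass is just an ordinary grouping; the value comes from the iteration, where what you learn at one grain tells you which finer grain is worth examining next.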
That works well to a certain level, but if we fine-tune the analysis based on our existing data management practices and say, let's look not at all of our fleet but at parts of our fleet, or categories or classes of our fleet, then we can get better results. And in some parts of the country it's geographically advantageous to look at specific drivers, who in fact perform better in certain regions: this driver should be restricted to East Coast driving to make them more effective, or maybe to long haul out in the West. We've had some more questions come in, but I think that is pretty much it for now. Thank you, everyone, for participating in today's event. We hope you have enjoyed it. Thanks again to Data Diversity and Shannon for hosting us. Once again, you will receive today's materials within the next two business days. Our webinar next month will be Data-Centric Strategy and Roadmap; hopefully you'll attend that as well. As always, feel free to contact us if you have any questions. Have a great day. Good questions! I love it when everyone gets involved. Thanks, Peter, for another fantastic presentation. It was amazing, a great way to start the year. Thanks to our attendees, who are so interactive and make the session valuable with their questions, bringing the education out there to another level. We hope to see you at Enterprise Data World 2014 in Austin, as Peter mentioned. Hope everyone has a great day. Thanks for your time. Bye.