theCUBE at EMC World 2014 is brought to you by EMC, Redefine. VCE, innovating the world's first converged infrastructure solution for private cloud computing. Brocade, say goodbye to the status quo and hello to Brocade.
Welcome back, Jeff Frick here on theCUBE, day three from EMC World 2014. We launched theCUBE at EMC World 2010, and it's our fifth year in a row coming back. We're so excited, and we have such great demand, that we have not one but two CUBEs, so we can get twice the number of guests on and bring you more insight, more of what's going on. I'm joined in my next segment by a CUBE favorite, Bill Schmarzo, Dean of Big Data. I don't even know what his official title is; I'm sure it's up on the lower third. I've had Bill on a number of times, and Bill launched his book on theCUBE at Big Data NYC last year. But Bill spends more time on the road than most Accenture partners. I mean, the guy is always on the road, but the great thing is he's talking to customers. So Bill, welcome back to theCUBE.
Thanks, Jeff, thank you.
So what are you seeing out in the field in all your various travels?
So it's fun to be in the field talking to customers all the time, and there is an increased belief that big data can actually change my business. The conversation is really starting to shift from one that's focused on technology and led by the CIO to a discussion that's led by the business, with the help of the CIO, and focused on: how do I use big data to transform my business? How do I use big data to make me more money, that sort of stuff.
I like that, how to make more money. So obviously it's a journey. Probably a lot of people don't even know where the data is, much less how to get it in and work with it, or how to bring in third-party sources of data. So what are you seeing in terms of people having success? Where are they starting? Where's the low-hanging fruit? How should they proceed?
So that's a great question, Jeff, because I think for most organizations the low-hanging fruit starts with: what are my key business processes? For many organizations, their existing data warehouse and BI implementations are actually a really good starting point, because the BI and data warehouse work has defined your key business processes. It has identified what metrics you're trying to measure and what data you're using, it's built reports and dashboards, and it's done some of the transformation work for you. So it's actually a really good starting point. And then we find that the low-hanging fruit for most organizations is their existing operational data. It's the dark data in the organization that they have purposefully never had access to. If you think about it, a data warehouse environment is really expensive. So what most organizations do is actually minimize data, right? You don't put all the data in a data warehouse; you minimize it, because you don't want to buy more boxes. So you keep 13 months of aggregated, summarized data in your data warehouse, when all the insights are in the 15 to 20 years of detailed customer transactions that you have but probably have stuck on mag tape somewhere. On mag tape, it might as well be chiseled in stone.
So do they see enough value to, one, collect more, and two, not summarize it as much but keep it more in its raw form?
Yes. We have to take a lot of our customers through an envisioning process to kind of slap them upside the head and say, okay, now take off those blinders you have that say, oh, I can't have all the data. We run these envisioning exercises to help them understand: here's what you could do if you had all this data. Here are the kinds of questions you could ask that you couldn't even ask in the past, and here are the kinds of answers you can get that are much more insightful and detailed than the ones you had before.
When customers go through that, you can literally see the light bulbs go off in the process, and people are going, wow, I didn't know I could do that. Well, if I can do that, can I do this too?
Right.
And when they start that what-if-ing process, you know they've made that key leap.
So I wonder if you can share any specific examples that you've seen out there of customers and some of the initial successes that they've had.
Well, I'll tell you, my favorite one is this school district project we're doing at the very southern tip of Texas, where they want to create what they're calling Netflix for teachers. The problem they have is a real hard problem with teacher retention. It's way south in Texas, it's kind of an underprivileged area, and so teacher retention is a real issue. They find that their teachers are spending between 45 and 90 minutes a day on administrative tasks, not teaching. So they want to build this thing they call Netflix for teachers: when a teacher comes in in the morning and opens up their tablet, it shows them which students are struggling, it has recommendations as far as what lessons they should go through, and it identifies a student who's having behavioral problems and might need an intervention. It'll even help a principal identify a teacher who's struggling, who may be teaching the wrong class or dealing with too many students. The idea is that you deliver those recommendations, they can pick and choose the ones they want, and you take that administrative load down from 90 minutes to 10 or 15 minutes and let them spend their time teaching.
Now is that in process or has that been delivered?
It's in process right now.
So we've done the envisioning exercise. The one thing you'll probably get a kick out of is that our data science team was looking at all these different data sources. We had all this lesson data, because all their lessons are now done on computers, so we can pull off all the logs and analyze things, and we have their course and test results. And one of the data scientists said, I wonder, what if? I love that, what if. What is the impact of the price of a parent's home, and the change in that price, on the student's performance in class? If the housing value has gone up, do we see a corresponding change in the student's behavior? So we brought in Zillow data, which not only tells you the price of the house but also shows you the changes month by month, and we're trying to see whether there is a correlation or not. Now, we haven't found anything yet, but I love that kind of thinking that says: what if, let's try this data, let's grab it, drop it in here, play with it. Found nothing? Okay, found nothing, move on to the next one. I love that kind of exploration attitude.
And you went through a whole bunch of things that you're looking at. Were most of those data sets already someplace, and they just weren't tapped into?
Yeah, most organizations have data buried throughout the organization, classically in silos, and in many cases they're not even sure how much detailed data they have. Even the IT group itself struggles with where all its data is, and the business people by this time have given up even trying to ask for it. That's probably the biggest challenge the business users face: in some cases they've given up asking the questions.
Now is that because of what gave rise to shadow IT, that the IT guys were just so busy or the delivery was so slow that they've been frustrated by it? Or is it that no one could really answer the question because no one had really thought about the data in that way before?
I actually think, Jeff, it's both.
I think part of it is that building a classic data warehouse, where you have to define your schema first and then load data into it, has always been a long, labor-intensive process. It's not unusual that when somebody wants to add a new data source, IT says, well, that'll be six to nine months. Well, holy cow, the opportunity may be gone in six to nine months; it may be gone in six to nine days. So the traditional data warehouse approach of having to build that schema first and then try to retrofit data into it has made it very hard for people to bring more data in. IT groups are doing the best they can with the technology they had, but it's kind of like making a dog walk on its hind legs. It's impressive that they can make these damn things work, but I don't want that dog bringing my beer to me.
So one of the things you mentioned earlier was, again, the mindset of having basically as much raw data as you can versus having aggregated, summarized data for a short period of time because of the cost. Are people doing some type of ROI analysis to justify now collecting all that data? Are they putting that data in a less expensive place? How are they managing the business expense on the front end of that process to then eventually unlock the insights on the back end?
Well, what I think is the most interesting thing about big data is the economics of big data: the cost to store, manage, and analyze data is 20 to 50x cheaper today than it was even two years ago. I have a friend.
20 to 50x in two years.
Let me give you an example. I have a friend who's a senior vice president of analytics at Travelers Life Insurance in Hartford. He shared an analysis with me, and he said, for the same cost as four terabytes of data on an enterprise data warehouse, I can get 200 terabytes on Hadoop. When you have that 50x magnitude of change, it enables you to think differently about how you treat data.
It's like data stops being a cost to be avoided. Data becomes an asset to be hoarded and gathered and harvested. So there has to be that kind of mental shift, and the economics are allowing us to make that mental shift.
And how much of this is economics, and how much of it is competitive pressure from other companies within the same realm? Or has that not really gotten enough traction yet?
I don't think there's as much competitive pressure as there is a person in the organization who has this feeling. I won't even say a vision. They have a feeling that we could do more with our data. We have this data, we've been spending all these years collecting it, and we're just not getting great value out of it. Many times somebody in the CIO's organization is saying, God, for the longest time we've been getting whipped up on. We have all this data. It's time for us now to be in a leadership position, become a strategic business partner, and help our business cohorts leverage that data to get more value out of it. So I don't think it's competitive as much as it's somebody in the organization who's starting to realize, wow, we can do something different.
Right. We talk a lot about CIOs starting to take more of a business role in the organization, not just as maintainers of what was there in the past, but really thinking about adding value beyond that maintenance role and bringing strategic vision and execution and competitive leverage. So do you think big data is one of the better ways for them to take on that additional responsibility?
It's the perfect opportunity for CIOs who are willing to step outside their normal roles, who aren't afraid to fail, to step forward and say, we can help lead this initiative. Now, they've got to have somebody in the business who's willing to walk arm in arm with them. And in most cases the business wants this, is dying for it. They just want somebody on the CIO side to say, I'm in, I'm all in on this thing. We're going to make it happen.
And who can build that collaboration between those two groups, so that the big data discussion is not a technology discussion but a business transformation discussion.
Great, so let's shift gears to the people side. We've talked a little about the CIO side, but it often comes up that there's a shortage of data scientists, and that to really unlock the value of this you need the person who's not a data scientist, who you give a little taste of something to and he goes, ah, if we could do this, can we do that? So what do you see in terms of trends of non-data scientists being able to execute new things with big data and gain insight?
So the tools that are available today for data scientists are really still very specialized. It's not like the average person is going to be able to pick up and write Python MapReduce jobs on a Hadoop environment. It's just not there. And the people who have been trained in BI for a long time have had to sit on the sidelines, but now we're seeing things like HAWQ that allow people to write SQL against data stored on HDFS. So, because of the technology, we're slowly starting to bring in the SQL people who can issue queries. However, the data scientists who have the ability to be curious, to bring in data sources, to write the Python and Ruby on Rails and Java code that brings data in and literally builds the schema on query, that's still a very rare breed. The tools haven't helped that role yet, no, but education has. EMC has an education program where we're training people, and we're working closely with a number of different universities to share curricula. We see an opportunity to create more data scientists. If we're smart, we tell our kids to become data scientists. They'll have jobs forever.
Yes, they will for sure. So that's interesting. So we're here at EMC World 2014. What do you think we're going to be talking about a year from now? What do you think is the next big move?
So I think, first off, the data lake discussion is going to stop being a preliminary discussion about, I wonder if we should do that. We're going to see company after company who've already jumped into the lake.
Lake or ocean, this is a big argument. Furrier hates the lake.
Of course he does. It's an ocean. Well, I just wrote a blog about the data lake, and I titled it something like How I Learned to Stop Worrying and Learned to Love the Data Lake. It's a takeoff on Dr. Strangelove. Actually, after talking to a lot of different customers, I think the data lake is going to be a great enabler. It's going through its overhyped stage right now, but a year from now we're going to see a lot of organizations who have implemented the data lake and are running not only their analytics on top of it but, in some cases, have actually moved some of their data warehouse capability to it as well. So I think the data lake is going to be a very big enabler. I think next year you're going to start hearing a lot more about the third platform as a delivery vehicle for the insights I'm teasing out of my data lake: the third platform being able to deliver insights to my frontline employees so that they're more effective in working with customers, more effective in the stores, or as the mechanic in the shop. I think we're going to see the third platform take on a growing role. And by the way, not only the third platform but also the role of the user experience person. I think we're going to start seeing more discussion next year about the importance of that role.
Awesome. And just before we close, I want you to give a little plug for the book. So Bill wrote the book. Like I say, not only is he the Dean of Big Data, he wrote the book on big data. I think they're giving away a few copies here at the show, and you can get it on Amazon. But I think the story of why you wrote the book is interesting.
And I wonder if you can just run us through that real quickly.
Well, EMC wanted me to write a book on big data and got me in touch with Wiley, and Wiley made things easier. I had a blog. But the real impetus for doing the book was that I was sitting around the kitchen table with my kids and my wife, and I said I was thinking about writing this book, but it's a lot of work. I had talked to my friend Ralph Kimball, and Ralph says, you don't make any money writing a book. So I said, I wouldn't really be doing it for financial gain. And I think it was Amelia who said, Dad, why don't you donate the money to charity? I said, well, which one, Amelia? And she goes, well, Grandma died from breast cancer. Why don't you donate it to breast cancer research? So the proceeds from the book I donate to breast cancer research, to give back. I've been very lucky, right? My whole career has been full of Forrest Gump moments where I've been in the right place at the right time, where people have freely shared with me and given me opportunities. I just want to give opportunities back.
Awesome. So Bill, thanks for coming on theCUBE. As always, we go out to the events, we extract the signal from the noise, and as we say, we get the smartest people in the room we can find. We invite them on theCUBE and ask them the questions you'd like to ask them. So Bill Schmarzo, thanks for coming on. We're at EMC World 2014, day three of wall-to-wall coverage on theCUBE, not one but two, so we can give you double the pleasure, double the insight, with guys like Bill coming on. We will be back with our next segment after this short break.