Live from Midtown Manhattan, the Cube's live coverage of Big Data NYC, a SiliconANGLE Wikibon production made possible by Hortonworks, We Do Hadoop, and WANdisco, Hadoop Made Invincible. And now your co-hosts, John Furrier and Dave Vellante. Hello everyone, we're live here in New York City for Big Data NYC. This is the event where we're covering what's going on in the big data world in New York City, also covering Hadoop World and Strata Conference going on. And we have a special presentation this week. We've done a lot of things, we've launched companies on the Cube this week, we've done product announcements, we've talked to thought leaders, but now we're doing the book signing. So we're pleased to, John and Dave are here with one of our favorite guests, the Dean of Big Data, Bill Schmarzo with EMC, who wrote his first book, Big Data: Understanding How Data Powers Big Business. Bill, welcome back, Cube alumni, Dean of Big Data, I think we called you that at Strata Conference three years ago. Yes, yes. You're the ones that coined that term. Tell us the story, why the book? So the book really, the inspiration for the book really started three years ago, John, when I did that Strata Conference, I did a session at that conference that was called the Big Data MBA. And the aspiration or the goal of that session was to really help the technology people understand how the business should be thinking about employing data. So it took a very business-centric approach towards understanding the value drivers and key business initiatives and how big data could impact that. So I did that session. I think John and you and Dave were the guys who coined the term the Dean of Big Data, right? So from that point forward, whenever I started to write blogs, I had a framework in mind of what I wanted to tell people. In fact, I was still working on a class around the Big Data MBA. I'm going to talk to the University of San Francisco about actually piloting the class.
But it all kind of came together. The, you know, the Wiley folks reached out to me courtesy of EMC. We had a lot of material from talking to customers and from writing my blog and such. And it just all came together. Well, you're a tech athlete, I'll be saying. So I would love to get you, while we're live, to sign the books, you know, get well soon, John, you know, give me a little John Hancock on there. We're excited to go ahead and sign it up there. Dave, Dave, we're inspiring books now. I mean, theCUBE is the open source content model. We love it. Thanks for doing this, Dave. We've been having so much fun for four years now. We've been at Hadoop World. Well, it's just awesome seeing, you got to get me too, Bill. It's fantastic, John, just seeing the way the whole space has developed. Bill, I remember it wasn't that long ago. Maybe two years ago. I'm not even sure it was that long. But two years ago, it was like an EMC CIO conference and you did a breakout on big data. And it was really interesting. It was a CIO audience. There was a lot of trepidation, you know. A lot of, very much so, a lack of knowledge about big data. You were sort of doing the Big Data 101 at the time. And now the compression of knowledge has been so rapid. It's quite amazing. Yeah, to me, what's interesting is that the technology sands underneath the big data discussion are still morphing quite feverishly. There's new enhancements to existing technologies. Hadoop versions and all the Mahout and MapReduce and there's all kinds of new products coming out. While those sands are going through that metamorphosis, the business sides are starting to realize that there is real value in the data and then the technology that teases the insights out of that data. Especially if you target the organization's key business initiatives.
If you understand what it is the organization is trying to achieve, whether it's acquiring customers or retaining customers or driving on-time delivery pickups or improving hospital readmissions, if you think about it from a business perspective, it really puts a frame around what data do I need to improve that business process or that business initiative, and then what technologies do I need to tease out all those insights. And when you take a business-like approach to this, it really takes this almost mind-numbing amount of technology and greatly simplifies it. You know, one of the inspirations when we started Wikibon was Don Tapscott's book Wikinomics, and the whole premise of the book was, you put it out there and people will find things that you didn't realize and good things will happen, and it sort of worked out that way. And I've noticed in your book there's a great line here in Chapter 5: one interesting aspect of big data is how it is challenging the conventional thinking regarding how the non-analytical business user should be using analytics. To me, that's the huge opportunity here that we're not even beginning to scratch the surface of. If you could talk about what your experience is in terms of putting analytic tools in front of the non-analytic, non-typical BI users, which has always been the promise of BI, which never happened. Is it going to happen here? Well, Dave, you hit on something that's probably my single most biggest passion about big data, which is if we create all these great analytics, but we don't surface them, present them in a way that's actionable to the end users, why bother? Why bother doing it? The ability to take and to conduct and do all that high value analytics munging through all the data and then how you surface it up to users changes dramatically. So the average user is not an analyst. If you're talking to a brand manager or we're dealing with a grocery chain talking about how we empower the store managers.
Well, they're not analysts. They don't want charts and tables and Excel things and BI tools. They want to know what's working and what's not working. They want to know, should I buy more Tide or not? Hey, there's a football game next week, Stanford versus Cal. You better have more beer on hand. And by the way, Stanford and Cal fans don't drink Budweiser. So you better make sure you have a higher level of beer in there. So they want to be able to have the technology mine the data, tell them what's going on in the data, and then make recommendations that they can act on. So you are in the services organization at EMC. So how are you spending your time these days? Are you consulting? Are you helping build solutions? So I have by far the best job inside EMC, which is I get to spend all of my time with customers. And what I'm doing with the customers is really working with them trying to figure out where and how do we start. What business problems are we going after? What data do we have available to us both internally and externally that helps us to drive that process better? And what technologies do I need to empower that? So I'm able to talk to customers, and I met with the CEO and his staff of a large theater chain, right? Trying to figure out how they get more people, more butts in seats during off times, right? We had scheduled a half hour meeting with the staff. Half hour in, he tells everybody, call your next hour and a half worth of meetings and cancel. We're going to spend two hours here. And we were up on a whiteboard talking about how do we leverage this data? How do we leverage a mobile app? How do we do this? And it's a great job because I get a chance to see firsthand people trying to tackle really difficult business problems, leveraging all the capabilities that big data has now brought to us in the last couple of years.
It's interesting, you mentioned in your book about it's for big business and people using business, you mentioned kind of the MBA kind of approach. I mean, this is the kind of discussion, value chain. You mentioned, you know, a lot of MBA concepts in the book. It sounds like customers are pretty intoxicated by the possibilities. So the question I have for you is are they seeing the use cases? Is it more of just kind of, you know, exciting them a little bit more? Are they seeing the use cases? Because that has come up as the key issue right now. Validation for big data is there. Now it's okay, what use cases? Is it picking shoes at this point? Are they trying to identify it? So I think what we're seeing is that the organizations are seeing the use cases. They're reading about them in the press. They're seeing them at conferences. But they want to know what it means to them. How do I make this relevant for my organization? And so while, you know, the stories about Facebook and some large banks or some large insurance company may be interesting, for them it's not concrete. Google can do it. Well, they like to say, we're not Google, right? We're not Facebook. They want to make sure it's relevant for them. I like to coin this term. I think it's in the book. I call it the 4M's of big data. Make me more money. That's what they want to do. There's a point in the discussion, and that's where it starts to become interesting, where you take a part of the business, a particular business initiative they're trying to accomplish, and you build that out, you mock it up, you build a lab around it, you show them ramifications, you show them the business case, you show them the ROI, you show them the lift. That's when they start getting excited, because now you're talking about them. And so I think the issue they've got with many vendors is that many vendors talk about the vendors.
And so I think when you get into a dialogue with a customer, you're talking about them, what's interesting to them, what's most important to them, and you can show how big data can make that come to life. That's when things get really exciting. They get all jazzed and you start to see some of the organizational boundaries start to melt aside. I want to ask you about the organizational issues. We were at the Cube party last night, saw you there. One of the surprise guests that walked in was MIT Data Quality Conference. It was a symposium for Chief Data Officers. And the premise of that event, Bill, was really that, if you want to be a data-driven organization, you've got to have a Chief Data Officer. And that individual needs to be separate from IT. Now it was a controversial concept. Now in many industries, particularly financial services, government, maybe healthcare, that are highly regulated, you're seeing that take place largely from a governance push. But not so much from a make-me-more-money push. So I'm wondering your thoughts on that whole Chief Data Officer role. Should there be a data czar? Should it be included? Is the CIO the data czar? Should it be separate or integrated? So is this a setup for the book? Totally. Actually, I have a chapter here. I talk about the Chief Data Officer. I figured you must have, but I really want you to weigh in on this. I think the Chief Data Officer, I think you're spot on. It's separate from IT. In fact, my recommendation is it's an economics major. Who understands what the data is worth? If I'm going to go out and try to acquire data, how do I put value around what that data is worth to me? And then also an economics person can look at it from a risk and compliance perspective. What are the costs associated with not being in compliance? So I totally agree that organizations need to have this Chief Data Officer who's got this economics background and sort of this hunger for how do I bring more data in?
Likewise, I eventually think you're going to see what I'm going to call a Chief Analytics Officer, which is somebody who is going through the organization uncovering all this analytics IP that organizations have and even, you know, corralling it, inventorying it, versioning it, and maybe even putting legal bounds around it. So, you know, somebody with a law degree might be a better, a really good Chief Analytics Officer who's trying to take that IP that you developed and actually put legal protection around it. So the swim lanes of this new organization are kind of funky because you've got data science, you've got your Chief Analytics Officer, you've got legal, you've got governance, you've got information management. Do those all sort of report under the CDO? Is there a mix? Is there a matrix? I mean, I know it depends, but it's a complicated situation. And you've also missed what I think is, what we mentioned earlier, one of the most key roles, which is my user experience. I'm not sure I have a user experience officer. Maybe you do. Organizations should. Most organizations, you know, the good ones are really few and far between because they don't really think about it from a user experience perspective. They think about it from a corporate messaging perspective. So, yeah, I'll add to that mix by saying the privacy, security. So you need to have a user experience. What the hell is the CIO going to do? Just make me more money, I guess. No, I do think, but what you're hitting spot on is that you're going to have all these new roles. I don't know if they report to the CFO, they report to the COO, I'm not sure where they go and we'll learn as we think. But I do think those roles, no matter where they report, those roles are critical. A person who's day in and day out focused and they wake up in the morning and go to bed at night and they're thinking about data. How do I get more of it? How do I protect it? How do I get value to it?
Somebody on the analytics side trying to figure out how do I leverage all this analytics IP I have? How do I patent it? How do I maybe even resell it and monetize it? And this user experience person who has a passion for the user can say, we capture all this data about our customers, but are we acting on it in a way that's smart, that is in compliance with the privacy sort of thing? So these new roles I think are going to pop up, I just don't know where to start. And it is somewhat of an organizational do-over, to use a phrase of my friend Paul Gillin. If you just pave the cow path and roll data in, it's not going to be as effective as if you really think about how to best leverage these new roles. Oh, I think you're spot on. In fact, I think that's a key point, because today we treat data as a cost element. I mean, data warehousing technologies are so darn expensive that we've gone through a process of trying to minimize how much data we have. So if you pave the cow path, we'll never bring data in, right? So what we want to do is we want to get to that point where we've got that organizational reset where you're bringing in data as an asset and you have somebody who's really focused in on making that successful. So not long ago, you know, data was something that had to be managed: reduce data, compress data, back it up, it's just insurance. Chuck Hollis wrote about this. He said the bit has flipped, you know, from that to make more money. So let's talk about the crowd chat. I love this crowd chat. Dave knows all I do is talk about crowd chat these days. But we're getting some great interaction on crowdchat.net slash Stratocommerce. But it comes up that the concept of data artistry is interesting, right? You mentioned chief user experience officer. Are you seeing customers thinking about the art of big data, in the sense of, this certainly is science. We're going to get data science. You know, we had a quote yesterday on our crowd chat around data science.
There's about 200,000 data scientists in the world. And that number is certainly going to grow. But there's over 2 million analysts that aren't like the Python-writing data science geeks. But those are future data scientists in training. So over 2 million people that will be kind of, quote, the new data scientists. But for normal people, right, users, you mentioned, is there a chief user experience officer opportunity around the art of big data? What have you heard from customers? Are they at that level of walking erect, if you will, in the spirit of, you know, evolution of big data? Are they still kind of, you know, still in the early days of just learning to walk a little bit? How far, where are we on that whole spectrum? That data artist concept is pretty cool. It's a cool idea. And I'm not sure it's a role as much as a characteristic. I mean, I would want my user person or team to have sort of an artist aspect. And when I think about artists, I think about somebody who is trying to, you know, carve out an experience that's very rewarding, interesting, natural and actionable. I also can see, you know, there's artistry in the data science, that the chief data officer is trying to be creative and paint a picture with different data sources. You know, do I bring in local event data by screen scraping Eventbrite? Do I bring in, you know, Zillow housing price data in order to get a better feel for the wealth effect that my customers might be having? So you kind of have that artist aspect, and it's kind of like experimentation, in my opinion, but it's got a finer edge to it. It's not just experimenting for experimentation purposes. Experimenting because you want to change how things look and feel. What are your thoughts on this big data? You know, this is coming back around because it's becoming quite obvious that, I mean, we had Abhi Mehta on yesterday and he was talking about some of the things that his firm's working on. One of the big areas they attacked was risk.
We probed down a little bit, talking about risk, credit risk. Okay, well, I can very, you know, pretty accurately start to infer race, religion, sex, you know, based on a number of things, what people are listening to and what they're buying, etc. And there really are no privacy guidelines around that. So it's like the Wild Wild West, people are going to start collecting all this data. It's almost like the music's going to stop and whoever has the most data ends up the next Acxiom. So what's your take on what's going to happen there? What is the state of privacy? How much are clients actually thinking about that, either exploiting it or being careful about it? And where do you see it all going from a, you know, government standpoint? Well, I think privacy is on the forefront of many organizations, especially large organizations. The Fair Credit Reporting Act, as you alluded to there, was very tricky about, you know, you can't make credit decisions and credit extension decisions based on, you know, race and sexual gender and things like that. But you're exactly right. I can interpret a lot of that. I can score that. I can score that. I can be fair and accurate about that. So I think organizations are actually really concerned about that and I actually think that we could very well see yet another role around decision governance. That is, just because you know something, do you act on it? And my favorite example in decision governance is the Target story, right? Where Target had scored that this one person was a female and was pregnant, and they were showing her ads for diapers and baby cribs, and the father saw this and freaked out and went to his local Target store and read them the riot act and then found out two weeks later she was pregnant, right?
So, and a lot of data scientists are pretty proud of that story because they knew before the father. I think that's a horrible story, because I think there's a point here where just because you know something doesn't mean you should act on it, and I've got to believe that if Target could figure out that was a girl who was pregnant, they probably had a pretty good idea what her age was and they probably knew that she was underage, right? They had a score that probably said her age is in this range here. So I think you're going to see this emergence of the decision governance type role. It's going to look at privacy issues. It's going to look at the data they're uncovering and they're going to try to figure out and they're going to make a decision at a corporate level whether I want to act on that data or not. Again, just because I know something doesn't mean I should act on it. And many times I don't want that decision of action left to my campaign manager or my webmaster, who may not have the same kinds of privacy concerns that my chief officers are going to have. What you're calling this? You think it's going to be self-policing or you think the big hand of government is going to have to come down and adjudicate? I think the way the government is playing right now, I think government wants to get involved in everything. So I think the government will, the problem with the government is they're always a decade behind. I think they're still trying to figure out how to regulate faxes right now. So I just don't know if the government will be able to, it can't move as fast as technology. So I think that what's going to happen is a critical mass of organizations are going to have to come together and put together some rules that both protect the user and still allow the business to do what they want. If they don't, the government's going to come in and they're going to come in with recommendations that are a decade old.
So I want to follow up on that because, you know, we all know government's not one government, there's like a zillion governments, and so the government's a decade behind in so many aspects, but one that they seem not to be a decade behind on is big data. You look at the NSA and see what's happening there. And open data initiatives underway. That's wonderful, right? The data.gov stuff. I agree. And a lot of that intellectual property is starting to seep out into the commercial world, right? You see the guys at Sqrrl spin off from the NSA. So, you know, it's another example perhaps of the government funding, you know, innovation, right? You saw it with nuclear, you've seen it, you know, the internet, you know, the solar, right? I actually think, you know, I'm actually really excited about the open data initiative, you know, and what the government is doing as far as making all this data available. I can be honest with you, our data scientists use that data all the time. We're constantly bringing that data into our projects. We're trying to figure out if we can tease insights out of it, and it's wonderful that they've gone and sort of open sourced all this different data, and so, you know, kudos to the government for doing that. As you said, there's parts of the government, that part of the government is getting it right. The people who regulate tend to struggle. So we have another comment on Twitter here I want to run by you, because this is a good comment from a thought leader out there. He says, @furrier, I find that getting started by asking one question then will answer many more to follow. Math is just a part of it. I would claim logic more. So the conversation that's developing is, you know, math is great but it's the insight, it's the art of the big data, so that's kind of the use case kind of conversation. So his point is it's iterative.
If you're going through it, do you find that you agree with that statement, and what's your experience with customers? Is it like, let's tackle one thing that opens up a window into more, or is it more throw a bunch of stuff on the table? I think that's a great point. I do think it's very iterative. The one thing that we have found is that every company I've dealt with knows what questions they're trying to ask and answer. What they don't understand is how do all these new data sources and technology enable me to ask and answer that question either at a different level of granularity or a different level of frequency. What happens when people start getting their questions answered, and we kind of learned this in the BI space, is they ask more and more. So one of the things in the big data space is we see this iteration process happening more quickly. People are starting to see the answers and they're asking the next level of questions and the next level of questions, and it is very iterative and it is a fairly logical flow, that they will ask the first level of questions and they'll keep drilling down. They want to know why, what happened here, etc. So I agree with that point. I think it is a very iterative process and it's part of that. We also tell organizations don't worry about finding the business problem to go after. Find a business problem. Don't pick the business problem, pick a business problem, and just get started, because innovation and creativity is contagious. We run these vision workshop processes and I'm always amazed, once we start this process going, how somebody who has been sitting back, has been quiet and really doesn't seem to be too engaged, they're the one coming up with the best idea. And the group just sort of plays off each other. Like the bottom of the lineup in a baseball game. That's the same.
We were at the GE event and one of the things is anyone can be Billy Beane, to use the Moneyball example, which in our industry is a little bit overplayed, but to the average person, oh, that makes sense. Moneyball is a common thread and I want to ask you that because that's a key value, that anyone in the organization can be an analyst without being an analyst. It's a unique perspective and can contribute. So that's a key point. I also want to go back to what you said earlier, which is getting a lot of traction on Twitter. The 4M's of big data, make me more money. Which is clever. It's a great sound bite. Always good to have that on the Cube. Four P's of marketing, put your MBA hat on. 3V's of big data, now the 4M's of big data. We've got all kinds of slogans here in the Cube but that's a good one. Make me more money. But what that really means is all businesses are in business to make profit. But it's not about just the money but the user experience. It's about the customers, looking at the things that they do that make money. Is that what you mean by that? Is that why it's getting a lot of traction? Yeah, and it's a good point, that organizations, with all this customer data and insights and the 4M's as kind of the corporate charter, have to carefully balance the making money portion versus the user benefit portion. A grocery chain would love to have all their customers buy all the private labels. Margin's better. They're going to make a lot more money if you came in and bought generic Billy Beer and things like that. But the customer experience, if I'm pushing things to customers that don't benefit the customer, eventually the customer is going to revolt and go away. So it's not just on margin, and you're saying don't push Billy Beer if that's not what the customer is. Serving the customer is what drives the business model. Exactly, you've got to look longer term.
Not only how do I try to get more revenue and profit from that customer, which is in balance with the experience, but I ultimately want to build advocacy. I want customers out there who have a high likelihood to recommend, who are referring other people to me. I want to build a community around what I've got to offer. And if I'm shortchanging somebody even for a moment, taking advantage of what I know about them, they're going to revolt and they're going to vote with their feet and they're going to vote with their pocketbooks. So that's a very tricky balance. Okay, so I want to just plug the book again, congratulations. So we're doing book signings here. Innovation in the Cube, we have books thanks to Wiley and Sons, we appreciate that. So, Big Data: Understanding How Data Powers Big Business. Obviously, we were at the GE event, Industrial Cloud. Dave and I were talking with the CEO of GE and his staff and top customers like United Airlines, oil and gas customers, Apache, ComEd utility, and then, you know, health care. These areas have big business issues and they're big businesses. They have machinery, it's not just math and databases. It's actually, you know, hard assets. That's where the big data world's going. Do you talk about that in the book at all? Is it still more tech industry related? And how do you relate to industries that are newly impacted? So in the book there are numerous case studies, examples of how different kinds of companies are using data or could be using data and technology. I look at it from an industry perspective in many cases. I talk about health care. I talk about retail. I talk about insurance. I talk about banking. I also look at it from a business function. I talk about sales and marketing and procurement. So the book is chock full of different examples. And the reason why it's full of examples is because I got customers telling them to me.
So the book is really, if I were to give credit to any one organization, it's my customers. Who I meet with all the time, who have all these great ideas, who have these challenges, who are freely sharing stuff, and all I did is just capture. I listened really well. I'm not exactly a sharp guy but I listened really well. You're getting data. You are extracting the signal from the noise out there. The people who spend the money, who actually are in business to serve customers and have an inherent reason to do big data. That's the shiny new toy. I have the best job inside EMC. I get to spend all my time talking about it. Tell us a story about how the book was written. You log a lot of miles. We're Facebook friends so I see the status updates. Stuck in Iowa. On the tarmac. It's fun to hear and keep track of, and we talk sports a lot in Palo Alto. You chipped away at this. Tell the story of, you know, obviously inspired by the Cube and other MBA programs. How did you just get it done? Take us through the process of writing. Wiley contacted me and they had taken a look at my blog and said you have a lot of interesting material in your blog. We think you have the ability to turn that into a book. My first reaction was, a book? That's going to be a lot of work. I was sitting down with my family for supper one night. I had been contacted about writing a book. I don't know about it. My daughter said, how much money are you going to make? You don't make much money on a book. My son said, why don't you donate some of the proceeds to something interesting? My mom died from breast cancer. Part of the proceeds of the book are going to be donated to breast cancer. That was the moment I realized I need to write this book. There was something there, that I had a chance to give back. That became the motivating factor. I spend a lot of time on the road, flying from San Francisco to Chicago, like a four and a half hour flight, or going to Kansas City or New York or Boston.
Those are long flights. I have seen every movie on the airplane. What are we going to do? I write in hotel lobbies. Starbucks across the street. You are a big guy. You have elbows. You have carpal tunnel. I was on the road so much. If you are spending 20 hours a week in hotels and airplanes and such, it ended up my goal was to bang out a chapter a week. It took about four months or so to get it done. We are here live in New York. This is the Cube, Big Data NYC, covering all the action in big data in New York City, Hadoop World and Strata Conference right across the street. This is an amazing segment with our friend and guest, Cube alumni Bill Schmarzo, who we coined the Dean of Big Data because he was teaching people. That is what people were learning three years ago and now ultimately it is validation. The market is growing. The new book, Understanding How Data Powers Big Business. Read this book. It is really great. It is more business oriented. If you have an MBA, you are not a geek but you want to understand how to get your arms around it, read this book. It is a great book. Bill, thanks so much. We will be right back.