Okay, we're back live here at Strata Conference in Silicon Valley, the heart of innovation where all the new invention is happening and here an industry is being created, not only in Silicon Valley but global entrepreneurs from around the world are here. I saw people in from New Zealand, Australia, Asia, everyone's here really because it's not just birds of a feather, it's not just a tribe of intellectuals and programmers and business people. Big data is creating the most massive innovation of all time and people are learning, they're building and they're creating products and services and we are here inside the Cube, SiliconANGLE.tv's flagship telecast where we go out and talk to the smartest people, extract the signal from the noise and share that with you and we are here with Bill Schmarzo who's being called the Dean of Big Data, mainly because here they have this kind of Big Data scientist MBA-like program where they're teaching people about Big Data and helping people along. I'm John Furrier, I'm the founder of SiliconANGLE and we're inside the Cube and I'm joined by my co-host. I'm Dave Vellante from wikibon.org. Now Bill, I understand you recently went through this big data sort of data scientist, not a certification but an education process, right? Yes, I actually went through, EMC has a data scientist certification process, it's a week-long class to really help people sort of get a flavor and a taste for what's involved to be a big data scientist. Yeah, we just had Mike Dauber from Battery Ventures and he was saying the gap between those who, you know, the skills gap really, those who get it and those who don't is enormous, bigger than he's ever seen before. So it's a huge problem that you guys are trying to help address. But so you went through this program, what was it like? I mean, was it more like definitional, this is what Hadoop is, or were you able to get deeper than that? What was the experience like?
I have to tell you that the class was really impressive. There's a lot of hands-on activity. You actually worked with data sets. The class was primarily focused on statistical analysis and applying all that stats class stuff that we had forgotten when we were growing up and took an MBA class, right? Well, those books had to be dusted back off and we actually spent a lot of time working with data sets, trying to cleanse the data, trying to align the data, using different tools available out there, getting some insights into Hadoop. It was a marvelous class, very much hands-on experience. Yeah, okay, so I wanted to ask you, you spent a number of years at Business Objects, right? So you obviously have a place in your heart for whatever you want to call it, BI specialist. BI brethren, yes. Yeah, so what's going on with those BI brethren? Is it a good time to be a BI specialist? Are they under attack? What are you seeing there? And is this an opportunity or is it a big threat for those guys, or both? Yeah, it's, you know, Dave, it's actually both. It's both an opportunity and a threat. And you know, part of the reason for taking that class is I wanted to understand in detail, what is it that makes a data scientist different from a BI specialist? Now there's a lot that the BI specialist understands already that is actually quite valuable if you want to be a data scientist. You understand your user's business environment. You understand the KPIs and key metrics. You're already building dashboards to interface into their environment. You understand the organizational politics and structures and such. So there's a lot of things as a BI specialist that you know already. Now the data scientist is learning a whole new set of tools and techniques and how to really work with all these data sources using things like predictive analytics and Hadoop and Hive and things like that.
And so I think, personally speaking, I look at my BI brethren out there and I realize that there's an opportunity for you folks to really step up, to take all this time that you've invested to build all these sort of static retrospective dashboards and reports and to start adding more predictive analytics, start adding more real time to it. And really, it's an exciting time. It should be an exciting time for the BI specialist because a lot of the skills that they've been honing, now they have a chance to supercharge using a lot of this data scientist stuff. Well, you know the 2000s very well. I mean, a lot of that was applied for reporting and the whole industry was trying to get beyond reporting. And I think they never did. It was a lot of KPIs and they were good, but they weren't fast enough and they had to build these statistical models that were too complex on top of them. You see that changing? Definitely. I see that the ability to build statistical models and algorithms has changed dramatically. The idea of building these analytics sandboxes, using Hadoop as a way to channel data back and forth between your sandbox environment and different sources. Now, everything is going to happen much more quickly at a much lower level of granularity and the results will be much higher fidelity insights and analytics, but how do you surface the result of that? And that's where the BI skillsets pop up because they've already got dashboards sitting in the call centers, in the VP of marketing's office, in the finance office. And so the ability to surface those insights into an existing sort of BI environment is a huge opportunity. So I know John wants to jump in here.
My last follow-up on this topic is, so you see, if I understand it correctly, the data science piece of it coming into the traditional business analytics, the traditional dashboards, and making them more real-time, maybe adding some more variables, maybe just using straightforward math as opposed to these complex models and using data to solve the problem. Yes. I see that the BI specialists who today focus on providing solutions to their customers, who understand their customer's business, who have spent time to understand that and have used and leveraged the BI tools, now have a whole new kit bag of tools available, Hadoop, Hive, Big Data, all this sort of, R, MADlib, all this stuff's now in their kit bag that they can apply to this problem and to deliver really meaningful, actionable insights to their customers. So Bill, you're being talked about as the dean of Big Data here at the Strata Conference, mainly because as Dave pointed out, you have experience not just now with EMC, which is doing some good work there, but Business Objects and we were talking with Scott Dietzen, who was a Silicon Valley CEO and at WebLogic and there's a comeback of these 40-something entrepreneurs, systems guys, right? So you kind of fall in that camp and you've been around the block, right? Graybeards or gray hairs like us. What is the vibe here? I mean, obviously O'Reilly has this pretty cool emerging program called Jumpstart where they have office hours. It's kind of a cool environment where it's collegiate where you get this, quote, data science MBA. That's kind of what they're talking about. But really there's the business impact of Big Data. We've been talking about it. That's what everyone's saying is there. But what's being talked about in your talks to the folks as you're teaching folks about data science? What are you talking about and what is the common level of proficiency? What's the IQ rate? What grade are they in? Is it like, are they freshmen?
Give us an update on what's going on with this whole MBA-like program. So John, the class that I taught or the session I gave was really about how do you help customers envision the realm of the possible with Big Data? Every customer I've ever talked to knows what kind of questions they're trying to ask, what kind of decisions they're trying to make, right? It's never been a challenge that they don't know their business. What they don't understand is what's possible. And so the session I went through provided three different techniques or tools. Two of them built off Michael Porter's Competitive Strategy book from the 1980s to help them understand how do I take what Big Data can do and envision the realm of the possible within my business space, right? If my focus area is doing better customer acquisition, how do I leverage more detailed, more granular data to improve my ability to acquire customers, to model that space? Again in the area of customer acquisition, how do I leverage social media data to improve my focus and my targeting? And how do I leverage a low-latency approach to do that more frequently, in less time and more quickly? So it's about giving customers, especially the business users, but also the data scientists and the people who support the business, sort of a creative whack across the head to sort of envision the realm of the possible, what they could be doing. So Avi Mada was talking about this yesterday and we had Mike Dauber on, a young-gun VC at Battery Ventures doing some cutting-edge work. And his comment was interesting. He said, you know, it's not about Big Data and all the technical features about it, the applications that are emerging have to essentially mask out or encapsulate the tech around Big Data. So it's not so much a geeky thing, although it is, but the folks actually doing the application development or application prototyping are business people.
So can you comment on what your view and perspective is on that area and what are you seeing from some of the attendees and experiences in the field? So John, I think that statement is exactly spot on, that as long as Big Data and all things related to Big Data are a science experiment, we never get out of the bowels of the organization. But where we can attach the Big Data initiative to what's key to the business, that's when things start happening. And let's be honest, it's hard to find an opportunity inside a company where Big Data and advanced analytics can't drive business value. I mean, you stumble on them all the time. And so really, the challenge is helping customers to envision. It's not how to do it, it's what to do. It's where to start, right? Because you know from sports that nothing breeds success like success. So in the Big Data world, what you don't want to do is tackle, let's cure world hunger, let's solve cancer. You want to target a problem that has meaningful business impact, but where you know you can get the meat of the bat on the ball, right? So what's an example of that in your experience, some of your work that you've done in the field? So the one that's popping up time and time again is with companies who have built this pristine CRM system, right? They've invested tens of millions of dollars to create this common view of their customer. They've got it all in one central location, everybody bows before it. And now here comes social media data that has insights on my interests, my affiliations, my friends. How do I take those social media insights and integrate them into my CRM system to create sort of an eCRM or social CRM system?
To me, I'm seeing that repeat time and time again, that companies that are in the B-to-C space, business-to-consumer space, are seeing a wealth of insights buried inside all the social media data that they can use to complement their existing CRM efforts to make much more finely focused, higher fidelity decisions and decisions they can make much more quickly. Dave, what's your perspective on this? Because, you know, we do a lot of research on the whole storage, data warehousing, business intelligence side. And there's a legacy piece. I want to get Dave and you, Bill, to talk about the challenges of, it's not a throwaway, it's an integration, I mean, people have these investments and they've got to deal with them. So it's not about unstructured and structured, it's about a little bit of both. So what is your feeling? Obviously, EMC has a storage presence in these huge accounts, yet Hadoop is kind of a greenfield opportunity where you can do new things, predictive analytics and now real time. How do you get the engagements going with these clients and say, let's hit that single, let's get a double, not try to go too hard, but you've got to deal with this legacy baggage. How do you approach that, Dave? What are your thoughts? You know, we predicted earlier this year and late last year that, you know, clearly this was going to be one of the years of the intersection between the traditional BI, enterprise data warehouse and the new emerging, you know, Hadoop space and big data space and what I'm hearing from you is that's clearly happening and I spent some time as well in the predictive analytics field and I always ended up getting really frustrated with the balanced scorecards and the KPIs and I never could get over that hump and I think you're talking about the social media data which is a really practical example and I think you're right, that's a great place to start and then maybe we tackle some of those harder problems down the road.
Yes, I think that's exactly it, you've got to target the practical examples, you've got to build some success, you've got to find the low-hanging fruit but John, to your point about, will some of this technology that we've done in the past be thrown away? The answer is yes. We're going to throw away stuff we've done in the past and we should. If it's no longer the right architecture, if it's no longer the right approach that allows me to drive business value, then we should move on. We should throw away what we have. Now, does that mean we're going to go out and throw away all our Business Objects dashboards? Probably not, but a lot of the things underneath that could very well get migrated to something different. I mean, what you can do with Hadoop, for example, you can't do with regular relational databases, at least today, and so do you put a governor on your business operations so that your technology can remain consistent or do you break it free and say, you know what, there are certain problems where I've got to use this new technology even if it runs counter to the quote-unquote corporate standard? Let's talk about the business impact of this area. So CIOs, CEOs, CFOs are all involved in big purchase decisions and they want to make sure they get the payback on all those legacy systems but how do you talk to those executive-level people? Because you're doing a lot of whiteboards with geeks and doing a lot of stuff there but also, what is the message that you say to those guys and what analogy in the past, historically in the tech business, whether it's mainframe, client server, PC, do you see an analog to kind of what's happened in the past to help CIOs, CEOs and CFOs and business leaders understand what's happening right now? I mean, this is a massive change, unprecedented in decades.
Yes, what I typically do is I share with people what happened back in the 1980s when the CPG and retail industries transitioned from bi-monthly Nielsen audit data, right? Every two months, Procter & Gamble gets this big book that had all their important numbers, right? And then in 1988, IRI comes out, introduces scanner data and blows that book up, right? And blows up everything about how you do business but it took leading companies to make that first step, to find the low-hanging fruit. So the Procter & Gambles of the world, the Walmarts, the Tescos, the Frito-Lays, they used it for things like, you know, market basket analysis, category management, affinity analysis, things like that. They found that low-hanging fruit and they targeted it and they drove business value. They drove competitive advantage in that process. What was interesting is that at that time, some technologies had to be thrown out. Because when you think about it, what's really important? Is it the technology or is it the data? And at the end of the day, it's the data that's important. It's not the technology it's buried in. It's the data that I want to get at. And if that data is buried, if it's locked in the jail of a technology, then you're better off blowing that technology up and going someplace that frees the data. The Netflix comment yesterday, all over the Twitter stream, was interesting where they said it's more expensive to delete the data than save it. So, I mean, what's your comment on that? Because that's to your point. I mean, you can throw away technology and systems, but you know, the data's becoming mingled and you made the comment on day one, in our preview session, that lock-in is no longer the data model. So you're seeing an interoperability kind of thing happening among data sets. That's interesting. Yeah, it's almost like an open source movement. But let me kind of tell a story.
So yesterday, out of my session, I ran into a fellow who works at a large bank and I was saying, you know, how goes your big data initiative? And he said that in one of their businesses, the credit card business, they have 18 different organizations that sell credit cards. And they basically have these silos, right? The traditional silos and they're data hoarders, right? They won't share data across these silos. And the ability to leverage those insights across those silos about a single customer, to know what people are doing, is totally lost. So, is it the data that's the problem? No, it's the organizational structure. It's the cultural structure that thinks that the secret to my success is to hoard the data. And what we know from the open source movement is, how do you grow? You give stuff away. You give stuff away. And organizations that don't learn that lesson are at risk of becoming the dinosaurs. So I wonder if you could share with us, whatever you can share with us, the conversations that go on inside of EMC, because you talked about the bi-monthly data coming out and IRI data completely blowing that up. I've followed EMC for a very long time and I've seen the company remake itself numerous times. You never talk about selling hardware, ever. And that's EMC's, you know, bread and butter, right? And you talk to people like Jeff Hammerbacher at Facebook. He says, hey, we basically, you know, bought into this whole Hadoop movement so we didn't have to put all the data into a big expensive container, which is what EMC sells. And so now they bring on someone like you to evangelize big data. What's the conversation like internally? I mean, is there a conversation about, wow, this is an opportunity we have to hop on? Is it a threat to our existing business? Or is it just something that you guys in your gut know is the right thing to do?
I, you know, credit to the senior executives inside of EMC for realizing that the marketplace is changing. And if they stick to their old models, they risk becoming the next DEC, right? Or the next, you know, Compaq. And so they know that in order to be meaningful and successful with their customers, they've got to provide solutions. They've got to be relevant to the business. And so they're not afraid to push the organization to have that conversation. And I'm sure, I'm not involved in these product discussions, but I'm sure it puts pressure on the product groups to say, wow, if customers want to do X, Y and Z, we need to make sure our products and our technology can do that. But, you know, the company is realizing that our goal is to become a strategic partner with our customers to really help them enjoy success. You know, I love a quote, you know, our success is built on our customer's success. If our customers aren't successful, we're not going to be successful. And that's kind of the way the company thinks. Yeah, interesting. I mean, we all remember the remnants of the minicomputer business. John, we were probably back east at the time, right? You know, Wang, DG, Prime, right, Digital. Well, the apps drove everything, right? So back in the client server, it was the cultural thing of mainframe to client server. PCs were kind of a toy and then they were networked together. So, you know, TCP/IP really grew out of the, you know, open standard stack movement of networking. So networking was the enabler, right? So what we've been talking about is what is the disruptive enabler now? We're seeing flash, we're seeing virtualization. Well, virtualization is more mature, flash is emerging, and now you have data. What is that next disruptive enabler? Because there's always a protocol, there's always some tech that's the lever for an industry.
And again, I feel that this is an industry that's going to be a brand new industry completely different than the tech business. So I'm trying to understand and extract, what is that lever? So what is your perspective on this? To be honest, I don't know. I don't know. And I think we're at this inflection point in the industry where we've got to have, you know, our ears to the railroad tracks or to the cow path, or whatever it might be we've got our ears to, to really try to figure out what's going to happen. There is no doubt that data has become this corporate asset and that companies that are successful are going to find ways to free it up. What are the technology enablers behind that? I think at the end of the day, it'll be technology that helps people make better decisions about their business, to help them identify, you know, white spaces in the marketplace where they can insert new products and new services. You know, I think the reason why EMC from a product perspective is so into the space is that, first off, they believe, we believe that our products are better than the rest of the products in the space. That at the end of the day, when you want to build a big data architecture and a big data infrastructure, it's going to have a huge number of EMC components in it. But at the same time, I think from a product perspective, you've got to be taking a look at where the white spaces are, and you've got to run fast, and you've got to be nimble, and you've got to be quick because we don't know where this story's going to end. Other than the fact that we do know there will be technology winners, and there will be customers who figure out how to exploit the data and the technology to change marketplaces. We are here, live, with Bill Schmarzo, who's being called the Dean of Big Data here at the Strata Conference, mainly because he has so much experience out in the field, has a lot of history in tech.
He's now at EMC, leading up as CTO of the Solutions Group Consulting Group, solving real customer problems. We're early days now, and this is great growth, so thanks for coming on theCUBE, and we'll be obviously tracking your stuff, and we'll be seeing each other in Palo Alto, talking baseball and sports, but congratulations for your success, and thanks for coming on theCUBE and sharing your knowledge with everyone. Thanks so much. Thanks guys. Thanks, see you, Bill.