Live from the Fairmont Hotel in San Jose, California, it's theCUBE at Big Data SV 2015. Okay, hello everyone, I'm John Furrier of SiliconANGLE, theCUBE, and I'm here live in Silicon Valley for Big Data SV 2015. This is our second time in Silicon Valley, our fourth Big Data event, where we are out getting all the data from what's happening in the industry and sharing that with you, in conjunction with Strata Conference, Hadoop World, which is going on right across the street again. We are bringing live interviews, coverage, analysis, here with theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm with Jeff Kelly, my co-host for the week here, Big Data analyst at Wikibon.org, and we're excited to share with you all the exciting news, analysis, events, and there's a ton to talk about. So, Jeff and I will be here all week, bringing three days of wall-to-wall coverage here in Silicon Valley in San Jose around Big Data, and a lot of news: people going public, new startups coming out of the woodwork, big players. And the theme really is follow the money, and we're excited to have a big event tonight. Jeff Kelly will be sharing some research and we're having a party tonight at seven o'clock here at the Fairmont. So if you're watching, swing by at five o'clock for our presentation, and then the party at seven here at theCUBE. Again, our fourth event, second time, second year here in Silicon Valley, Big Data SV. We have a crowd chat, crowdchat.net/bigdataweek is the URL, come join the conversation. Jeff, exciting week, we have tons of news. Pivotal and Hortonworks and the industry announced an Open Data Platform. Cloudera has kind of hinted they're going to go public, released some numbers around their earnings, and just in general, the philosophy for the startups is I got to find a home and I got to find some customers. So, I want to get your take. I mean, what do you see? You've been here on the ground with Pivotal.
What's the story, what's happening? Yeah, so I was at the Pivotal event yesterday when they announced a few things. The Open Data Platform is essentially a new industry consortium focused on hardening the Hadoop core to enable more enterprise adoption. There are some pretty big names that are part of that group, the ODP, so you've got Pivotal, you've got Hortonworks, IBM, you've got GE, Verizon, some others. So that was a pretty big announcement, as you mentioned. Continuing on with Pivotal and their news, they've open sourced their entire Big Data suite of products: that's the Greenplum database; HAWQ, which is essentially their massively parallel analytic database that runs natively on Hadoop; and GemFire for essentially more transactional Big Data. So, they're going all in with open source between the ODP and that announcement around open sourcing those tools. They've tightened their alliance with Hortonworks. They're going to make all those Big Data tools, the Pivotal tools, able to run on Hortonworks Data Platform, their Hadoop distribution, so a lot happening there. Kind of almost, you could say, in the other camp, we've got Cloudera, who made some announcements as well, announcing that they've had $100 million in revenue in their fiscal year 2015. So, interesting, that's the first Hadoop distribution player to hit that mark, so that's a big milestone for them. I think what we're seeing here is the swim lanes are starting to really solidify. You've got the Open Data Platform and that group of companies on one side, and then you've kind of got Cloudera and Intel kind of in their camp, which is actually not a bad thing for the market. You're starting to see some contraction, some consolidation I should say, in terms of the different players. And it's good that there's a couple different options that customers have when it comes to Hadoop and Big Data, but I think you're seeing this market mature and that's a good sign.
From a practitioner standpoint, what we're looking to talk about this week, of course, is where things stand in terms of adoption of Hadoop generally, but Big Data even more broadly, in the enterprise. So, we'll be very interested to hear from some of the practitioners that are gonna join us on theCUBE over the next couple of days about that. What we're seeing in the market, talking to the Wikibon community and in the research we're doing, is that there's kind of two ends of the spectrum in terms of adoption of Hadoop. You're seeing the big Global 1000 companies, they're all going in with Hadoop for sure, there's no question about that. On the other end of the spectrum, we've got some of the smaller, what I would call born data-driven startups, where they have Big Data kind of built into their DNA, and they're going in certainly with Hadoop and some of the other developing technologies, Spark, et cetera. So, you've got those two factions, but then you've got this big middle, the rest of the enterprise landscape, and there's not a lot of action happening in that space right now. So, we'll be talking to practitioners, we'll be talking to the vendors, we'll be talking to the VCs: how are we gonna move that forward, so we start to see some adoption beyond those really big enterprises and then of course those really nimble, exciting data startups? And then in terms of the state of adoption, or the success of those that have adopted Hadoop, we'll talk about that. And the fact is, even some of those early adopters are struggling. It's still a challenge, there's still a lot of different pieces that you need to put together to make Big Data work, and even some of the big banks and some of the big retailers, who are kind of the big name brands that are out there and are touted by the Hadoop companies as, hey, they're using Hadoop, even those companies are struggling a little bit. So, we'll talk about that and what it's gonna take to kind of move that forward.
Let's take a step back and just look at the big picture. So, obviously we've been covering Big Data since really the inception of the industry. We saw Hadoop come on the scene and explode. Obviously that became a huge trend, and we're seeing a standardization now happening, with Pivotal and Hortonworks and the industry rallying around a standard that's in direct conflict with what Cloudera is saying; they obviously didn't join the alliance. Then you have the general enterprise market and then also the capital market. So, I want to get your perspective. We had the Big Data NYC event, which really focused on Wall Street, the capital markets in New York. Here in Silicon Valley we're looking at the money side, the private side. So, you know, obviously Cloudera is not yet public. Hortonworks is public. And then you have a slew of startups out there trying to find a position, if you will: swim lanes, as you say, for the big guys, and the little guys are trying to find spots. So, I mean, where are we? Obviously there are phases of evolution, we've seen phase one. We've been talking about that on theCUBE, you know, early adopters. Are we in phase two? Are we crossing the chasm? You know, we were speculating that Big Data might not be an industry, but might be more cloud driven. Where's the evolution? Is the tide shifting? What's your take on the big picture? Well, I think we're kind of moving from that first phase of Big Data, which was characterized by some of the early innovators who were also the early users. So, you know, Hadoop really, and some of the Big Data innovations, happened out in the marketplace with companies like Facebook and Yahoo and Google developing this for internal use. So, we certainly saw that. Then we saw kind of a market evolve around some of these startups, and VC money flowing in. You know, it kind of all started with Cloudera and then MapR and Hortonworks, and then of course the whole ecosystem that built up around them.
And then we started to see the enterprises actually start to dip their toes in the water. So, I think that was kind of phase one. And a lot of those early deployments in the enterprise were focused mostly on cost savings, from an IT perspective. That is, you know, we hear a lot about data warehouse offloading, moving some of those workloads from your expensive data warehouse to a less costly Hadoop deployment, kind of laying the foundations for that so-called data lake, which hopefully then you'll use in time to build more revenue-focused applications. So, that's kind of phase one. In phase two, I think we're starting to see a few things from a market perspective. Like I said, we're seeing a couple of different major factions kind of solidify. I would put in one bucket the ODP, that's Pivotal, Hortonworks, IBM and GE and some of that group. On the other side, you're seeing Intel and Cloudera. I might even throw Amazon into that bucket as well. So, I think you're starting to see that, which is just, I think, a natural thing in the market. You start to see a couple of dominant players start to emerge, in this case not individual players, but factions. Now, the interesting question, of course, is this: Strata + Hadoop World is happening this week, and if you go down to the show and you go to the show floor, there are dozens of startups out there in the space, many of them focused on fairly niche tooling in the larger big data stack. And the question is, what's gonna happen to those players? You know, one of the challenges of this market, if you're a startup, is that big data requires, and we've learned this by talking to a lot of practitioners in the enterprise, more of a platform approach.
There are a lot of moving parts, and for this to go mainstream, most enterprises don't have the internal resources, bandwidth, expertise, what have you, to bring together all these disparate tools and cobble them together into their own platform. They are looking for more of a one-stop shop for big data. We saw this in the data warehouse space to some extent, where the appliance model, bringing the hardware and software together, kind of a data warehouse in a box that you drop in your data center, you know, that really has become the standard approach for data warehousing. And we're starting to see that in the big data space. Now, the interesting caveat there, of course, is open source, which you didn't have in the data warehouse space. And that's the good news: innovation, I believe, is gonna continue, even as we start to see some consolidation and more of a platform play around big data. But back to those little startups, I think the challenge for them is going to be, how do you build a business around a particular tool that needs to fit into a larger platform, the big data stack, if you will, that itself needs to fit into a larger data management stack, which needs to fit into a larger infrastructure and some of the innovation that's happening on that side in terms of cloud and virtualization? So it's gonna be hard for those players to build, I think, long-term sustainable businesses. The good news is, I think some of those players out on that floor today will have some good exits, because there is gonna be consolidation in this market and some of the tooling that they're creating is very valuable. So for some of those, there's gonna be a good exit. For others, there's not gonna be such a good exit, but that's the nature of a new market with venture capital flowing in, and that's just the way the market works. We can't not talk about the big three.
Cloudera, Hortonworks, and MapR, obviously the innovators early in phase one. Honestly, are they gonna get the leverage in this phase two? What do the early adopters in the enterprise look like? And obviously, where's the capital market? Obviously those big three are heavily funded. What's your take on those three guys? Well, if you look at those three players, like I said, I think you're seeing a couple of different factions play out, or develop, I should say. They are extremely well-capitalized. Between the three of them, I think they've raised over $1.6 billion over the last several years. And of course, Hortonworks has gone public and raised another 100 million plus in their IPO. So they're very well-capitalized. The question is, where's all that money gonna go? And why is there a need for that much capital in this market? Because isn't software supposed to be not so capital intensive in terms of building out a business? The challenges in the Hadoop business are a few things. One, from a distribution perspective, it takes time to build a business based on open source software. You've got the community to contend with. And then you've also got to deal with your channel. You've got to break into the more traditional way of enterprises buying software. So that takes time. And of course, you've got to compete from a mind share perspective with all the big players who have lots of money and are very invested in this business as well, whether that's Microsoft or Oracle or whoever. So that's where a lot of this money's going. What's gonna play out in terms of who's gonna be around in five, 10 years? It'll be interesting to see. I mean, I think this market can sustain one, maybe two players at a large scale. But you mentioned three vendors. So, you know, we'll all... And there's a lot of other little guys. I mean, there's a lot of consolidation going on. That's the premise of your research.
You'll be presenting that tonight at five o'clock. We have an event. What's your take on consolidation? Carnage, acqui-hires, IPOs, what's going on? I think you're gonna see continued consolidation. We've already started to see it kind of on the periphery. We've seen companies, you know, certainly in the open source business intelligence space, with Pentaho, Jaspersoft, and Revolution Analytics being acquired. I think you're gonna continue to see that happen in the Hadoop space in particular. Because, like I said, a lot of these different tools, while very valuable, really only become extremely valuable when they're part of a larger platform. So, you're gonna definitely see some consolidation there, some acquisitions, and I think that's natural. It's good for the market. You know, the enterprise is telling us this. At the Pivotal event yesterday, they had a customer panel. And one of the customers, I think, said it pretty well, Mitsubishi Financial. They talked about, well, we need to take a platform approach because we don't have the time, we're not in the business of piecing together all these little tools and technologies into a platform. We do want kind of a one-stop shop. At the same time, they don't want lock-in. And the good news is, open source to some extent alleviates that risk of vendor lock-in. And that's what the ODP is all about. Of course, there's some debate going on with ODP. You've got Cloudera making some comments about how it's kind of antithetical to the open source way. And I think there is a legitimate question that the ODP has to answer, which is, why do you need such an organization? What is ODP going to do that the Apache community can't do? So I got to ask you, yesterday we had a crowd chat. It's still open right now, crowdchat.net/bigdataweek. And obviously Dave Vellante laid down some interesting comments on what you can make of this week.
He's snowed in, in Boston. So Dave, if you're watching, we miss you, wish you were here to add some of the commentary, because, you know, what I was speculating, and Dave was kind of teasing it out, is that big data seems to be transitioning to a new wave coming in. Things happen in waves, wave one, wave two. Is the big data wave dying out, or being diluted, if you will, and moving to a whole other level? Whether that's virtualization, you're seeing a lot of conversations on our crowd chat about converged infrastructure, you still have storage, you've got VMware doing some things. We covered their recent announcement. So are the dynamics at infrastructure scale affecting the big data business? And if Hadoop can become standardized and you still didn't see a rally around that, is this next wave going to be something different? Is it going to look different, the same? What do you see? Obviously big data, it's not just data warehousing. That's one aspect of it. You know, obviously, there's other things that could propel this growth. And so, some are speculating there's a bubble that's going to burst. Is there another wave? Because bubbles don't burst if there's another wave coming. And that's what Dave was basically saying. I think there's definitely another wave coming. And it's characterized by three things. So number one, you're starting to see this consolidation around Hadoop as really the core foundation of a big data platform. Things like the ODP are going to accelerate enterprise adoption. I think there's no question that's kind of part of the next phase. Like I said, the good news is, the innovation isn't going to stop. And that's because of the open source community. So there's going to be continued innovation from, if no one else, the practitioners out there who are developing these tools themselves.
The data-born startups, companies like Facebook and Google, and even Netflix is now open sourcing some of their Hadoop-related tools. So you're going to continue to see innovation. And that's, I think, one of the key factors that open source brings to this market. And then the last and really important part of this phase, where it gets really interesting, is new applications, and specifically applications around the Internet of Things and the Industrial Internet. That's where it's going to get exciting. We're moving beyond kind of the container questions: how do we store and do some processing of all this data? How do we save a little bit of money on our data warehousing? We're moving from that to, how do we do things in a totally different way? How do we develop new lines of business, drive new revenue? How do we find efficiencies? You think about some of the things that GE is doing around predictive maintenance. Just a little bit of a shift in terms of your efficiency there can mean big money. So I think that's where you'll see a lot of the action, and that's what's going to be very exciting in the next phase of big data. Well, it's going to be an exciting week. I'm looking forward to looking at a couple key areas on my radar, which is the overlay between different markets. Obviously cloud is exploding. You've got the converged infrastructure market. I'll call that infrastructure. And you've got big data. I'll call that apps and others. Then you've got DevOps, you've got Internet of Things, big data warehouse, and also infrastructure with virtualization and stuff overlaying. So I think there's going to be concentric circles around those markets. The interplay between those forces is going to be very interesting to watch, and I see big data kind of diluting into its own little market where Internet of Things traverses DevOps and infrastructure.
So I think that's clearly a big wave, but I'm really interested to see what I call infrastructure software. "Software is eating the world," by Marc Andreessen, is certainly a relevant soundbite, but the real question is, are we just in a new market called infrastructure software? And I think that is where these concentric circles come in. So I'm really interested to dig into these forces. We'll be covering it for three days. This is theCUBE, live in Silicon Valley in conjunction with Strata Conference and Hadoop World. This is Big Data SV, our exclusive event here, SiliconANGLE, Wikibon. We'll be right back with our next guest after this short break. Stay tuned for three days of wall-to-wall coverage. We'll be right back.