Live from the San Jose Convention Center, extracting the signal from the noise. It's theCUBE, covering Hadoop Summit 2015. Brought to you by headline sponsor, Hortonworks, and by EMC, Pivotal, IBM, Pentaho, Teradata, Syncsort, and by Attunity. Now your hosts, John Furrier and George Gilbert.

Okay, welcome back everyone. We are live in Silicon Valley in San Jose for Hadoop Summit 2015. This is SiliconANGLE's theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE. I'm joined by George Gilbert, our new big data analyst. And of course we have our premier guest, Rob Bearden, the CEO of Hortonworks, CUBE alum, king of the show here. Hortonworks is really headlining the show with other ecosystem partners. Great to have you here on theCUBE again.

It's great to be here and thank you guys for bringing theCUBE here. The feedback's always terrific, and it's always welcome to have you here. We really appreciate your partnership.

You recognize the value that we bring with theCUBE. We recognize what you guys have done in the community. It's an open source ethos. Share and stuff will come back, right? And theCUBE is the gift that keeps giving all year round too, because it's just a great place to go. Our consecutive Hadoop Summit has been a pleasure. Let's get back down to business. Your keynote was fantastic. I thought you had that John Chambers vibe. He gave his last keynote speech last night as CEO at Cisco. Certainly he'll be on as kind of executive chairman, handing the baton to Chuck Robbins. So you had a spring in your step. Was it because you were just feeling good? I mean, it's a packed house. What's getting you pumped up here? I mean, the show's packed. But this seems to be like the market's in transition, about to explode in a positive way. Where are we?

You know, it's great to have our customers, our partners, everybody here together.
But you know, we've all worked really, really hard for a lot of years, three, four, five years, to have the opportunity to hit this inflection point. And that inflection point's now. Hadoop has really crossed over and it's absolutely become an enterprise-viable data platform, right? The architecture has made it to the other side and really innovated into being able to address multiple kinds of use cases and workloads: batch, interactive, and real time. And the ecosystem has really, really not only embraced, but is now thriving and pulling Hadoop, and we're seeing that pull market happen. We're seeing real production use cases of mission-critical apps going in. And most importantly, we're now at a point where, as we've hit that inflection point, and this is why I'm so excited, the real value's coming. And we've not just built tech to build tech, created a space. It's actually transforming the IT landscape and how data's going to be managed for the next 15 years. And that's what's so exciting to us.

You know, Rob, we were talking, Dave Vellante pinged me this morning. Hey, what's the vibe of the show? And you know, we have also, obviously, our speculations and we're opining. But I said, Dave, it's packed. The energy's here. But we talk about this transition market. The tide's pulling out for this huge wave that's coming. So you mentioned an inflection point. I want to drill down on that. Because I think that's really what I see: the tide is pulling out, the old market of analytics, processing engine, all the nuts and bolts, speeds and feeds, certainly a foundational set. You would agree, okay. If you believe that, the tide's coming, the next big wave is cloud. We're seeing massive cloud innovation, cellular containers, orchestration. This is DevOps. I mean, Hadoop is DevOps. This is now coming together. On the inflection point specifically, how low are we? Because we can debate where in the inflection point kick-up we are.
If this cloud wave hits in strong, what do you see for that path? What's going to happen next? Because it's going to go up. It's going to be an inflection. It's going to go way up.

But it's even bigger than all of that. And that's what's so exciting. And that the entire data architecture is evolving rapidly. And the data layer is going to re-platform. And this time on multiple dimensions. Not only is it going to re-platform away from the standard siloed transactional systems, which are great for doing what they do, but Hadoop creates a central architecture where all workload types, batch, interactive, can come together and bring all data sets in centrally, with all the efficiencies that come with that. But what makes it even more powerful is now you can deploy data across any deployment architecture that fits the workload the best. Whether that's on-premise, Windows or Linux, whether that's in a virtualized environment, or whether that's in a cloud or a multi-cloud architecture. So you get not only a modern data architecture, but a modern deployment architecture that will make it transparent where that data is on which deployment architecture.

So I was talking with Lew Tucker at OpenStack last week, two weeks ago. Lew Tucker, a legend in the industry. He's now at Cisco, was at Sun. And I'm like, Lew, let's try to tell the story for the common folks out there. The non-tech. And try to point out an inflection point, wealth and industry creation capability. So we were kind of brainstorming it. I want to get your thoughts on this. TCP/IP really changed the game back in the 80s for networking. It brought in Cisco, created internetworking and all that goodness. And the internet all came on top of that. What is the equivalent? Are we in that kind of shift or double?

ERP, it's ERP. I think I actually mentioned this in my keynote. Maybe I overlooked it.
But what we're doing right now is exactly that, if you'll go back and look at the value that was created for companies, for the ecosystem layer, and certainly for individuals, when ERP in the early 90s really began to change the process workflows and the value prop for the enterprise: how they took orders, processed orders, managed inventory, planned manufacturing, and how they drove financials on a consolidated basis. And the value that was created. There's an interesting stat that I believe Peter Goldmacher from Cowen put out. It showed that the top 15% of companies in their industry who embraced ERP first outperformed the S&P by something like 35%. The same paradigm is happening right now in the same exact parallel with big data driven by Hadoop.

But with ERP, the early adopters got to standardize some administrative processes and some sort of manufacturing-related ones in those verticals. And that created shared services and efficiency. But Hadoop by deployment, by workload type, gives you a new data platform.

It's a new data architecture. That's really the way to think about it. It's a new data architecture.

Tell us the distinction and how that takes place.

So you can go optimize certain cost points in some areas, and in other areas you can actually change your business model and how a company interacts with or has visibility or insights into either their customer or their supply chain.

Okay. And you supply the enabling technology. How thin or thick is the value add on top of that to deliver that solution?

In some cases it's very thin and in others it's very thick. Some are driven by us. Some are through the ecosystem and the partners. And I think that's what's really the powerful thing: there's a whole next generation of apps that are being built that can take advantage of this next generation data set.

I love the ERP example. Okay, and I've lost a lot of the TCP because I'm a networking guy too.
But it enables these different levels of enablement. So let's take ERP. We brought that up on theCUBE earlier and I didn't hear the comments. So it's kind of good that we're on the same page there. But let's talk about how long it took to do that. I mean, that was a gravy train. Deloitte & Touche, back then Andersen Consulting, back in the Big Six accounting firms.

That's right.

Of course HP, Sun all sold servers and minicomputers in the day. But, I mean, they made a boatload of cash and it took years. Years to deploy. Years to do the big implementations and get value. And what's the timetable now? So compare and contrast, because that's a good example. Time to value on ERP, time to value now.

Yeah, ERP value tended to be years in some cases, depending on how big, but usually nine to 24 months or more depending on how big and complex. That's the great thing from a Hadoop standpoint. Many, many, many of these proofs of concept saw ROI or at least IRR in four to six months.

Yeah, real process change. Not just like a small little POC, full change.

It may start with cost optimization, but then through that they learn how to go through real process change or a fundamentally different way to interact with their supply chain or partner. And that could be accomplished in months.

I got to ask you the startup question, the ecosystem question, competition question. You're an entrepreneur, great competitor, great strategist, and watching you guys make your moves has been fun. But now with the shift to an inflection point, it's a land grab for the big boys, right? And the ecosystem has to fit into that. There's some drafting going on. Certainly the big guys will take their share. Some will go out of business, some will stick around. As John Chambers said, be disrupted or be the disruptor. How do you compete? If you're a startup out there, there's white space to play in, you got to land on a narrow position, sequence to a broader position. How do they do that?
What advice would you give them? What do you see? How do people stay alive to fight another day, generate cash? What's your take on that?

It is a land grab, there's no doubt. But we're also at a point where the opportunity is there to create value and to have a very definitive value proposition. And to be very, very focused on knowing what space you're focused on and what problems you're solving. What's the model that you're going to do that within, and what's your value proposition around it?

Yeah, like Tresata, Abhi Mehta. He's got no outside funding. He's got a great business, fraud detection, retail. He's deep in expertise. That's hard to duplicate.

It is very hard to duplicate.

And that was a question I wanted to ask. If you generalize from that into the classes of partners that would take this new data architecture and apply it to specific solutions that deliver that sort of value, what do those partners look like and sort of where are they growing up from?

Well, you're going to see some that retrofit existing applications to leverage a broader data set driven by Hadoop. And then you're going to see others that are just going to extend their existing platforms, leveraging Hadoop to bring more data, exponentially more data, under management, and make that transparent. You've already seen that movement happen. What you're going to start to see, and we're seeing it today, is there's a whole next generation of applications emerging that will be leveraging Hadoop as a platform to do next generation analytics. And then you'll see the same thing occur for more horizontal and vertical based applications. An example of that is you'll see a whole next generation of supply chain happen.
When you look at the traditional enterprise-based apps that are based on pure transactional data, and again, supply chain's a great example of that, what Hadoop does is it allows the enterprise to change their business model from being reactive post-transaction to being interactive pre-transaction. That's principally the shift in model that Hadoop gives. And those next generation applications will be the ones that now can be interactive pre-transaction versus the old-school enterprise apps.

But the executive who writes that check for that value, who's reaching him? Is it you in conjunction with this new class of partner? Is it the new partner? Like with ERP, it was, you know, often the Accentures that went in and said, you know, we'll re-engineer your back office processes. How do you... it's not the IT POC, it's the head of digital, who is often in a different organization.

Or in some cases, it's the CMO.

Oh, yes.

Right, in this case it's the CMO; they want to get a 360, they want to consolidate all the channels of interaction that they have with their customers into a 360 degree view. Another very big and powerful use case is where you have a very distributed enterprise and they probably have at this point multiple instances of ERP, and therefore financials, and therefore GL, and they want to consolidate GL into one common system of record. And they're leveraging Hadoop as that aggregation point. So we're seeing a refactoring of many enterprise apps and at the same time a new generation of applications emerging, and in other cases, refactoring of those enterprise apps with Hadoop. So that's why we're so... it's a big explosion of all this.

I'm kind of torn emotionally because my prediction was consolidation in the Hadoop space, which, if you look back over my shoulder here, you'll see it's not consolidating, it's jam-packed. However, the trends are shifting to an inflection point.
I think that's a nice crystallization because that is a shift and an inflection point at the same time, which is a unique point of the curve, right? So I want to get your thoughts on where we are right now. Just some highlights from the keynotes. You compared Hadoop's progress to where RDBMS was five years ago, but much faster on ramping. Five years into RDBMS, I'm sorry. Hadoop's progress is to where RDBMS was, five years in.

Yes.

So we're five years in equivalent. So full pedal to the metal adoption on ramping, faster acceleration compared to that.

And what the Gartner data told us was that today, if I remember the numbers correctly, 26% of the enterprises that they survey have already made some form of an investment in Hadoop. And then another roughly, it was either 12 or 14%, in the next 12 months are going to go, and then another 7%. I think it actually is accelerating faster than that and will accelerate faster than that. But what that says empirically, if you laid that adoption rate per Gartner over the Geoffrey Moore crossing the chasm, early adopters, early majority, it says...

It qualifies quantitatively as a crossing the chasm moment.

We have crossed the chasm then and we're in the very beginning phase of the early majority and that's where...

Well, we'll compare the Gartner data to our data, but that's a whole different segment. What would you call the bowling alley, the application that got you across the chasm, if you had to pick one, because he always says that on one.

YARN, YARN. YARN transformed the Hadoop architecture and it exploded the value that Hadoop brings because it was no longer constrained to a batch platform, single data set, single application. It opened it up to be able to bring all data sets centrally and being able to bring simultaneously batch, interactive, and real time.
And simultaneous with that, driving the security innovation, driving the governance innovation, bringing all the operational capability through Ambari to be able to operate it at scale.

So, some of us talk about commoditization, we talk about it all the time: commodity, open source, it's all open. You could argue commodity differentiation happens on top, so I got to ask you about your comment that putting Hadoop in the enterprise is like putting electricity in the home for the first time. So much is possible, says Rob Bearden in the keynote.

Absolutely.

So, let's talk about that. What does that mean in your mind? What do you mean by that statement? I mean, first of all, electricity, we all know what happens with the goodness you could read. It's good, but it's a utility. It's a utility at the ground level and what you do with it, expand on that thought.

And it's a great example that we use at Hortonworks a lot because it powers the art of the possible. Without it, the infrastructure can't work. But with that, you get refrigeration, heating, cooling, lighting, all the benefits that come with the creature comforts. In addition to that though, it allows all the appliance manufacturers to innovate and leverage that utility to bring on refrigeration, to bring on a stove, right? To bring on cooking devices.

So, the follow-up question I want to ask you is a debate that we have. There's no right answer; depending on your perspective, it could be either way. Analytics: a process or a product?

Oh, wow. It's a product that rationalizes and makes a process efficient.

Good, okay. What do you think, George? You think a process or product? There's no answer because you could argue both sides of the coin.

I was wanting to go down a related line of thinking, which is Hadoop has sort of fostered an ecosystem of innovation that, certainly in data management, we've never seen before.
But the flip side of that is it's hard for the vendors to get their arms around it all, to turn that ecosystem into a coherent product. And that exposes some complexity along with the innovation to administrators and developers. How do you grapple with that? You know, it's good on one hand and it's a challenge on the other.

Right. Well, what we can't afford to do right now is slow down the innovation. And that's an obvious comment because we want to continue to drive it forward. What we must do is bring predictability to the tech and the release and landing. One of those mechanisms is through ODP, right? And how we have a common core that everyone... and that's the whole purpose of ODP, to have a common core that the ecosystem very reliably and predictably can certify their fill-in-the-blank on, and it will run across that.

It's gravity for everyone to do SLAs. I mean, IBM, HP, these guys, they have big customers that don't move as fast as some of the code revisions in open source. So I can see the rationale there. So my question is, is ODP a YARN-like effect? Is it going to coalesce and galvanize and create some gravity around that flywheel?

We have work to do. And if we do our work correctly, it can. But we must execute it. And the ODP consortium is very important. We continue to add value and we continue to drive the sort of limitations.

You feel good about it right now? You feel good about ODP right now?

I do, I do. I think it's on the right track. I think the right commitments are there. But it's like anything. We have to continue to execute. I think the other thing, George, to go back to your question, is while we're innovating, we must provide better tooling. And it's going to be very important that we provide tooling to create better ease of use, more simplicity, more predictability, make it very intuitive to interact with the platform and the tech. And that's our responsibility as a community to do.
And certainly from a Hortonworks standpoint, we take that as our obligation back to the community.

You're wrangling all the projects around a UI for administrators, maybe a better UI for developers.

We want to see Ambari be able to do things that evolve it and make, for example, an upgrade cleaner, faster, simpler. And there's just very little to have to think about or do, and do it all in real time.

I have to ask about Spark because by design, it's integrated and sort of each piece reinforces the other pieces, and it looks like Databricks is actually trying to abstract it to the point where you interact with all the different workloads with a GUI and the platform disappears. Very immature, many years behind in terms of hardened. How does it fit in when you talk to customers, with your use of Spark and the Hadoop ecosystem?

We're a big fan of Spark. We embrace Spark fully. We just had some releases out this week that fully embrace Spark. It's part of our enterprise stack. And it is that because of YARN, and it's completely integrated, and we put a lot of hard work into making sure that that's an enterprise quality integration and it's interoperable with the rest of our stack via YARN, and we're super excited about being part of the Spark community.

Rob, I got to ask the public IPO question, being a public company. You look younger, you look at, I mean, you have new logos coming. You had one quarter, you know, and then you had a great quarter last quarter. We heard a lot of logos this quarter, so performance-wise, great changes. Looks like the inflection point's coming. Take us through that journey for you and the company. You're out in the open in open source. Now you're out in the open as a public company. What should Wall Street investors understand about Hortonworks, about the ecosystem, about your business model, about this world we're living in? Because it's not as cut and dried as a Red Hat for Hadoop.
You could argue that's a simple bumper sticker, but what is the story for you guys?

Yeah. Well, it's about our position in the Hadoop community without question or doubt, and it's about Hadoop being transformational to the next generation data architecture and the growth that's going to come inevitably from that, and participating in that growth with us as an investor. So that's the answer to how the investors should think about it, but why this was important, I think, is the fundamental question. And that is, we knew we were coming to this inflection point. It was very clear to us that it was happening. And as the enterprise is going to make a very big next generation data architecture bet, it's going to be very important to them that they not only have the transparency through open source on the tech, but through the public market, they have the transparency financially of where the company is and what its financial model is and how it's progressing against that financial model. And it brings, I think, a high degree of validation not only to the company, but to the space and the tech. And this space and this tech will generate, I think, multiple public companies because it's such a massive opportunity in value creation.

It's electricity.

Yes.

With the growth prospects. I mean, if I'm an investor, I'm, like, not that sophisticated, not in the weeds. Where's the growth? Okay, inflection point. You can talk about that. I like that, electricity is good. Then you go, okay, product ecosystem, it's open source. It's not going away, right? So I think that's interesting. And talk about the successes now. Products and customers. What's coming out of the customers that you're talking to? What is the big customer... because the market's in charge, or the market's the teacher, as we always say, right? So what's going on in the market with the customers?
Well, now they're very focused on how we can go get as much data under management as fast as possible. And this sort of changes as you move up from the early adopters to the early majority. It changes a little bit in priority.

Are they pulling it from existing systems or new sources?

Both. It's in some cases optimizing certain irrationalized data sets, but it's about then bringing the new paradigm data under management. The clickstream data, the mobile, the social, all the new paradigm data sets. Bringing that up, because by having that under management they've learned there's tremendous value, and how to get that value very quickly, and it's incredibly leverageable, right? And they see opportunity to bring new apps on board and to change how they interact with their customers and their supply chain.

We mentioned earlier John Chambers' last keynote speech at Cisco Live yesterday. An interesting thread he took, and he's obviously got the historic view of the industry, being at Cisco over the years, is that the shift with data is so significant because everyone uses the Uber, Airbnb and all these startups. But the competitive advantage that they have is they use data and they use the connected mobile consumer, and the data is the competitive advantage. So we just talked to Pivotal earlier. Their thought was everyone's moving to get the data in, so that reinforces your message there. And then from that point it cannot be locked in. It cannot be moved into proprietary tooling.

That's right, that's why an all open source model is imperative to this space, not fracturing.

So what's next? What's going on for you guys? Tell us the next earnings preview. Give us some insight into them, of course.

You know I can't do that. But I tell you what we're seeing is a big move that sort of triangulates the last question too. We're seeing now, and one of the slides on the keynote, did IoT...

Oh, I saw that, yeah, go IoT, tell us what IoT is.
Or has Hadoop helped force IoT? At the end of the day, it doesn't matter because there's massive value at that intersection, right? And so what it's done is it's fueled both. And I think you're going to see very, very significant opportunities for us and many others in the IoT space in general.

And the IoT is huge. I interviewed the CEO of Aruba Wireless. He just got bought by HP, and I asked him about how they did the Levi's Stadium wireless, where they do six-second replays after a touchdown from every camera angle to the app on the phone. It's amazing. And I said, what's this mean? Because I was trying to bring the car driving, Apple Watch kind of metaphor. And he says the difference is the network is interactive, not passive. So you can apply that to data. So you were saying earlier, that is really the interactivity of the data.

Pre-transaction, right? In our world with data, in the real world, the enterprise, we now can let them shift to interacting with their customer before, at the moment, at the certain instant they detect that there's even a hint that they want to go into a transactional mode or an evaluation mode. And they can then shape it, influence it into the right configuration, the right product mix, probably at a higher margin. And by the way, we have many customers doing this in real time today that have ROI that is absolutely incredibly cool.

You know why I'm so excited to be covering the space? Every show, what we hear is that every day, every year it gets bigger and better, and soon we'll be, like, in need of Levi's Stadium. But when you think about the real world, because you mentioned healthcare, Levi's Stadium, self-driving cars, the data precision at real time, it has to be five nines. So ROI is already built into a self-driving car. If it doesn't work, it crashes. If someone doesn't get the healthcare information in time, someone's life is at risk. So now you're getting into the real world examples.
And to me, I think someone looking at the opportunity would be like, this is so early, still geeky phase, it's got a long way to go.

It does, but there's so much downstream opportunity that it is going to absolutely happen and happen in a very big way.

We're getting the hook here. I'm getting the hook from Leonard. Don't throw water at me. Final words, share with the folks out there, what's going on at the show here? For people who aren't here, what's the quick summary of the vibe, content, summary?

It's great. Anything you want to understand or know about Hadoop, you can come here and find it. The partners are here. The tracks with content about any technology you want to know about are here. Any of the vendors who provide value or solutions around Hadoop are here. All the key people in the community are here.

All right, Rob Bearden, the CEO of Hortonworks, here inside theCUBE, sharing his perspective. Busy day for him, on stage giving the keynote. We're at an inflection point in the industry, enterprises adopting in spades. That's my words, not his. But electricity is out there, that's the Hadoop. Great stuff here in Silicon Valley. We'll be right back with more after this short break.