Live from the San Jose Convention Center, extracting the signal from the noise, it's theCUBE, covering Hadoop Summit 2015, brought to you by headline sponsor Hortonworks, and by EMC, Pivotal, IBM, Pentaho, Teradata, Syncsort, and by Attunity. Now your hosts, John Furrier and George Gilbert.
Okay, welcome back, everyone. We are live in Silicon Valley. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. We are at the Hadoop Summit 2015 event, live in Silicon Valley, joined by my co-host George Gilbert with Wikibon.org. Our next guest is Herb Cunitz, president of Hortonworks. Welcome to theCUBE.
Thank you. Good to be here again.
So, you're doing all the schmoozing. Saw you out last night at the Hortonworks first cocktail party. A lot of customers here, a lot of partners. Great event. How you feeling? You did a bike ride this morning. What's going on?
What's going on? Late night, up early for the bike ride this morning. It was great. Feeling great. What's different about this conference this year than some of the other ones is, there's always been a number of companies you see here on the show floor, the different partners and the different vendors, et cetera. But this time, almost 50% of the sessions are run by end users. So it's a lot of customers, a lot of customer stories, a lot of end users. And I think it's a reflection of the maturation of the market that there's a lot of customers with stories to tell now.
And the sessions are interesting too. I always look at the sessions. We had a talk yesterday with some of the analysts and guests on. You can always tell a show by the sessions. Managing multiple workloads in production. A couple of years ago it was how to set up a POC. Now, production, high-grade enterprise deployments are happening. The other observation that I'm seeing, and I want to get your thoughts on this: a lot of big-name enterprises are here. The names are bigger.
People are saying they're getting more leads from the show. Is that because of the overall growth of big data or Hadoop? How do you guys look at this? Is this a big data show or is this a Hadoop show?
So it's a Hadoop show, but Hadoop is part of the big data ecosystem. So you can't do big data without Hadoop. But if you're doing Hadoop, you actually want to do big data. You want to do something bigger and broader, which is how do I get value out of the analytics? I think what you're seeing at the show, with a lot of the large companies you described, companies like Schlumberger or Rogers or Verizon, different companies on the customer panel, or Symantec or Home Depot or GE or Optum, United Healthcare Group: what they're all talking about is this transformation going on in their industries where the digital supply chain is getting mapped out, and then different companies are saying, how do I participate in that digital supply chain, and can I re-monetize my business in new ways to take advantage of that because I have all this other data available? And what you're seeing is the early adopters have done that over the last couple of years, and now comes the rest, the early majority. We've got Geoffrey Moore, of Crossing the Chasm, on tomorrow in the keynote. And he's talking through, what does it mean to actually cross the chasm as you're in that early majority, and why do these larger companies start to play?
So I got to ask you the question, and this has come up on theCUBE, because since last night we're all high-fiving each other. Yeah, we crossed the chasm, everyone's like, rah, rah. But then I talk to other people, and no, the customers don't see it that way. So I got to ask you, are we high-fiving ourselves that we've crossed the chasm, or do the customers see us crossing the chasm as an industry? I think the industry's crossed over, in my opinion, no doubt. But have we crossed over from a customer consumption standpoint?
I would answer it this way.
Or is that irrelevant?
No, I think it's actually a very good point, because I think from an industry perspective, we would all say we've crossed the chasm, right? Not by a lot, but we're crossing into that early majority.
And we're running, yeah.
But if you ask the customer, what does it mean to cross the chasm? They're going to say they've gone through that maturity life cycle: they have the first project underway, they have the second, they have a shared service or data lake, and now they have ways to onboard the business and new applications and just start creating these new analytic applications. Most companies have not nailed that process.
So they have to cross their own chasm.
Exactly.
At least we did the big one. It's a mini chasm, mini chasm.
For those customers, is there a common project, or a couple of common projects, to get those first two before they get to the shared service?
It's a great question, because there are absolutely patterns that we're seeing. And we've stack ranked how we look at the patterns of the use cases, because literally crossing the chasm is sort of that one bowling pin to get you into the bowling alley. So if I take the first couple and stack rank, number one is things like ad serving. How do I serve ads more in the power of one, to an individual, as opposed to spamming them across a broad group? Two would be archiving or offloading other data sources, data warehouses, other things. Can I do that more cost effectively? Three would be building a predictive analytics application to go predict behavior, like preventative maintenance. I'll give an interesting example. Last week we were keynoting at the TU Automotive show, and you'd say, well, what's that? It's telematics and automotive. And if you think of their digital supply chain, how do I capture all the data, diagnostic data, off a car?
How do I transfer it somewhere, because the car is not going to be the computer? I'm going to analyze it somewhere else. And then how do I make sense of it and provide that data back? And mapping out that digital supply chain, you know, we announced a relationship with Harman, who runs most of the head units inside of a car and captures that data. And now we can go back to the OEM or auto manufacturer and say, you're going to have these types of issues on maintenance of the cars based on all the data you're analyzing. You can provide the service and tell this individual car driver they're about to have a problem because the intake manifold pressure is too high.
That's an early app?
That's a shoe. That's the type of thing that they're starting to do now, absolutely. With that digital supply chain, now they can deliver a new service, which is: I'm not buying the car just because I want six cylinders in it. I'm buying it because I'm getting better service back.
I love the digital supply chain analogy. Rob brought this up yesterday in his talk on theCUBE, comparing inflection points with wealth creation, value creation. You know, he talked about TCP/IP, he brought up ERP, which absolutely changed the game during the minicomputer days. It created a ton of money for people: vendors, customers, people happy in manufacturing, CRM. It's all still existing today. So digital supply chain, I got to ask you, that is a really big deal. So how does a company succeed? Because Shaun and I just talked about it, Shaun Connolly, about the tools and platforms in this new era, because what you just described is I can come in with an app and be successful. And then the platform builds underneath it, and I have all this open source underneath it as well. So you can be successful with an app today, and Hadoop has got a lot of use cases. So how is this new platform and tooling going to resonate in this new supply chain? Is there one boilerplate? Is there a reference architecture?
Every company's different, so their digital supply chain will hence be different. So what does the tooling and platform have to do for the customer as they retrench or retool or replatform?
So I would say we've seen history play out multiple times, right? Rob said ERP, right? I'm going to go back even further. Think back to the railroads, right? As the railroads were getting established, what happened? First, everyone had a different platform, different size tracks. And you couldn't run the railroad on multiple tracks. In this case, the locomotive being the app, it wouldn't run. And you had to custom build the whole stack. And then suddenly standardization across the tracks came through. And then all these new apps, different types of locomotives, different types of cars, freight cars, engines, et cetera, all became possible.
Different payload, data.
Exactly, you're right on. This is playing out exactly the same way. As the tracks are getting standardized, which means the platform is being established, now you can build these new apps on top of it to a consistent pattern. That's when the innovation unlocks in the ecosystem.
This is awesome. You're going right where I want you to go. So Herb, Merv's report, glass half full, this new stat shows that 50% of enterprises are considering Hadoop. So if you actually take the example of doing a little app on the standardization of, say, Hadoop, then all enterprises will eventually be doing it. So the numbers might be a little bit light. But there's a distinction between rolling out an app with Hadoop versus enterprise-wide. So we're seeing in the numbers, and from our data, and Shaun also corroborated it, as did Del Vecchio in the keynote, that you guys are winning at the large enterprise. Explain that.
Explain how, in a huge enterprise process where the decision is company-wide, where there's been a lot of land and expand, organic flowers blooming, à la the Apache software vision, inside the enterprises, which they then rein in, why are you guys winning those bigger deals? And is the battleground the little deals or the big deals? Because now if you take that forward, I'm an enterprise, I've got nine versions of stuff going on, I've got to rein it in. Is that where we are, and are you seeing that? Because you guys are getting the numbers on the high end.
Yeah, yeah, I mean, just look at raw metrics: the number of new customers we're closing in a quarter. Last quarter it was 105, right? Quarter before it was 99. And while we're proud of that, the reason is probably threefold as to why we're doing well in that space. One, customers want to work with something that's open, because they want to be able to participate and contribute to it. And they want to know that it's not a walled-off garden, but it's something that they can participate in. So that's one, they want it open, in this case open source. Right, second is they want more of a partnership.
They don't want trickery too, right? So like they want to see everything, right? They want to know it.
They want pure transparency, they want to see what it is, and they want to say, if I want to help, I have the ability to help you on that. And I don't have to do some special thing to do that. I just need to train my people. Second is they want more of a partnership. I think the industry wants less of a product and more of a partnership. And partnership's an overused term, but what they really want to say is, can you come in and help me figure this out? And I'll give you an example. A customer the other day said, I actually don't want to go stand up a platform or product. I want you to deliver this to me as a service, which means it's outcome-based. And at the end you say, I'm successful because I have this outcome.
That's how we're going to measure success. That's a part of it.
We're both impressed. The Presto project is impressive, because we were talking about that Presto project, how that came out of Teradata Labs, and it's like, why would I buy Teradata when I just want to do an experiment? I'll figure it out first before I double down. Okay, great. So back to the enterprise question. I want to drill down on one more point there. Why, though, are you guys winning these large enterprise deals company-wide? Is it because of those three, only those three things? Or is it because that's the way the customer wants to buy? Are they buying one vendor? Do they want to have multi-vendor? Why Hortonworks? Why are you winning those top deals?
So typically the pattern actually follows the very traditional software model of land and expand. The customer starts, and they'll start small. They want to get success. They want a proof point. And they may actually have multiple platforms in place, multiple railroad tracks, when they get started. Then they realize that's not going to be as efficient. They say, I need to standardize to one. And when they standardize to one, they say, I'm going to go bet on somebody, for probably the next decade, to run my data architecture. Who do I feel more comfortable with from a partnership standpoint? And if it's more open, do I have the ability to participate? And do I have comfort that somebody's not going to come back to me later and say, ha-ha, now you have to go buy that?
I think Red Hat really nailed that. I don't want to bring up Red Hat to try to compare you guys to Red Hat, because I think that's an overused cliché, the Red Hat of Hadoop. I mean, I just said it. I really kind of killed myself on that. But they nailed the support by having the 10-year RHEL support. They support their software for a decade. Every enterprise I talk to says that's a lock spec because they love it. They like the support.
Yeah, they want the support. They need the support.
You know, and for us, as long as we keep leading and driving the innovation of Hadoop and this becomes a platform that keeps accelerating, people want and need that help, right? To stay current and help participate.
But Red Hat actually is facing a transition now, where every enterprise needed their support and the Red Hat Network was brilliant. But when you're a cloud provider, you don't need that Red Hat Network to update the bits on every running instance. When you're in the cloud, how does the model change when you want to keep the tooling so that it's really simple to operate and to develop against? Red Hat has a whole different value they have to worry about in the cloud.
Yeah, I'd say we've chosen a model that says we want to make it transparent to a customer. They want to run on-premise, in the cloud, bare metal, virtualized, appliance, however they want to do it. Make it ubiquitous, so that it's the same bits that they can go operate on, leverage and get success from. So we're giving them the ability to go run in the cloud on our partners like Microsoft Azure or others where they want to go run their business up there. And that's fantastic. And they get the same level and quality of service that they would if they ran on-premise.
And Red Hat can deliver that same experience. And I guess what I'm getting at then is it's more of an economics issue. Like, would the economics to you be the same when you're putting it on Azure, where Azure can be more self-sufficient in updating all the bits and making sure it's simple to run, and they put their own tooling around it to simplify it?
Azure can be more self-sufficient, the way you describe it. Somebody still wants to leverage Hadoop as an analytic service, in this case, and go get value out of it. They typically want support on it, and we provide support to Microsoft as part of our relationship on things like HDInsight, which is what they're running in the cloud.
Okay, so Herb, I'll ask you the inflection point question, because Shaun gave us his fuzzy math around Oracle. You know, okay, Oracle did this in five years. It makes sense, I totally buy it, but the numbers might be different with accelerated economics or whatnot with the cloud. But the inflection point, I buy, is here. And one of the reasons why I think it's an inflection point from a different vector would be that the cloud right now has exploded, not public cloud, but hybrid, private, and the enterprise, which has changed the data center. So I was talking with your VP of engineering last night about some of the things you guys are doing, and then it kind of hit me with this question that I wanted to ask you. How much of an impact does that have in accelerating the ramp of the inflection point? Where are we on this curve? So now it's a new curve. Is the inflection point going to shape faster, and at what angle? And we think cloud is going to be a big part of that, because of the economics, because of the horsepower, because of virtualization, containers. A lot of good stuff's happening for app developers. That's good timing for what's happening in your ecosystem. So how do you see the inflection point going from a business and scale perspective?
Let me answer the cloud question and come back to it. I think it may drive the inflection point. So on the cloud side, I want to provide choice, as I described, and we see in the cloud this concept of data gravity, which Shaun and others may have mentioned: data born in the cloud tends to stay in the cloud, data born on premise tends to stay on premise. So if you ask a large enterprise who runs most of their data on premise, they're probably going to keep it there for now. They may burst analytics workloads into the cloud and start to get into a hybrid cloud, but they're probably not pushing wholesale there yet today. Just for other reasons.
But if their data's in the cloud, they want to run it there and they want to operate it there. So we're seeing it come from both sides and come together. But now, I think, to the inflection point and what's driving that: again, railroads laid out, platforms getting standardized. Now it's, how quickly will the market develop applications that can run on top of that platform, on top of Hadoop, as net new analytics apps, and take advantage of it? And that's what the end users, the customers, want. Just provide me that. And they'll validate that with checks: signing, consuming the product, paying for that value.
Which drives more usage of the platform. So that's the flywheel we're looking for. We're looking for the transition to consumption. Easy to consume, frictionless consumption.
That's why, as a signal, I would look at things like today: SAP is one of the keynotes, right? SAP owns the transaction data for most companies, decades of it. They're participating in this market with us now around how can they work together and take those decades of transaction data and applications, apply it to the new signals in Hadoop, and provide a new insight to a customer?
That's awesome. But does that gravity keep that data, and the analytics that'll run on Hadoop, does it keep that on-prem because SAP's on-prem? Or is the preponderance of all that data that you're adding coming from the cloud?
So I'll give you an interesting answer, and I'll preview something, because you'll see the customer panel tomorrow. We did a prep for it last night and I asked them that exact question as we went through it. Five companies, all five said, my data's staying on-premise. And we went through that exact question to say, would you move it to the cloud? Why not? You have this ability.
Even the contextual data, not the SAP data, but all the stuff you're adding?
They said they would rather bring it on-site and do that here, for various reasons: PII, privacy, data protection, that's where their data center is, et cetera.
And it could be just comfort for them. It's just the comfort of on-prem, even though it may or may not be more secure.
Exactly.
Which is okay. I mean, that takes the security argument out of it, because we're seeing some pretty good security in the cloud. I mean, there's some cloud stuff out there where the multi-tenancy has nailed some of the security issues, but that's the mindset of the customer. They're the buyer. They write the checks.
I would argue that the data's just as secure in the cloud as it is on-premise.
Just as secure.
But to your point, it's more where they're comfortable doing it.
All right, so one of the other themes I want to get your thoughts on, and you probably hear this because you're out in the field a lot with customers, is ease of use. We're hearing that theme here: ease of consumption, ease of product. Where are the white spaces that need to be filled to make the overall Hadoop platform easy to consume and maintain the openness, maintain the ODP mission? All that stuff's got to kind of come together. What are the areas of work that you see that need to be done now? Keep it, get it, make it easier and more elegant.
Yeah, I think ease of use probably comes on three vectors that you think about. One is, how do you make it easier for developers to write to the application? Everything we were just talking about, ISVs, others. Easier to write, what's the tooling, and then can you certify to the platform? And that's one part of what we're working on, and others. Second is, for the administrators, can we make it easier for them to operate it and run it? One of the customers is going to talk tomorrow. He went through it and he was asked, you know, what does it take to administer this?
He thought about it and thought about it and said, you know what, I have 700 nodes and I have two people administering this.
That's actually pretty good. Yeah, that's great leverage.
Yeah, exactly, it's great leverage. So it's like, how do we make that even easier? Can we get it to one? Like, can we make that even easier for them? So that's the second. And the third is more, how do you make it easier for what we'd call the technical decision maker to understand what they're getting and what the value is and put the parts together? And that's a lot of working with the ecosystem, the partners and everyone around that.
So I got to ask you a final question. You've got a growing team doing well, a good quarter, adding good logos, winning the high end of the enterprise. Feels good. Notice you've got a new VP of engineering. Matt Morgan came over from HP, Citrix; seen his name kicking around. Good team developing. What's the depth inside Hortonworks? What are the stats: employees, new additions, changes?
Yeah, so a couple things. So we had Matt join us, right, from Citrix to go run product marketing, and he's doing an awesome job, right? Going to build that out. We're about 750 people now. We operate in 18 different countries. So we've expanded all through Europe and Asia in terms of what we're doing there. We still follow everything around open source. John Christ is running international.
So you have an international leader now?
Absolutely, we have a leader running international, running Europe and Asia, with teams in all those theaters, teams being field teams, support teams and engineering teams in all those theaters, and all across the US, spread out in different places across the US. So we're getting very broad, which is good. And this is, again, how do you make those customers successful and go deep with them?
When you look at the Apache Software Foundation, when the history books are written, what's your view? What will be said about the Apache Software Foundation in terms of its impact on the technology industry?
I think people are going to go back and look and say the governance model of the ASF and what they've done has allowed an industry to flourish and to foster under a confined set of parameters in terms of how they do that. And it's allowed everyone to participate, but do it in a controlled way, so that you've got a model for how to go develop software for the future. And I think you're going to see this continue time and time again with other technologies.
Yeah, I think it's certainly changed the game. You look at all the players, EMC, IBM, you guys are in. I mean, vendors are playing with coders; it's really working. So I think they're...
We are very deep with the Apache Software Foundation in terms of believing in its mission and working with it. And everything we've done is, how do we go do our upstream work in Apache? You know, now we have ODP as a downstream way of consuming that and packaging it to a standardized railroad track, right? But upstream, all that work's in Apache.
As Mr. Reardon said, there's railroads and electricity, two utilities out there. It's good to be a utility.
It's good to be a utility. That means everyone can just plug into it every single day, right?
President of Hortonworks here inside theCUBE. We'll be right back after this short break.