theCUBE at Hadoop Summit 2014 is brought to you by anchor sponsor Hortonworks. We do Hadoop. And headline sponsor, WANdisco. We make Hadoop invincible. Okay, welcome back everyone. We are live in Silicon Valley in San Jose for Hadoop Summit. Hadoop Summit really is shepherded by Hortonworks, and we're pleased to have another Hortonworks executive on, Sean Connolly, vice president of strategy. We've got the official title now. Still VP of strategy? VP of strategy. This is theCUBE. I'm John Furrier with Jeff Frick. This is SiliconANGLE and Wikibon's theCUBE. We're here to talk about what's going on in the strategy. So Rob Bearden was just on, and we had Herb on before that. Kind of moving up the chain of command here to the guys closest to the street. Still, we call them the customers, right? It's the other way around; it depends on how you look at it. Exactly. That's right. Everybody likes talking to customers. We view the people close to the customers as really valuable. That's great. We talk to a lot of your customers. Two things are striking about this event. One, you guys are not bogarting the stage. You guys have opened it up to the community. It's not a Hortonworks commercial, right? So this is all about the ecosystem. And it's been very positive. People are social, there's a lot of collaboration. You've got big whales here, like IBM and Cisco. You've got Series C funded companies. We had Platfora on last night. They're growing their business. Awesome. And then you've got startups. All of them are doing business. Yeah. They're doing business now. It's like a market, right? So that's one vibe. The other one is the success of the platform. The new stuff that's coming out of YARN and the data platform. The pace is picking up. You're starting to see things coming together. Yep. Of course, by design, from your perspective. But take us through the strategy. 
It's playing out, same message from you guys. Oh, we're sticking to what we said when we started. But really, what's going on now? What's inside your head around this marketplace? Well, it was funny. One of the themes I wanted to make sure we got out of this event was Enterprise Hadoop in action, to really put a face on the workloads that are coming in on the platform. And then also, I had a keynote earlier this morning where I talked about how just two or three years ago, there were all these components with these animal names and that kind of stuff. There wasn't really anything to hold it together. And then came the notion of Enterprise Hadoop and a range of services that make it palatable to enterprises. I think you're seeing the fruits of that labor as the market matures. We're still early in it, but just look at the household names that are here. Sprint had an excellent breakout session yesterday where they talked about how their journey went, how they have a nice best practice for enabling new workloads to come on, and a learning environment to encourage others to innovate in and around data, right, to drive their business forward. And we're getting that, because we're getting a lot of customers on. So there are customer stories in the keynotes. AT&T, everybody's talking about the AT&T example. We had the TrueCar folks on, and a guy from the University of Calgary talking about it. Two years ago, it was more about how many lines of code have we contributed? Yeah, we're a top contributor. What do these components do and how do I fit them together and use them, right? So fast forward to today. Lay out the phases of Hadoop, because Herb was mentioning that he sees it as Hadoop 2.0 really setting the stage for that crossover point. You guys talked last year about crossing the chasm. Was it last year? I can't even remember what year it was, but it seems like dog years. That was seven or 14 years ago, depending on how you count. 
So really, crossing the chasm was right on track. Hadoop 2 sets the stage for that. Now you see the TrueCars of the world and their successes. But I want you to lay out the phases of Hadoop, so we can try to peg the inning. That's what we always try to do on theCUBE. What inning are we in? Yeah. Well, if you think about when it was invented, right? It was around 2005, 2006, right? We're almost eight years into that journey, right? New markets take about a decade to play out, right? And I think we're at that inflection point. I think two years ago at the Hadoop Summit, we had Geoffrey Moore as a keynote. And that was as it was just approaching the chasm, right? So now it's starting to heat up and you'll see vertical solutions and those types of things. But we have another three, five, seven years ahead of us for this thing to stretch its legs in the market. One of the things I'm seeing early on is, it isn't just about how do I get my initial cluster set up. It's, I already have my cluster set up, I have a science cluster, I'm leveraging the cloud, and now there's this sense of connectedness, or what I call tethered clouds, tethered clusters, right? You really want to democratize the data into the form factors that make sense with the cost structures that make sense. I think that's going to play out next as well, particularly as Hadoop technically can run almost anywhere. Linux, Windows, on-prem, cloud, hybrid, or what have you, OpenStack, there's a lot of choice. Now it'll be interesting to see how people connect all that with their existing systems as well as get the value out of the choice. So, Enterprise Hadoop in action, right? That's what you said. 
Talk a little bit about the kind of push me, pull me effect of people either having stuff that they're now excited they can actually do, versus the enlightenment once they do some things and say, oh, wow, I never even really thought about going in this direction. What are some of the customer examples you see, and how is that evolving within the customer set? Yeah, so it was interesting. On Monday we brought a variety of analysts together, Jeff from Wikibon as well as others, to hear some of the customers talk about their journey. And there was, you know, British Gas, which is 250 years old, and they're doing the smart meters in a million households and those types of things, but their journey is more classic. Whereas a TrueCar is more of a greenfield, with not a lot of legacy to deal with, and they can go all in immediately, right? And so, you know, one of the things that was brought up in that Arun and Doug Cutting panel earlier was that each enterprise is different, right, so they're going to get on board with different drivers. I think the successful ones that we see are the ones that identify the applications that move the needle. And I'm all for cost savings and driving costs out of the infrastructure, but I think it's underselling the platform if that's all you do, or if all you're looking at is, you know, I'm able to drive costs out. That's a benefit of deploying the architecture, but there are new business opportunities just being unlocked. And I think the gentleman from British Gas, when he described it, was like, I'm no longer a traditional enterprise. You know, I look at myself more from a telco perspective, with a similar architecture. 
So drill down on that use case. Just to repeat what you said, folks that have been successful, in one use case of many that's out there, will identify the application and workload first, I'm assuming, and then deploy? Yeah, so Sprint was a great example in the session that they covered, where their Hadoop journey started from the business and the analyst side, not the technology side. They were tasked with bringing a bunch of data together, and it was a research cluster. They figured out the art of the possible and came up with a variety of use cases. They brought one of the use cases to the CIO and said, here's the cost savings, but here's the use case that would get unlocked. And that was the first man in on the IT operationalized cluster. They still keep their research cluster for educating the rest of Sprint and onboarding interesting use cases that prove themselves worthy to move into the production cluster. It's a great best practice, but it was very much line of business and analyst driven, from a research perspective. And then IT came in as secondary. Sometimes it's flipped, but it was a really interesting best practice. Which we hear a lot, right? It's kind of the age of not only the API, but also the age of the application, with all this infrastructure that's put in place so people can develop, deploy and roll out more interesting applications. Exactly, and you know, that's what makes this Enterprise Hadoop vision unique. And again, that was a bit of what I hit in my keynote today: it's no longer just batch, it's interactive and real-time applications. And it's really about not being limited in how you can think about the new types of applications that you can unlock. Right. And you know, just to translate it, Red Hat spoke today, right? 
I think both the middleware and the app dev side of things, as well as the infrastructure perspectives of Red Hat, are relevant in this. Why? Because at the end of the day, to create a new generation of analytic, smart, intelligent applications, you need to speak to developers, right? And you need to enable developers, right? And so I think that's playing out right before our eyes. Yeah. Sean, talk about the YARN success you have, and also how the Hortonworks Data Platform vision is playing out. Yeah. Pretty well. Talk about some of the feedback you've gotten on the success of YARN and what surprised you about that, if anything. Well, you know, I've been at Hortonworks since late 2011, and YARN's always been in the mix for how we evolved it. But a little bit of it came out in the narrative I covered this morning, in that even when I first joined Hortonworks, it was literally a cast of characters, a bunch of animals in a zoo. And there was no platform, right, where it came together. And I think YARN, particularly last year and leading into last summit, provided a rallying cry as sort of an architectural center, if you will, right, for these components to be able to come in. Not only in the open source, but I think the reception from, you know, a SAS perspective is, how can they deploy their value into the cluster and get the benefit of the bargain of getting as close to, and as natively integrated with, the Hadoop cluster as possible. So I think one of the points that was brought up in Davenport's keynote was that this isn't an open source only thing, right? It's a commercial ecosystem and a broader set of tools, right? It isn't just open source, but how can open source enable that bigger market? 
You know, if you look at successful platforms, I mean, at my age, I've seen the revolutions since I was a young kid in college, the computer revolution, the PC revolution, and now, through all the different cycles, the most successful platforms are the ones that have helped people make money, right? At the end of the day, no matter what your quote religion is technically, right? If you have a platform and you can enable people to be successful. TCP/IP enabled Cisco and these guys to create routers, you know? PCs enabled a software ecosystem. So I think what's interesting about what you guys have done is you are enabling an entire set of characters and actors, like startups below Series C financing, to get traction and grow using cloud and other techniques, and Series C funded companies like Platfora, who just blew through their B round and are now in a high growth situation, right? Series C is pre-IPO, and now the whales are coming in, IBM, Cisco. I mean, this is by definition a massively developing market. And you know, at times we sort of get accused of being less brazen about how we approach the market. How I view it is, you know, I'm a fighter, I'm from the Philadelphia area, right? So I definitely have an edge to me. Yo, Adrian! But I view it as, if you're making a market, you need to be patient, right, and you need to have the internal fortitude to make the market. And the analogy I use is Braveheart, with the blue war paint, where they're holding the line. You need to hold the line to make sure that you're going to make the market, the pie, as big as possible, right? And then I'm telling you, when we get a slice out of that pie, it'll be a fair share of it. That actually comes from my Italian grandma. I grew up with her living with us, right? So I learned to take my fair share but not hog the pie, right? 
And that gets to your whole point. That is sort of a visceral way, if you will, of describing what we're trying to do: we're trying to make the market as big as possible, to have as much value for the startups as well as the whales, right? Your business model is to make an enabling platform, and you get your cut. But we're going to get a piece of the pie too, right? But it's a fair exchange of value for the world. You play the long game, right? So you say, okay, if the ecosystem wins, your lick off the cone, or cut, or share, whichever you want to use, is bigger, right? So that's an approach, that's a business model. It is, absolutely. That's not to say that the whole world should be like that. I mean, other companies have their own. And there's an indirect and a direct. We definitely sell direct and service large, medium and small enterprises directly. But we also do it indirectly through partners like Microsoft and Teradata and SAP. So friction with respect is an open source credo, right? You say, hey, we can have friction. But self-respect drives it. When you have friction and no respect, then it's a dysfunctional environment. So to me, what I like about this community here is, there are some cases where, you know, among the stragglers, there's some disrespect out there. Those people will die away. But when you have mutual respect and good positive friction, everyone will grow. And you look at the size of the market that Rob Bearden was just pointing out: companies like MapR, Hortonworks, WANdisco, all are succeeding. It's not like, it's so much of a feature. So I think the theme here validates the point that we were talking about last year: plenty of fruit on the tree across the entire landscape. It's not about fighting over territory. 
Right, and having done previous enterprise open source startups with JBoss or SpringSource and those types of things, I think, you know, when I look at it, at the end of Q1 we literally had been selling GA software for six quarters, just six quarters, right? And so, when you go from five pre-GA customers to 310 at the end of the sixth quarter of selling, that's a fast uptake, right? And we're early in the market, right? So I think we need to stick to what we do well and enable that. I've got to ask you a question, because this is interesting. It's something that I haven't reconciled yet. In the old days, you mentioned previous generations of open source, distribution was everything. Yes. That was packaged software, right? In a way, with Red Hat, going back in the day, downloading stuff. Why is that not a big deal right now? Is it because distribution is frictionless? The distribution variable doesn't seem to be a big deal in this market. Is it or isn't it? What does distribution mean? So, I'll relate it to the open source space. From Hortonworks' perspective, our platform is 100% open source, Apache licensed, right? You can go to our site and download Hortonworks Data Platform as well as our sandbox, which frankly gets a ton of downloads, with a lot of partner tutorials and that kind of stuff built into it. What we find is, you remove the friction on the consumption model. People will download it and prove the art of the possible. And if your technology is good enough, you've earned the right to have a conversation with them. And what we're seeing as it plays out, particularly with the folks who download it, prove the art of the possible and focus on single apps, is you'll get 10 to 20 node clusters where they will just call for a quote through the web, right? We had no interaction with them other than providing solid tech that they can get frictionlessly, right? 
Not "Mother, may I?", not "if I want added functionality, I need to get a license" and those types of things. You remove all the friction on it. And that's been part of the model: if we can get the platform out in as many places as possible, whether cloud, on-prem or appliance, we don't care, right? Also, with commodity open source hardware, with Open Compute and these kinds of trends with cloud, there's no gatekeeper. Exactly. So, I mean sure, downloads are great, you can download anywhere. But you're not relying on bundling, so that gives everybody a free shot. Exactly. So to me, the distribution, it's a major game changer, because if there's no distribution involved, that means you can't compare old models to it on that distribution side. On the consumption side, if you have free distribution, all free access, then the uptake is going to be on the platform side. Who can enable me to be super agile and to build value? Exactly. And the other piece of the open source model is, there's a balancing act if you do introduce commercial extensions to that. Over the past 10 years, having done a mix of pure commercial, pure open source and a mix of both, what I've seen is that the commercially licensed elements tend to get a much lower attach rate, because the other stuff is like oxygen, it's free to download and use. And so there's an inherent friction built into how many people actually embrace the other. There's the fear of lock-in, or it's less available, if you will. I mean, it seems to me that commercial is more about the comfort level and having a throat to choke if something breaks, because I paid and I know they're going to answer the phone. Exactly. And that's really what commercial is, whereas open source is really about leveraging a community to drive innovation at a ridiculously high rate along a number of fronts. 
And what I think is, everyone's going to make a lot of money and it's a pretty good size of pizza, but the much more compelling story, I think, is that you guys are barely tapping into the value that you're actually releasing, not for the technology whales, but for the big whales that are innovating and using this software to unlock huge, huge amounts of value. That makes your business, and the sum total of all the vendors playing in the space, marginal, tiny compared to the value that you're going to unlock with the GEs and the Boeings and the GMs and pick your favorite telco and electric company. I mean, these are huge amounts of value that you guys get to extract with the perfect storm of technology infrastructure, ubiquitous networks all over the place, cheap sensors all over the place, and now a way to capture all that data, a way to analyze that data, and a way to drive that data down beyond the data scientists so a lot of people can do A/B testing. I mean, the value creation is way bigger. And that's when Rob covered the size of the market growing to 50 billion across hardware, software and services. That's just one facet. To your point, Cowen and Company had a great way of phrasing it a few months back. They were like, there's a big opportunity in the enterprise Hadoop platform space. The bigger opportunity is for the ecosystem that can build on that and drive increased value. The biggest opportunity is for the big data practitioners and the people leveraging it to transform their business, right? So for every dollar spent on the Hadoop piece, there's $1,000 of return to drive the business, right? And that's really, I think, the opportunity. I mean, Rob gave an example of a process that cost $19 before and now costs 23 cents. I mean, that's phenomenal, orders of magnitude. That's different. 
Well, in the Wikibon survey that Jeff Kelly is going to release soon, 59, 54% of those surveyed said that it's both top line revenue and cost savings. So you have not only the dynamics of the disruption, it's both. I mean, to me that's the telltale sign. Exactly. And I think it enables companies to be data driven companies that just happen to wrap that data around a particular group of atoms, which might be a car or might be a hydroelectric dam, but at the end of the day it's the data that differentiates them from the competitors in these atoms. It might be seeds that farmers plant, and you want to analyze which portions of the crops do well. That's happening. I think you guys are exactly right. You know, analytics down to the seed level, for instance. There are new ways of thinking about what you can do with this technology. I mean, if you think about companies that have all this data, and Rob talked about the volumes of data in his keynote, when the infrastructure gets set and you've got the containers and you've got the plumbing, if you will, the infrastructure of the data, and now you have a software-driven platform, then the app side is going to absolutely explode. Then you're going to start seeing more TrueCars. And that's when data driven becomes a very interesting piece, which kind of brings me to my final question to you, Mr. Strategy, the man with the chessboard. What is the next big chess move? Is it data virtualization? What are the cool things? Cloud becomes the intersection point now. Cloud and big data are colliding in a huge way. So at the platform layer, you're going to have network virtualization. Do you see some data virtualization coming? So, here's how I net it out. 
In 2009, when I was part of SpringSource and this whole PaaS wave was starting, so with the Pivotal guys, I have an idea what that was, because I was part of that, it's one thing to solve the agility of creating applications, but how do you solve the data problem? And in Arun's session yesterday and in his keynote, he said the worlds of PaaS and Hadoop were colliding. The collision's happening, you're going to see it, right? And so, aligning with things like Docker to enable the long tail of any apps that you want to package up, and making it easy to deploy and get the benefit of the bargain of data locality, that is playing out this year, and next year it's going to accelerate quickly. Your other point is about the different destinations. The thing I covered towards the tail end of my keynote was the notion of tethered clusters. That is, you're going to have BI in the cloud for business users. You'll have archiving in the cloud, but it's active. You'll have on-prem. You need a sense of connectedness across all of these, and that's where it's going next. All right, awesome interview. Really appreciate you taking the time. Final comment, I'll give you the final word here. In your own words, tell the folks out there, net out why this point in history in the computer industry and tech industry is so important. What is the big aha, the game changer? We are living in a collision of events where disruption is happening across all sectors, right? Whether it's data, mobile, cloud or the Internet of Things, it's a confluence of events. There have been other waves before. We are in a unique position to see it all play out before our very eyes, and it's really exciting. Mark Bajan says it's a 10-year maturation process and he thinks we're really early, very early, and it's exciting to see everyone really kicking ass, taking names, doing good business, being successful. 
I think that is something that's exciting to me, because at the end of the day, when you take all the hype and fun aside, you can see the work getting done and the foundation being set. Hadoop is talking about business outcomes. You guys deserve- It's not tech for tech's sake. Listen to the people talking about the value, right? There's real value. It works, and you guys really deserve a lot of credit. Congratulations. Thank you for being so open and transparent about everything you do. Thanks, John, Jeff. I appreciate it. This is theCUBE. Of course, we're extracting the signal from the noise and sharing that with you. We'll be right back with our next guest after this short break.