SiliconANGLE and Wikibon. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE, and I'm joined by my co-host. Hi, buddy, I'm Dave Vellante at wikibon.org. It's our pleasure here to have Herb Cunitz, who is the president of Hortonworks. Herb, it's a real pleasure. We heard this morning, and we've heard the last couple of days really, that it's been a love fest on Hadoop 2.0. And the great thing about this conference is that normally when you're at a vendor conference and everybody gets effusive about a new product, it's like, okay, here's another vendor's product, but this is a community's offering, you know? And there's a lot of excitement, and it's widespread, so you've got to be thrilled with the reaction. We're super excited. And I love the term love fest, so it's a great term. But yes, we are very excited about the reaction because what we've seen is, with the emergence of Hadoop 2.0 and what companies are starting to look at, it just opens up a whole new set of use cases. And we just had a customer panel where we had five customers there talking through this. And probably the most common theme that they went through is that Hadoop 2.0 does two things for them. One is it gives them a platform (one used the term substrate) that they can run all these different workloads on. But even more importantly, a lot of the things they had to build around the platform, such as security, authentication, and data lifecycle management, are now all going into the platform. And the whole next set of users gets that out of the box. Well, one of the big surprises, for me anyway, in the audience was that one of the customers talked about basically replacing a lot of their mainframe workloads with Hadoop, which was shocking to me. You don't typically hear that. To your point, a lot of this robustness is being designed in, and that's what people are expecting.
It's truly helping Hadoop become that next-generation data platform. I think that's the most interesting part of Hadoop 2.0, and most specifically YARN, which is the piece that really allows that platform management. Yeah, and you guys are putting forth that vision of, I'll call it the vision of coexistence, right? Although there's a segment of the population that is saying, you know what, we've got a blank piece of paper, we're going to start from scratch. So how do you see that all playing out, the sort of coexistence crowd versus the white sheet of paper group? So I'll go back to the panel, to what some of the customers said today, because I think they're living through this and they have some great perspective on it. The view is start small and find, typically, a new application or a new data type that you can start storing. And usually once they've done that and they start to get some value out of it, they go through the usual things of funding and cultural implications and how they roll that out over the organization, what that means. But pretty quickly it grows to, how do I create some version of a data lake or a large set of applications where I can store this information? John calls it the data ocean. Data ocean, data ocean, that's bigger than a lake. You're going big. I do not like the lake analogy. I think ocean, because, you know, there's rip currents, it's just different dynamics. You've got the tsunamis coming in and, yeah. It's so challenging. It can be peaceful and... We've now gone global to the ocean. Sorry, data ocean. But what we see is that concept, and once they do that, then Hadoop starts to fit into their environment. It's not there to become the platform to rule them all. It's there to become a component of their next-generation data architecture, and that's where the companies settle out.
And like anything, as the new toy comes in, they probably try to do too much and then they settle back to, here's the rightful space for what I can go accomplish. Herb, I've got to ask you, the big news coming into the event was the financing, $50 million in fresh, fat financing, as we said on theCUBE in the intro. As the president, you and Rob and the team are building out the vision to stay independent and all that. There were discussions of M&A, you guys going to get bought out and all that. There's always those kinds of rumors. The $50 million kind of puts an end to that conversation. It's a lot of financing. Certainly if there are M&A discussions, the price certainly went up given the VC dynamics that we all know. But in all practicality, describe what you guys are going to do with that $50 million. What's the growth strategy for Hortonworks? Obviously Rob was talking about the continued mission of 100% Apache, so he was right on the same message. But operationally, as you guys operationalize your growth strategy, what are your plans and what are you going to do with that financing? Sales and marketing? More growth on the coding side? What's the use of funds of that capital? It's a great question. In some ways, the financing is fantastic, and we're very blessed and excited to have investors and others who've come in to do that. But in other ways, for us, it's business as usual. It's almost a non-event, just allowing us to continue to grow the business against the plans that we've already had in place. So as we look at that, there are probably two core areas where we're starting to invest that capital. One is on the engineering side, to continue to build out the teams as we contribute to some of the open source projects like Project Knox, Project Falcon on data governance, Hadoop 2.0, and YARN. So we'll be able to put more contribution into the community with more engineering talent. So that's part one.
Second is definitely on the sales and marketing side. We need to get more global footprint, with expansion into Europe and expansion into Asia, having some of those capabilities in some of those other places. And a lot of it is what we're doing with the partners, creating things like what we announced as a YARN certification program, where we set up a cluster and give companies the ability to test running on YARN. So from an end user perspective, they know when they deploy that this has all been blessed and proven to work together. Those are some of the core areas we're going to put that money into to continue to grow this. But again, in many ways, this was all part of the plan and it's really just business as usual. We're not trying to say it's going to change us in any way. This isn't, you know, we won the lottery, we're changing our life and buying a new house. Yeah, yeah. I mean, it's certainly not a pivot. That's what people would say. Hey, I mean, I don't like the word pivot either, but it's what you expect, right? Obviously, with the market growing and the demand: engineering, the field, essentially sales and marketing, global footprint, and then the ecosystem on the partner side. Exactly. What about the Microsoft relationship? There's been an announcement. We saw that Microsoft had a keynote. Explain to us what's going on with Microsoft. Microsoft's a really interesting relationship. You know, everyone says they're all in on Hadoop. Microsoft's going all in on Hadoop. And what they're looking to do with this is, you know, we had Quentin Clark here, who runs all their data platforms. And they're looking at Hadoop as a central component of that, on-premise and in the cloud. And with the relationship and the partnership we've set up with Microsoft, and what they talked through today, they now have two ways that you can work with Hadoop. One is HDP for Windows, which runs on-premise. And then there's Azure, with HDP running inside Azure in the cloud.
It depends on what your usage pattern is and how you want to work with Microsoft. The second thing they announced, which was interesting, is that System Center, which is what many Microsoft customers have invested in as their management platform, their single pane of glass for managing their environments, can now interface to Project Ambari from the Apache Software Foundation. And that makes it easier for Microsoft to say, I don't have to build all those capabilities in. I just interface to Ambari and all that gets surfaced. And if you're a Microsoft customer, you get a single pane of glass to manage and provision all your environments. Very simple, very easy. And also a huge Excel play, right? The biggest BI tool, as they said, with the billion Excel users. And they gave a really interesting demo, a geospatial demo in Excel. Which, of course, everybody uses, everybody understands. But I do feel like the industry is looking for more integration. And so Microsoft clearly had a good story there. Didn't you think? I mean, in terms of their commitment to this vision of Hadoop, he basically said flat out, we see this as the future. And if you think about the workloads that are out there, roughly 73% of all x86 server shipments are on the Windows/.NET platform. This opens up an entire new population of Hadoop users that previously was not reachable because there was no distribution on Windows. That's now out there in the market, opening up a whole new set of opportunities. Herb, I want to ask you about you personally. Obviously, you're a tech athlete, as Dave and I like to say, because you're out there. But you have a career going back to IBM. And IBM is very customer centric and very market driven, very, you know, product focused. And then obviously going up in the open source community with Spring, and then VMware, and now here at Hortonworks. What's changed in the marketplace from your perspective?
Obviously, you know, from Spring and open source and just solutions in general. The discussion here seems to be about enterprise grade. But the theme this year in terms of the emerging markets is that solutions are critical. People want solutions. Certainly at the OpenStack Summit we heard from developers, but there was a solution and business value conversation going on. So I want to ask you your perspective. What's changed in the market in open source? Not just the acceleration, but open source has kind of a bottoms-up, organic growth model to it, and solutions kind of come later. But today there's more pressure on solutions. What's your take on the current marketplace with customers? What are they looking for in solutions, and what's different now in open source than in, say, the previous generation? Great question. Great question. So I'd say there are a couple of aspects to that. One, open source has emerged and matured to the point, inside of enterprises and the world, that it's accepted as something they can go do. It's not this dangerous thing that you need to keep in the corner. It's accepted. It's a standard component of what they're going to build on. So that part has evolved to the point where the maturity cycle is there, and companies are comfortable leveraging it and using it. Second, and this really goes to the value of a community, is that the community continues to grow and scale and continues to funnel work and innovation back into that community. It just scales and helps the market grow faster. And as open source becomes more adopted, we're seeing that lifecycle accelerate: where that may have taken five years before, those things are now happening in six-month spurts. We had someone on stage from InMobi who talked about how they took all the work they had done on serving ads and created a whole data lifecycle management platform.
They contributed all that back into the community, and now we work together with them to architect that and put that in the right way. So that whole open source side has really changed and expanded. Second is really from a platform perspective. Companies are looking to put a platform in place, and I think the data industry is one that for many years has grown and scaled and done extremely well. Hadoop is something that, because of the economics and because of the capabilities that you now have in terms of what you can do, will actually be an inflection point that changes this industry and this market. It doesn't mean the old market goes away, but it opens up a whole new world, and the whole new world is that all these types of data that were just thrown on the floor can now be leveraged to make smarter, more predictive decisions. And that idea of moving from reactive analytics to predictive decisions is really what's driving this trend. You know, we've talked about open source being kind of the ratification model now, where open standards are being developed by the crowd. In the old days, the IETF and these organizations would ratify the old stack, the OSI model. Now these stacks are developing and the ratification is the open source community. So we're hearing that from a lot of the top executives and tech guys all around the industry this year on theCUBE, that these new standards are being developed by the crowd. And the OpenStack success kind of points to that, even though it's really emerging and early. And the second thing is that Hadoop is an enabler, a disruptive enabler in a good way for innovation. So I've got to ask you the application question. There are a lot of developers out there that are, I don't want to say waiting for the platform to mature, but they want more meat on the bone with Hadoop. They're like, hey, we want to make it easier. I don't just want to program software on top of Hadoop.
So moving up the stack, what's your view on that application market? Is it still early? What has to happen to enable that new tsunami of applications that are yet to come? Great question. And I like the way you described that, especially around enable, because that's part of what we believe: Hadoop can enable a whole new set of applications to emerge, but Hadoop shouldn't become the platform that you build or write all those applications from. So if we can help enable that to emerge, I think a couple of things are happening. One, with what's happening with Microsoft now, if you think of that entire VAR channel that is out there building applications to leverage data, they now have a new workload that they can leverage around big data, and that whole channel is now starting to think about how they can write applications specific to verticals and to use cases that they can then put out in the market. That adds a whole set of application developers and ISVs that can go build out. Second, we're seeing the emergence of a lot of analytics platforms that run on Hadoop, and you've had a number of them here on theCUBE, right, different analytics platforms that can now run and leverage that underlying information. I do think the next stage is then, how can you write a set of applications that take advantage of that quickly and easily? Herb, talk a little bit about your vision. When you guys first launched, you put forth this vision that half of the world's data was going to be on Hadoop, and as a data guy, I said, wow, that's impressive, right? And I think you had some dates in there, and I'm not going to hold you to those dates, believe me, I know how hard predictions are. But how do you see that shaking out? I mean, is that still the vision, that half of the world's data will be stored on Hadoop? Can you just sort of elaborate on that vision a little bit?
So I think it's stored and processed using Hadoop, in terms of what it can do, and we firmly believe that. And when we first said that, it was a prediction and it was a bold one. Love it. But you don't know if it's actually going to happen. It's like, we're going to put a man on the moon. Exactly. But now that we've actually progressed through that, it's accelerating even faster than we thought. And what's driving that is the market is starting to come together and say, if you're really going to get a market to function, and you start to standardize on an open source platform and the rest of the community starts to contribute to that, you can scale that much, much faster, and you really can get half the world's data to be processed there. Well, the economics of that are just so compelling. When you look around this show floor at what people are doing, you're talking about dramatically reducing your infrastructure costs over time and then investing in other areas of innovation around it. Everybody talks about the 70-30, where most of our money is spent on keeping the lights on and we're not spending on innovation. To the extent that you're not pouring money into what we sometimes call on theCUBE the container, you're adding innovation around that. We're adding innovation around it. And in a lot of ways, I'll say the beauty of what's happening here is that this is not a zero-sum-game market where this side grows because you're taking it all away from this other side. There's a net influx of new types of data that people want to manage and start to correlate against existing information. That growth of new information allows the existing platforms to stay, and now you've got a whole new growth. So I wanted to ask you about that. I actually was going to ask you the zero-sum-game question. So your premise is it's not a zero-sum game; there's a network effect, a rising tide lifts all boats.
However, having said that, if you don't change your business model and you today, let's say, are selling the container, or let's say you're an end user managing that container, if you don't change your skill sets or change your business model or change your value proposition, it may actually be a zero-sum game to you. Do you buy that, and what's your advice for people that are participating in those traditional markets today? I think if you do not embrace and start to figure out how Hadoop can help you, it could be a zero-sum game, and you could get disintermediated, because there'll be other companies figuring out how to grow top-line revenue leveraging that pattern. So people need to think about that. Second is, to a comment I made earlier, if Hadoop becomes the shiny toy that says I can move everything over, then that's probably not the right answer. And what'll happen is it'll settle back into a happy equilibrium between multiple platforms, to say what's the right way to store the data. And actually now, how do I share it across those platforms? And that's one reason we work closely with Teradata, because you look at it and say some of that information can come from the data warehouse and move to Hadoop. But in many ways what it is is actually, where should the data reside, and how do you best manage that across that distributed platform of multiple types of technologies, because they all have value if you do it correctly. Yeah, I think companies that have been disrupted in past cycles have learned a lesson. You see a lot less head-in-the-sand behavior, I think, around Hadoop than you have in previous cycles, certainly the PC cycle and even some of the early internet cycles. Herb, we're short on time here. I know you've got a busy schedule, but I've got to ask about the show here, about Hadoop Summit and Hortonworks' role in that, and your take on what's happened here at the show. And then I want you to talk about the community.
So really two questions. First is the show here. How's it going? What surprised you here? Anything that's popped out, the good, the bad, the ugly, that's been great, or surprises that kind of caught you off guard, or was it just what you expected, the vibe of the show? And then two, the role of the community going forward. Okay, so the show. We're very excited about how the show has turned out in a couple of ways. And I'll liken this to what Merv Adrian did in his speech, where he put up the suits and the hoodies. He talked about the difference between the suits and what they do and the hoodies who are doing the development side. And I think what we've seen in this show is it's moving from just hoodies to including the suits, and it's starting to blend both sides. I think you see that in the emergence of the sponsorships, in the number of companies who want to participate in this and how that's grown. In terms of growth, there are almost 2,500 people here at the show who are taking part in this and enjoying it. So when you start to look at that, the show has, I'll say, exceeded expectations, and we're honored to remain as hosts, co-hosts of the show with Yahoo, and effectively help act as ambassadors for the community. And to your comment on the community, I really think that's our role: to be ambassadors for the community and to help say, how do we help the community continue to advance what's happening? Can we take a leadership role in that and work with the community? And I used this analogy yesterday, and I'll tie it to the biking we talked about earlier today: in some ways I liken the community to the peloton in a bike race, right? Somebody can break away from the peloton and take a run, and may have a feature or an advance they can take off with. But over time, if that peloton, if that community sticks together and the market doesn't fracture, they will catch that. And you know what? They'll probably spit it out the back.
So if you do this right, the community is that peloton that will continue to grow. When it stays together, the market functions and the whole market advances even more. Talk about the ratification of de facto standards. It's always been kind of, you know, going back to other inflection points that we can compare this to. And I've said on theCUBE, I look at the market today as kind of the aggregate of the PC revolution and client-server combined in a shorter time cycle. To me it's that big. And Wikibon actually has data on that, but you've got a massive inflection point. And during those times, you had that OSI stack and you had, you know, certain standards on the stack that were ratified, actual standards. And then there was always the de facto standard. I remember when TCP/IP was kicked around, it was kind of pooh-poohed by IBM and others who had network protocols. But that became a de facto standard and enabled massive amounts of wealth creation and use and benefits. We talked about the standards bodies being the open source community. What do you see happening in that regard? What needs to continue to happen, or needs to change, to continue to accelerate the community-based ratification of standards? Standards are much more interesting in this world than before. Before, you'd have a body, a small group selected by a larger group, who would define what the standard was, and you'd have to work out competing and conflicting interests to determine what that standard was going to be. Now with open source, it's far more democratic across a much broader sample set. And ultimately, the market is choosing. The market's choosing what that standard is based on adoption and based on where the value is. And that democratization is really what open source is driving. I think that's overall better for everybody because it's coming up with a better result in terms of what they need. And frankly, it's actually inventing it even faster.
Yeah, and one thing we saw at OpenStack Summit was that the contribution model really becomes key. As they said, vote with code. And what was happening at OpenStack, and why it really crossed over from being more of a marketing hype, was that people really retooled and got together with this kernel of a group of people and said, hey, code will determine what happens, not just talk. And now you're seeing the convergence of communities like that, right? Well, you have the OpenStack community. You have what's happening on the Hadoop side. And now with Project Savanna, it's how do you help Hadoop become effectively the killer app to run on OpenStack, to leverage all that together? Final question for you. And then Dave might have a final question. I really want to get your opinion on this. Obviously, Amazon has made a huge move into the enterprise. How do you talk to customers about differentiating Hadoop versus some of the things that Amazon may do? So as companies are looking at Hadoop, we have this concept of data gravity, which is that wherever data is born, it tends to stay. So if data starts on premise, for the most part it stays on premise; it's not going to the cloud. If it's born in the cloud, SaaS and things like that, it's probably going to stay there. So the best way is, how do you build that in a hybrid model? And that's frankly why we're working with companies like Microsoft and others, like Rackspace, who we think have great technology and processes for how to do that effectively, and we can give them the technology to enable that. Okay, Herb Cunitz, the president of Hortonworks, here inside theCUBE. This is day two coverage, SiliconANGLE and Wikibon. This is theCUBE. We'll be right back with our next guest. I'm John Furrier with Dave Vellante. This is theCUBE, day two of Hadoop Summit. We'll be right back after this short break.