 everyone, this is SiliconANGLE, live in San Francisco at the Red Hat Summit. This is theCUBE, our flagship program. We go out to the events, extract the signal from the noise. I'm John Furrier, founder of SiliconANGLE. I'm joined by Jeff Kelly, filling in for Dave Vellante, analyst at wikibon.org, lead industry analyst on big data here at the Red Hat Summit, talking open source innovation. And our guest, CUBE alumni John Kreisa, the vice president of strategic marketing at Hortonworks. John, welcome back. Thanks John, good to be here. At Red Hat, obviously we had a long great conversation with the CEO, all the presidents and vice presidents, all the top management. They got a spring in their step. I mean, the rhetoric that there'll never be a Red Hat of anything other than Red Hat is, doesn't certainly, doesn't include Red Hat. They're doing well and poised, positioned well, pole position for the cloud. We did talk a little bit about data. Now Hortonworks has a similar approach to Red Hat in the Hadoop ecosystem. Some say the Red Hat of Hadoop, which, if you look at it, you can say, hey, similar paths. Our models are the same, or similar, yeah, sorry. Business models are the same. And the discipline required for that really is a long game. So I wanted to first ask you, there's a lot of activity in the big data world. Obviously Intel made a strategic investment in Cloudera at a $4 billion valuation. I mean, amazing validation for the big data space. That's big news, that's got everyone's attention, it takes Hadoop to the top of the front page of the business press, the industry press. And here at Red Hat, back in the trenches, where all the work gets done, is the open source community. So talk about the Cloudera news, how that impacts your business, and how that relates to some of the things that are going on here at the Red Hat Summit. So I think just overall, it's a good validation of the market in general, right? 
That the large vendors continue to invest in the community, much like a lot of the partnerships that we form, the partnership we have with Red Hat, and other major vendors, in terms of making sure that they're investing at various levels, whether it's engineering investment or other, I think that that move was definitely a validation also of the fact of how important it is to have committers to the core of this technology. And really that it's important to be able to drive the next generation of this technology, move it forward for the enterprise, you've got to really be the company that can really innovate on that technology. And that's really what Cloudera- We had Doug Fisher on earlier from Intel, who's a group vice president and general manager, and he was very clear. Hortonworks is definitely very much a partner of Intel, and you guys have a lot of partnerships, you've been very successful with your partnerships. So talk about the ecosystem and specifically your relationship with Red Hat, can you just unpack that for us please? So first of all, it's a great partnership. We've been working since last year on OpenStack, integrating OpenStack with Hortonworks state of the art platform in order to enable Hadoop to be deployed in that infrastructure. And then in February we announced a deepening of the relationship. It's indicative of the kinds of relationships that we form with the major software vendors. And there's a particular simpatico nature to the way that Red Hat works with communities and the way that we work with communities. In other words, work in open source, upstream projects, take and develop and innovate that software, bring it downstream, test and integrate that together, apply it in a project. And then we'll go over to you for a second. Set out with open source. What does upstream mean? We've heard that before. And bring that down. Just explain that real quick. Sure, sure. 
So there are open source projects that a very broad community of developers are working on. So when I say upstream, I mean, I'm working in some Apache project. Apache Hadoop might be one of the core, might be Ambari or any one of the other open source Apache projects that somebody is developing and committing code to. That community, there's thousands of developers who are contributing code into that kind of open pool of capability. So that's upstream. Ultimately what Hortonworks does is we work in that upstream notion and then curate that down and take the most stable versions of each of those open source projects from the upstream, test and integrate that together and then apply a very detailed and rigorous level of testing and integration on all those and put it out as a platform. It's critical. I imagine there may be tension at times between the open source community and the different vendors at play that want to push one, you know, one particular project or feature over another. But that's part of your job and it's part of the business to balance those and package those up. So I'm sure there's a lot of interesting conversations behind the scenes. They do. And frankly, I mean, it is one of the reasons that those vendors, you know, partner with Hortonworks, because we guarantee that the changes that they want to get will get committed into core Apache Hadoop. We don't hold anything back. We're not looking to lock anybody in a particular version of the platform. We want to make sure it's all out there for everybody to benefit from, which is a very key piece of our strategy and it's why they want to work with us because they know we've got the committers, we're working with the communities, we know the processes and they can be assured that enhancements they need will flow into the overall community. So why is it, can you expand on that a little bit? Why is that so important to the partner organizations? 
That those, that, you know, things that are important to them could actually be put into the larger, you know, the core Apache Hadoop versus if they were to just do it kind of on their own and have a little bit of a fork. Why is that such an important thing? Because they want to know that the code that they're working on, the investments they're making will be there and so effectively they want to know that the changes can get made, can get committed back to the core trunk of the code line regardless of what project it is. It might be Ambari, it might be Hive, it might be core Hadoop. They want to know that that code will be there so that, you know, 10, 20 years from now that'll still be there. Like they don't have to worry about what's going to happen with Hortonworks or the overall community or how things will shift. That code and their investments are safe and secure. So that goes for the enterprises as well just to be clear. Well, absolutely. I mean, they don't want to invest necessarily in a platform that's going to not be around and not be supported. The ecosystem's going to change, there's going to be all sorts of developments from around the vendor, so they're basically trying to future-proof their investment. It's exactly, they're de-risking their investment. Right, which obviously makes a lot of sense. So let's talk a little bit about Red Hat and your relationship. I know that you've got kind of an integration on the storage layer, but what's interesting to me in addition to that is the JBoss part of enabling application developers. Because we always hear, hey, where are all the big data applications? Talk about how that partnership hopefully is going to enable more application development. 
Yeah, so, and I don't think we've had a chance to talk since we announced in February our kind of broadened engineering initiative to bring Hadoop to the open hybrid cloud, but that's really where we really just started to get a very strategic, I'll call it, in terms of how we want to enable new kinds of analytic applications to run on more of a complete platform approach. Everything from, as you said, the storage layer all the way through to, and what is a really super broad application deployment model through JBoss. So integration at the complete stack of Hadoop into both storage up into JBoss, and then of course into OpenStack for that deployment model gives enterprises and enterprise developers more flexibility and freedom to have kind of an open way to adopt and develop on these tools. And so what we just announced yesterday was an extension of that to OpenShift, giving the platform as a service capabilities and making sure that Hadoop integrates deeply within that infrastructure as well. So it's just a further deepening of a relationship that really is already starting to bear fruit if anybody wanted to see demonstrations of what some of those capabilities are. We've been showing it down in our booth and in some of the sessions here. Let's talk about the fruit that might be coming on the tree or bearing fruit because it's really early. We asked every guest, certainly the senior and the geeks that were here about something that we've been putting forth as our vision and that is data first. Got mobile first, we've been there, done that. Now you hear cloud first, which is all the cloud wars. And now what we're coming on around the next corner is data first, which just hasn't really hit the mainstream yet. So the folks seeing this is kind of fresh kind of out in the bleeding edge. It's coming where developers are acting on data as a resource and the systems guys aren't used to dealing with data. 
Look at data as just a storage subsystem or something related to storage. But now we're seeing that. So I asked everyone the question, where does data first fit into the architecture? So I want to ask you what you think, one, about data first as a core component of the architecture, and two, as you talk to customers and partners, what are they, what are you hearing? Where do you see that data layer sitting? How does it going to fit in? Does it fit in nicely? Is it composite, modular? What are you hearing? Yeah. And actually it's interesting that you bring up that meme because I think we've been saying it wasn't Hadoop that disrupted the data center, it's the data that disrupted the data center. And it's really the realization that these organizations can now and want to capture new kinds of data and exploit that for new kinds of applications that they simply couldn't technologically or financially before. And so that's what Hadoop is enabling and Hadoop is kind of a data first architecture, right? It's from many different kind of ideas in terms of concepts. The idea that I can just land any data into the platform without doing any pre-processing on it whatsoever. The idea that I can apply and land multiple different kinds of data, not just one single kind, but video and log files, machine generated data and blogs and tweets and whatever I have, I can land it into this giant pool of storage in the Hadoop distributed file system and then begin to process and iterate on it. And as I heard somebody say, torture the data until it reveals its value. It's really what you can do in terms of getting the value out of the data. So I think Hadoop is kind of that and would enable that data first kind of mindset. Let me just capture it and then figure out what the value is over time. And that that could emerge that there is tremendous value or might emerge that there's none, but at least I can get it to do something with it. 
What specific conversations are you having with, if anyone said, hey, I see it fitting here. What use cases are popping out now as early use cases of the first generation? So many. I think it's, we see use cases across, firstly I'll just say, really almost every industry, if not every industry, whether it's manufacturing, oil and gas, you guys have heard this before I know and maybe some of your audience has as well, but the use is broad. And what we always tell organizations to do is start with a single use case regardless of industry that you're in and get that successful first. And it usually starts from our standpoint, regardless of industry, with a line of business driven initiative. So better service my customers or do a better job of capturing prospects or better predictive and proactive maintenance or a better job of processing my healthcare data. Whatever it might be, it's line of business driven to create a single analytic application exploiting this new kind of data, to use your term data first, to exploit this new kind of data in that application. So the line of business drives that, but then it can expand very quickly in a bunch of different directions. So how about Microsoft? They had an event today in San Francisco which I missed, Jeff Kelly went to, I believe you went to. So it was a great event today. I thought you meant probably 10 times, maybe at least. So it was great from that perspective. Did they call you guys out specifically? They did, actually Kevin Turner mentioned Hortonworks a couple of times in his piece of the presentation, which was great, so we appreciate that. It's a great relationship with Microsoft, one that we've had for over two years now. So around February, March of, announced that, started off in, to next to Windows, and then finally expanded where they have now products in market that are built on Hortonworks Data Platform. So we've worked with them collaboratively on those initiatives to help them bring products to market. 
We also work with them on the Stinger Initiative. Microsoft was very involved in the Stinger Initiative, more than 6,000 hours of engineering time contributed to that project. They helped both with things like the query planner inside of Hive, SQL Server engineers, deep expertise in query planning, working on Hive and an open source project. Just not necessarily the relationship or things you'd expect, which were quite satisfying and really great to see in the community, as well as things like helping with ORC, the Optimized Row Columnar file format. So improving the compression, the on-disk compression, and the ability to read and process those files. So really they're very, very involved in open source. I think it's kind of eye-opening, Hadoop's one of the areas that they are particularly involved in, and one that we have a great partnership with. Yeah, you know, having been at that event, I thought it was a good event. You know, I think from my perspective, I did expect a little bit more from Satya Nadella in terms of kind of the larger vision. I mean, I think it was a solid vision. They talked about the platform and how they have the tools and technologies that kind of span from the infrastructure and Hadoop up through the analytics and visualization. But I was hoping for him to kind of inflame my imagination a little bit more. That's kind of what I was looking for from Nadella, from a CEO of Microsoft as they're trying to transition in this role. I mean, obviously they've got this huge legacy business. And I think from a practical standpoint, it's smart, they're trying to use their install base to kind of infuse big data into their customer base through things like Office and SQL Server and other things. It's a smart practical move. I was hoping for a little bit more, getting a little bit more fired up about the possibilities. You know, I think some of the demos they did were solid and practical, but not really kind of. 
Was it big data washing in your opinion, or is it legit? No, I don't think it was big data washing, but I think what the industry needs from players like Microsoft, from IBM, from SAP is really big vision. You don't think that they had that in terms of like the ambient intelligence? They did talk a little bit. I'd say they alluded to it a little bit. It felt like manufactured thought leadership to me. When I read the buzzwords, and the NBA, come on, insights, right, come on. I thought there was a little bit, he started off with- It sounded like they were groping for some thought leadership. That was my opinion. Of reading the post, I wasn't there. I think he scratched the surface. I think they can do a lot more. And having had Satya on theCUBE before, I know he's a smart guy with a really great vision. Right. You know, and this happens in steps, it's not going to happen all at once, but I am looking for them to show a little bit more of a grander vision about not just the platform, but what the technology can do with it. But they're directionally correct in your opinion. Oh, absolutely. They're not way off. They're not off. They're on track. I think the platform strategy and messaging is right on. I just want to hear a little bit more larger vision and less about, oh, you know, we're going to make Excel a little bit more effective. But tell me how you're going to build new applications and do new things with this data. Because really that's what it's about, big data is about, you know, enabling new lines of business, really disrupting existing markets. You know, it'd be a little hard on them. But I think- Well, I think you had a good point. I mean, one of the things I noticed with Microsoft, and this includes our friends at EMC, all the big companies, is developers want trust. And Microsoft's earning trust, you guys mentioned some of those announcements, but of course, if they didn't do that, I mean, people would still be skeptical. 
Could people look at, even if it gets a lot of skeptics, oh, they got my data, look at the Nest thing that blew up in their face, right? You know, trust is huge in open source. There's a lot of land grabbing going on, a lot of data washing or so, so trust is a huge thing. Just open source, this is not new to open source, is it? The trust, the transparency. Yeah, no, I think that's, it's core for the success of any open source project, generally speaking, to have and establish that trust. Trust with the community, trust with the enterprise, trust with the ecosystem, just generally speaking. So, I mean, I do think that Microsoft is establishing that and continuing to work on it. And I know at least in their interactions with us, you know, we see that kind of being very open and honest and wanting to contribute on these kinds of activities. So, you know, I think it is broadly speaking, you're right, John, something that has to be established, I think Microsoft, to stay on topic, is one that's establishing their trust in this area. They're working towards it. I want to switch gears just a little bit. I mean, still now kind of switching more to the competitive landscape. Wanted to get your take on a move from a competitor, Pivotal, that they announced, I think just last week or maybe the week before, that they're now essentially giving away their Hadoop distribution, Pivotal HD, and support for free. So, obviously they're putting price pressure by giving away for free on the rest of the vendors in the market. What was your take on that? And does that have an impact on your approach? I mean, what does it mean for Hortonworks? We don't really see Pivotal very much, I'll say. So, not really much pressure if you don't see them. Secondly, I think they're probably just embedding that support cost into some of the other products. I mean, they're doing a complete platform sale and they don't really value Hadoop. They don't really contribute much, if anything, to Hadoop. 
So, from them they're trying to sell you all of the application stack and all the other components, they're going to get their pound of flesh one way or the other. So, I don't really believe that it's free. You mean from their install base? I mean, from their customer. Or from the big data bundle or whatever it was that they called it, there was going to be some embedded cost because there's expense in supporting a platform. I'm not going to do that. Talk about Hortonworks for a second. Where are you guys now? I know you're still hiring, you're still busting out, you've got parking problems as well, I drive by every day. It's like, you've got a big building and not enough parking spots, but give us an update on the hiring situation, staff, et cetera. Obviously you've got a hundred million in fresh funding. Hortonworks is doing great. We had, we're continuing to hire fast. We added 240 customers last year. We got 250 and continuing to grow in terms of customer, we had a great Q1 and just continuing to hire at a very, very fast rate. There's 25 or 30 people in the new hire class going on this week, so it's really, really great. And it's partially just really because of the momentum and the interest we're seeing, both through our partners and with our partners and then just directly with the enterprises. So from our standpoint, we're extremely happy and excited with the momentum. I think 2014 is the year that we see things going from kind of just POC to a lot, a lot of production with Hadoop. So a lot of enterprises are really looking to move things into and take those applications to the next level and they start small, but really when they get going, they grow quite fast. When Rob Bearden comes on theCUBE, we'd love to talk about some of the business model, especially Dave Vellante, loves to get into the business side of it around subscription revenue. 
That is a discipline and we asked the folks at Red Hat the same question we asked the president and the CEO, how do you stay disciplined to that business model? Because you could risk taking some of that heroin and get addicted to meth, whatever drug you want to. It's a little late in the day here on theCUBE. You go outside of the discipline, you can really get yourself in a bad position. So talk about your business model. Where are you guys, you have a subscription, you have some professional service, but are you guys still on track on that business model? Give us a quick update. Yeah, the short answer is yes, we're still on track, still focusing on having a subscription support business. Training and services are still a minor part of the business, still an important part. And we prefer to push as much of that to our partners, system integrators and others as possible. So why would someone want to buy a subscription service that you guys are selling? Just walk us through the customer use case because a lot of people, I mean, it's similar to, is it similar to Red Hat almost directly? Is it like, hey, I want support on Hadoop? Explain that. Yeah, that's basically the case. I mean, it's data infrastructure. So once you build an application which you're using to drive, either analyze a critical piece of your business or drive a critical piece of your business in some form or fashion operationally, you're going to want to have support on that, right? And that's the kind of premise around having a 100% open source model. We don't need to create lock-in. The platform is of the value that organizations will want to have that subscription support and that's proven out to be true. We haven't needed to change or modify our model in any way and we'll continue to go on that path. So it was the original premise, one of the original premises for the company and it continues to be true today. You know, I give Hortonworks a lot of credit. They are sticking to their guns. 
They haven't wavered at all in terms of business model. We try to knock them around too. We try to get them off balance, you know? We try to rattle them, but they just won't be rattled. Tough interviewing, but we're sticking to our guns. Yeah, you guys are hard to rattle. Hard to knock down. We'll go to the 12th round if we have to. Maybe there. You guys are doing a good job. You guys have been very good and open. It's been fun to watch the rise of Hortonworks when you guys came into the market. It's a short history. Seems like yesterday does. Not even three years old yet. Well, it comes back down to the discipline and you guys have experience with open source. We've talked to the management team over there. You guys get the business. And it really is, I don't want to say arms race, but you got to continue to focus on getting the code out there, get the contribution and have some tech geeks to do that. Anything new from Hortonworks coming around the corner? What's around the corner that you could share that's off the straight and narrow of your current execution? You know, I think we just believe very strongly in our model and are going to stick to our guns. I don't think there's sort of anything necessarily to talk about going forward. We just announced our 2.1 product at Hadoop Summit two weeks ago. I think a great place where you guys will obviously be is Hadoop Summit in San Jose in June. So, you know, definitely we're gonna look forward to having you guys there as always. I think it's a great place and a great venue to learn more about it. I'm sure we'll have some announcements at that that we'll be talking with you at that time, but you know, it's a great venue for anybody who wants to learn more about it. And certainly the funding market is really hot still. Venture capital is still in the enterprise. Lot of innovation left on the, a lot of fruit to be certainly coming off that tree soon. I mean, it's a lot of legs there. 
Tons of innovation going on in the community. I mean, you've got these things like Storm and other kind of open source projects, which are, you know, just still coming into the platform and making even more things happen, right? You've got YARN, which is still new to the platform. So there's still many, many kind of innovations and integrations that are happening with YARN. And I think that's one of the things you'll see is lots of new applications being created, lots of new workloads being supported, all on that same platform. You know, we talk all the time privately and publicly about the bubble, especially living in Palo Alto, you're kind of firsthand exposed to a lot of the frothiness. But if you look at the development market as a bellwether, and if you look at like the first dot-com bubble, there was a lot of hype, a lot of PowerPoint slides getting funded. And obviously that burst for all the reasons, but still all those companies ended up becoming features anyway with the web. So, but what's different now about this innovation cycle or bubble is that certainly ridiculous valuations on the consumer side. But the enterprise side with the convergence into the consumerization of IT, really has a lot of meat on the bone. I still think there's years left on this current run, in my opinion. I would agree. I think that it's being driven again by the data to get back to one of the early things we talked about and data growth is not stopping, right? There's so much data coming from all the devices and all the different places that are generating data. That's an opportunity for enterprises to better address their customers, provide better service, do predictive maintenance. All of those workloads are highly valuable to the enterprise. Therefore, they're not going to stop investing in technologies like Hadoop and others. You know, we were talking, I was doing a one-on-one interview with Gil Elbaz from Factual, who's an inventor of AdSense from Google. 
He's been working for years now on this platform around getting location data. We're talking and I said, what do you think about the big data market? And he kind of rolls his eyes in our normal CUBE commentary kind of way. Oh yeah, there's little data, fast data, kind of the normal things that we would say. And then he made a comment that I found very interesting. He said, there's two types of companies. There's companies that are full of data or dataful and people who make infrastructure and software for those companies that aren't dataful. They're data vacuums, they suck the data in. So it brings up the point of this whole nother classification of if you believe that the data tsunami is coming like he just said, and we're just at the beginning of it, people will be full of data. That's going to present interesting engineering and opportunities. So as people become dataful, that's going to be a whole nother ball game. I mean, I think it's a very interesting analogy. I would say we see that kind of thing happening within companies and that's what they say. We don't have a way to capture, we've been looking for a way, I'll say, to capture and process this data and that's one of the things that Hadoop provides them, but because it's streaming in and spilling in, I mean, the term that you guys, I'm sure, have used and we've heard is data exhaust, right? That data was just falling on the floor. It wasn't being just, all of a sudden, you can actually, though, that would imply you're dataful. If you were throwing it away before, but now you want to capture it, it's like the guy who just gets so fat and big because he's got somebody saying, my final question should be this. So if you believe that Red Hat and all these companies are on the next edge of the next generation operating system. For the enterprise, for the world, distributed operating system in the cloud on premise, all new software architecture with virtualization containers, all that stuff. 
If that's happening at such a large, mega scale and with the validation of the funding that you guys are getting and the industry's getting in the data space, what inning are we in in data? How early is this funding validation with no anthem of data? Or are we even pre-gaming at this point? I mean, because that is a massive innovation space that's going to have to come very quickly. What's your take on that? I'd say we're no later than probably top of the first. I mean, we're definitely early on in terms of the technologies, the innovations. We may have the anthems and opportunities, right? I mean, just exactly for Hadoop as a technology has a long way to go. All these technologies can continue to evolve and just figure out new ways to exploit what they have. So, okay. Well, John Kreisa, Vice President of Strategic Marketing at Hortonworks, give a quick plug. I'll give you the final word about Hadoop Summit coming up. Give us some quick dates and what we expect to hear. So June 3rd through 5th, San Jose, obviously our friends from theCUBE will be there. Definitely if you were interested in hearing the latest of what's happening in Hadoop, both from the technology standpoint, the ecosystem standpoint and from the customer standpoint, over 120 sessions this year. Bigger and better than ever. Look forward to seeing everybody there and seeing you guys there as well. Okay, that's John Kreisa with Hortonworks. This is theCUBE. Thanks for watching. We'll be right back after this short break.