Live from the Sands Convention Center, Las Vegas, Nevada, extracting the signal from the noise. It's theCUBE, covering HP Discover 2015. Brought to you by HP. And now your host, Dave Vellante. Welcome back to theCUBE and HP Discover, everybody. This is Dave Vellante. Joseph George is here. He's the director of Big Data Marketing, sorry, Big Data Solutions for the Apollo Server Group. He's joined by Mark Lochbihler, who's the director of partner engineering at Hortonworks. Hortonworks is, of course, a company that HP has invested in. Very close relationship, gentlemen. Welcome to theCUBE. Thank you, Dave. Great to meet you. So, Joseph, let's start with you. Give us the update on the Big Data piece. We just had a great discussion on Apollo generally, but Big Data has been a tailwind for Apollo. What's happening there? Yeah, so for context, for those of your viewers that are familiar with the SL line of servers, the SL4540 was our big 4.3U, 60-drive server. Great for object storage as well as for Hadoop. We've now announced the Apollo 4000 family, so we went from the SL branding to the Apollo branding, but now we have a family of servers. The Apollo 4510, which is that one-node box we've talked about, we've shortened it, so it was 4.3U. We've shortened it down to 4U, and for those that are going to miss that 0.3U, I can send a blank or something like that if anybody really misses it, but the 4U has been very well received. We've added additional drives, and we now support eight-terabyte drives as well, so with a 4510, you get 68 drives. You can put 5.44 petabytes in a rack now with that kind of density. We have a Hadoop flavor as well, with three nodes, which handles the three-way replication that Hadoop does when it ingests data, and then we've also announced the new 4200. This is a brand new product. It's a 2U server, standard depth, standard height, similar to our 380s in a lot of ways, except that this guy can hold 28 large form factor drives or 50 small form factor drives. It's the densest 2U server on the market right now. So, I have to ask you, Mark, in the early days of Hadoop, it was, okay, we're just going to use commodity hardware, off-the-shelf drives, we're going to put it all together, and then, all of a sudden, the enterprise caught wind of Hadoop, and things changed quite dramatically. You know what I'm talking about. In the early days, you'd walk around Hadoop World, Hadoop Summit, and you didn't see any of the big storage and server guys there. Sure. What's happened to change that? So, something very, very exciting with the big data reference architecture from HP. We've been able to actually build an architecture that fits perfectly with Hadoop 2.0. We've been able to separate the compute from the storage with the 40-gig backplane, which is incredibly fast. You know, the big thing with Hadoop was, move the math to the data, get it really close, make it really fast, and distribute it. This 40-gig backplane is faster than the traditional architecture. Now, we're actually working two architectures with your teams, with the EG server teams. We're working a traditional architecture, and we're working the big data reference architecture with Moonshot supporting the app side of the house, where we can elastically scale just the YARN application side. It's separated from the storage side. Yeah, so I mean, that's the profound idea of Hadoop: function shipping. Ship five megabytes of code to a petabyte of data, right? Yeah.
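For reference, a quick back-of-the-envelope check of that density figure. The drive count and drive size come from the interview; the assumption that ten 4U chassis fill a standard 42U rack, with the leftover space for networking, is ours.

```bash
# 68 LFF drives per 4U Apollo 4510 chassis, at 8 TB per drive
echo $(( 68 * 8 ))       # 544 TB of raw capacity per chassis
# Assuming ten 4U chassis per 42U rack (remaining 2U left for switching):
echo $(( 68 * 8 * 10 ))  # 5440 TB, i.e. roughly 5.44 PB of raw capacity per rack
```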
But the challenge for practitioners has always been that they can't scale compute and storage independently. That's right. So that's the problem you're solving there. That's right, and actually it's interesting, and this is what's really exciting about being a part of this at HP: we are developing really innovative hardware, and we're also contributing to furthering the Hadoop community and the project as a whole by challenging some of, you know, Hadoop's not that old, but challenging some of that legacy thinking that you always keep compute and storage together. That's just what you do. That's always the first conversation I have when we talk about this, but now we've got something that's very purpose-built for storage, like the Apollo 4500, and something that's very purpose-built for compute. You can put all the MapReduce functions there on Moonshot. You can put all the HDFS functionality on the Apollo 4500s. And then there's the YARN labels project, which HP Servers was fortunate to contribute to. We've actually got developers in the server group that have written this code and contributed it to Hadoop. We were able to put all these things together with that 40-gig networking, and we found something that we could do just as performant, but in about half the space. In some cases you can see 2X the read performance, 2X the write performance. So we've stumbled upon something by testing and by being very eager to try some new things out in partnership with Hortonworks. We've come across some really cool innovation that we can share, you know, with the community. Yeah, and we've actually been showing, on the floor here at the show, a section of the cluster running HBase, a section of the cluster running Hive, and a section of the cluster running MapReduce, all separated, all running at full speed, and performance has been outstanding. Yeah, and it's important to note, you know, now we're getting to more of the temperature of data, right? Not all data is the same. Some is hot data that's accessed a lot. Some is data that's not accessed, it's cold data. And now you can start being more intelligent about what your Hadoop cluster looks like, right? If you need a much bigger storage presence for a particular workload, there's no need to add a combined compute-and-storage block; you can just scale your storage, and the same goes for compute, right? So it's really a more innovative way to start doing Hadoop now. You know, it's interesting, Joseph, you're saying, you know, Hadoop's not that old, but it's kind of getting long in the tooth from the original sort of, you know, instantiation, right? And if you look at what's happening inside of the Googles of the world, they're moving to, you know, new paradigms. And of course, you know, things like YARN and Spark. And the other amazing thing is the way SQL has come into Hadoop. That really changed things quite dramatically. It opened up a whole new world. And now, you say, all these enterprise requirements come in, because Hadoop's complicated. You know, we know, we use the stuff. HBase, ah, what a nightmare, you know? It's really hard. And that's why, you know, our data shows that for every dollar spent on a big data project, a year ago there was only about 55 cents of return, and now that's finally starting to change as some of these new capabilities come in. So my question is, how are you driving that sort of evolution into the enterprise? Yeah, great question.
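As a rough illustration of the YARN node labels mechanism described above, here is a minimal sketch of the moving parts. The label names, hostnames, ports, and queue name are made up for illustration, and on an HDP cluster these settings would normally be managed through Ambari rather than edited by hand.

```bash
# yarn-site.xml: turn the feature on and give it an HDFS-backed store, e.g.
#   yarn.node-labels.enabled = true
#   yarn.node-labels.fs-store.root-dir = hdfs:///yarn/node-labels

# Register the labels, then tag the compute tier (Moonshot) and storage tier (Apollo).
yarn rmadmin -addToClusterNodeLabels "compute,storage"
yarn rmadmin -replaceLabelsOnNode "moonshot-01:45454=compute apollo-01:45454=storage"

# capacity-scheduler.xml: let a queue use the compute label, e.g.
#   yarn.scheduler.capacity.root.analytics.accessible-node-labels = compute
#   yarn.scheduler.capacity.root.analytics.accessible-node-labels.compute.capacity = 100
# Jobs submitted to that queue then run only on compute-labeled Moonshot nodes,
# while HDFS keeps its data on the Apollo storage tier.
```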
In fact, you know, you mentioned a few of the big players, the Googles, the Yahoos, that were spearheading these projects. And you know, Hortonworks has a history of working with some of the biggest companies, and then figuring out what those learnings are and how we bring that to the enterprise and to other folks. And we start looking at, you know, how do we avoid reinvention when you're really trying to solve a problem, right? If you're trying to figure out how you can see more patients, or when you're trying to figure out how you detect fraud, those are the problems that you want to solve. You don't want to solve how you actually build a very superior Hadoop cluster. That's not the problem, right? So what we're trying to figure out together right now is how we jointly solve some of those problems for these enterprise customers, right? How do we figure out integration with LDAP? How do we figure out what applications are already in the environment? How do we figure out a services and consulting engagement so that we can have these whiteboard sessions? We're finding that these customers care more about how they solve their problems and the challenges in their business. What we want to do is make it simple to go and implement Hadoop with the best servers around, and let our customers focus on solving the business challenges. And I think that's what's different here. I think when you look at the really big players, you've got very advanced data scientists, you've got big staffs of people, frankly, contributing back to the project. As you go more into the enterprise, you're looking at people that want vendors like us to have done a lot of that already, plus some level of services for customization and integration. And that's where I think we bring the value: taking something that's very powerful at the high end and making it consumable for the rest of us. Well, I mean, the Hortonworks culture, I mean, the whole open source culture has always been about collaboration, but Hortonworks in particular emphasizes deep relationships. I know I talk to Shaun Connolly about this all the time. You know, it's not just, we don't want to just do Barney deals and press releases, even though there's a couple of those, but we really want to emphasize deep engineering. Now, of course, HP's made an investment in Hortonworks, you've got a board seat, but even independent of that, not everybody you have those kinds of relationships with has that type of investment. So can you talk about what that means from an engineering standpoint? Like, you talk about reference architectures, but what actually goes on there? How does that start and how does it evolve? Yes, so Joseph hit on it a little bit. He mentioned the collaboration between HP and open source. They contributed back to open source with YARN labels. So that's a perfect example of the deep relationships and partnerships that we actually have with our partner here. But on the idea of bringing an enterprise-ready cluster to market, we've incorporated a security framework around Apache Ranger, and the community has embraced Apache Ranger. There's a data governance initiative that we're very excited about, and that's actually people, process, and tech to make sure that we can deliver on that enterprise-ready strategy.
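On the LDAP integration point, a rough sketch of what that typically involves on the Hadoop side. The property values, hostnames, and user below are placeholders; in practice this is the sort of thing the services and consulting engagement mentioned above would configure and secure properly.

```bash
# core-site.xml usually points Hadoop's group resolution at corporate LDAP, e.g.
#   hadoop.security.group.mapping           = org.apache.hadoop.security.LdapGroupsMapping
#   hadoop.security.group.mapping.ldap.url  = ldap://ldap.example.com:389
#   hadoop.security.group.mapping.ldap.base = dc=example,dc=com
# Once that is in place, group lookups can be sanity-checked from the command line:
hdfs groups jdoe
# expected output (groups resolved via LDAP): jdoe : hadoop-users bi-analysts
```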
But there's a lot of contribution coming in from major players and major partners, and we're taking that and turning to the community and saying, these priorities are good for everyone, let's move them to the top. So it's a very exciting time in the engineering space, where commercial and open source are truly driving the next wave of IT. It's very exciting. How do you, Mark, figure out what part of HP you work with? Obviously you've got a relationship with Joseph's organization, but there's the software group, there's Vertica, there's HP Labs, there's the cloud group, there's Moonshot. We're doing a lot with the EG server teams, right? That's obviously primary. We're doing a lot with the software groups, IDOL, Vertica. We've got some exciting things coming down the line. Soon Vertica will be able to live inside of a Hadoop cluster. But we really are looking at what's most important to our customers and where we can spend our time. So yeah, it is a challenge. There are so many. We're working with the Helion team. We're talking about being able to deploy seamlessly into Helion, so we're working with that team. We're working with the OneView team to talk about, okay, how about using Ambari RESTful APIs to feed OneView, et cetera, et cetera. So you're right, we do have to pick the most important things for the market, and that's where we work with Joseph and get them to help us. Right, and I'll say that Meg's message is driving a solutions focus, right? And I think that's one of the values, right? I know we're humorously talking about all the different groups, but the opportunity there is that we do have Haven, that we do have a strong services organization, that we have a management capability in OneView, that we have a variety of different server compute platforms to leverage depending on the use case. And so what's happening now, I think, is we're looking at ways where Hortonworks actually provides requirements to us on what an ideal Hadoop platform is and what that looks like, and they're giving us requirements for our next-gen servers, and the other way around, right? We're trying to figure out what the next project is that HP can collaborate on, where our developers can contribute, so we can drive the project forward and drive these vertical solutions. And that's kind of what we're looking at now. So where are we in terms of commercializing some of these solutions? I know at Barcelona we had Steve Tramack on, where we were talking about the Moonshot solution. You guys have got solutions; where are we in terms of actually turning this into something that customers can buy? So customers can buy this today. Everything that we've announced as part of this project is available today. We've got a number of customers we've been in discussions with, great customers asking very deep questions. We're still learning about what's important and what type of customer likes what. So if there are customers interested in this architecture today, it's available right now. And additionally, as Mark pointed out, we do have the traditional architectures as well, right? We understand some customers, especially ones coming into Hadoop new, are fans of the DL380, and that is what we absolutely want you to use. As they start maturing, maybe there's something that's more purpose-built, like the Apollo 4530. And as they mature even more, there's this whole asymmetrical model, and most of the customers we're talking to now about this are more mature in Hadoop. So we've got this entire spectrum of solutions.
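A sketch of the kind of Ambari REST call that a management tool such as OneView could consume, as mentioned above. The hostname, cluster name, and credentials are placeholders; the endpoint shape follows Ambari's standard v1 API.

```bash
# Ask Ambari for the state of the HDFS service on a cluster named "hadoop_prod".
curl -s -u admin:admin -H "X-Requested-By: ambari" \
  "http://ambari-host:8080/api/v1/clusters/hadoop_prod/services/HDFS?fields=ServiceInfo/state"
# Returns JSON like: {"ServiceInfo" : {"cluster_name" : "hadoop_prod", "state" : "STARTED"}}
```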
So there's no reason to wait, Dave, at all. What about the go-to-market? So it's interesting, HP is splitting into two companies, each still a Fortune 50 company. Yeah, that's right. It's hard to do that. And so you've got this massive distribution channel, which is sometimes complex to negotiate, but then you've got small companies like Hortonworks and some of your competitors, and they always say, well, we've got 60 salespeople, we've got 70 salespeople, but then you look at HP and it's like all that goes away. So when you go to market, who does what and how does that all work? It is, it's a very tight partnership. And the way that we look at it is, we know where the expertise lies on either side, right? We've got a significant history when it comes to the infrastructure side of things, not just servers, but storage and networking and software and services. There's a variety of things that HP sales teams bring to the table. And then we partner very tightly with Hortonworks to have that conversation. And the conversation rarely is a let's-go-talk-Hadoop conversation. It is the log aggregation conversation. It is the content delivery conversation. Digital channel, or whatever it may be, risk or fraud. That's right, the 360-degree view of the customer, right? Those are the things that we're talking to our customers about. And it's great to see us actually in the field together. It's a great thing to watch, because you see two companies partnering very tightly on a very focused challenge for a customer and being very much in sync, right? Saying, here's why we are proposing this particular platform, and here's the configuration of the Apache distribution that you want to work with. It's a really beautiful thing. And by the way, it's certified. And it's certified. Do you have a North American bias today? I know that's like an evil word inside of HP, but you're a small company. I mean, I know your sales teams, they're animals, and I know, right, they're meat-eaters. I know that, but they're going to be stretched thin, you know, flying around. So is it sort of a Petri dish in North America, or are you sort of going global? So, EMEA: I've been working with the CTO team in EMEA from HP. We've been working with the big data teams at HP. We've been collaborating both in Europe and in the U.S. So there's the EMEA team, and that spans over to APAC. So this is a global thing. This is a global thing. We're working as much as we can. And yes, we do know how to scale pretty well at Hortonworks. So we absolutely make sure that we get the message out quickly, but we get it out on the right channel. And then we're there for anyone that needs more tech depth when we get into a real customer situation. So that's the real thing. Last year, we had a deal go down with a major healthcare provider: 1,000 nodes by the end of the year. It started at 165 nodes. That was a traditional architecture, but very, very exciting. And it was an exciting HP and Hortonworks win at a major player. So those are the kinds of things that I've been getting involved with, and there's a lot of excitement around Hortonworks. And this is where the global scale of HP comes in really handy, right? We're definitely embracing all the regions. We've certainly got experts in every region. And I'll say that with this very dynamic and organic relationship that we have with Hortonworks, we go where the customers are. And Hortonworks has been great: if we find customers in Asia, find customers in Europe, wherever the opportunity is, Hortonworks steps up and joins us. I can't tell.
If there's a global scale problem with Hortonworks, I haven't been able to see it yet. Well, it's interesting, you go where the customers are. So I mean, when we used to go to Hadoop World five years ago, it was all data scientists and geeks talking Pig and Sqoop and Hive and other stuff that nobody had ever heard of. And now when you go to Hadoop Summit, and you'll see this next week when we're at Hadoop Summit, it's predominantly mainstream IT people. Everybody I talk to has some kind of Hadoop project going on. I'm sure you do. And the questions that are coming in are really from the business end users. And they may not even say Hadoop, right? They're just trying to figure out how they service their customers. And then it ends up being a Hadoop conversation right after. We touched on it a little bit. You guys mentioned the data governance. A lot of small big data projects sort of spinning up, line-of-business led, a little shadow IT. No governance, right? And often organizations maybe don't have a chief data officer. Are you seeing that? I mean, how are you addressing that problem? Yeah. Or am I overstating that problem? No, and I think what the enterprises are starting to find is that even with a data scientist or a community of data scientists, it's hard to scale that model, right? And so a lot of times they're coming with problems. They probably have people on staff. But really, more and more, and I think it's the value that vendors like the two of us provide, you don't have to do it on your own, right? If you decide to take on a Hadoop project on your own, you're going to inevitably have some trial and error. You're going to have some things that work. You're going to find some things that don't work. Guess what? We've already done that. And we already know what most of that solution should look like. So the conversation now is, we are trying to figure out how we get media distributed. How do we use predictive analytics to figure out where police officers should be stationed at certain times of night? Those conversations start at the CIO or the business level. And then it works its way down to where the IT folks chime in and say, Hadoop would be a great way to solve this problem. Well, it's interesting. We talk about scale. Hadoop is all about scale. Absolutely. But interestingly, we were the first to quantify the market, and we noticed that the predominant revenue generator in the Hadoop space and the big data space was services, because it's so complicated. So there's a dissonance there. So the challenge to the vendor community, the broader tech community, is that in order to scale, software's got to do better, right? We have to, from a services standpoint, teach people to fish, and get to apps. Right now we're getting to the apps layer and starting to develop those apps so that we can scale. We think that the software-services mix is going to flip. We've just done our 10-year forecast, and we show software actually becoming the predominant component of Hadoop spending. A lot of hardware still. Sure. You always need hardware. But software, because it's all open source, has been a smaller contributor, but there's real value in software that can help scale. That's right. And some of those are tools, right? That software is a broad range of things that fall under Hadoop. Yeah, absolutely. Yeah, and I'm thinking, it is the year of simplifying Hadoop. Not only from a consumer perspective: I'm a consumer, I'm going to consume Hadoop, this has got to be simple.
So even from a deployment perspective, how do we simplify that? And of course we embrace Apache Ambari, and there's a number of other distros that are embracing Apache Ambari. There's a big announcement. All but two. There you go. So it's about simplifying the operations. It's about making it enterprise-ready, and it's about making it easy for the end user. That's what this year is about. That's the name of the game. So that's what we've been trying to do. All right, we've got to go. Joseph, Mark, thanks very much for coming on theCUBE. Great stuff, really appreciate it. Thanks a lot. Thank you. All right, keep it right there, we'll be back. This is theCUBE, we're live from HP Discover. Be right back.