Live from Las Vegas, Nevada, it's The Cube, covering EMC World 2015, brought to you by EMC, Brocade and VCE. Okay, welcome back everyone. We are live in Las Vegas with The Cube, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. Live in Las Vegas at EMC World 2015, I'm John Furrier, with my co-host Dave Vellante. Our next guests are two special guests. We've got CJ Desai, the president of EMC's Emerging Technologies Division, and Rob Bearden, CEO of Hortonworks. Welcome to The Cube, guys. Thank you, welcome. Emerging Technologies is about big data. A lot of big data everywhere, a lot of hybrid cloud, Hortonworks pioneering with Hadoop. What's the story? What's going on with you guys? What are you guys doing at the show? What are you guys talking about? So Rob, welcome as well. Thank you very much. We have had a good partnership with Rob and the team on the engineering side for the past two years. Our Isilon platform, which is a scale-out NAS, is fully certified on HDP, and most importantly, the teams work very well together, whether it's on the engineering side or on the go-to-market side. It's been a terrific partnership. And as the enterprise is really adopting Hadoop, it's important that we bring the emerging technology of Hadoop into the enterprise investment that's been made with Isilon and ensure that it's rock solid, that all the engineering work's done, and that it can be managed on an enterprise basis. Rob, I've got to ask you. Joe Tucci gave the keynote this morning. Love seeing Tucci up there, because he always talks about the waves. And that's all great stuff. We all know there's massive change going on. But one thing he really highlighted was no lock-in, customer choice. That's your world. What's your perspective on this? Because there are a lot of different use cases with Hadoop and big data. And EMC has a huge customer base and you guys are emerging. Choice is a big thing in open source.
How does that render itself into customer outcomes? And how are you guys dealing with this choice-meets-hardened-platform question? Well, you've got to give them both. You've got to give them open enterprise Hadoop. And that's what our model's built around. That's one of the core platforms of our strategy, and our goal has always been to do 100% of everything that we do first and foremost in and through the Apache Software Foundation and be 100% open source, but to bring enterprise rigor to the core platform. Not only in the innovation that we bring, but in how security is dealt with, how the operational functions are dealt with, how the lifecycle and data governance requirements are dealt with. But do it in a highly productized, packaged release and QA process that really ensures that that platform's rock solid, incredibly predictable, and fits and meets the same enterprise rigor that they're already getting today in their storage solutions from their strategic partner. So Hadoop is mashing together with the enterprise. We've seen that in the early days it was sort of Silicon Valley and New York doing a lot of the Hadoop stuff. Now it's really gone mainstream. When you think about the underlying infrastructure for Hadoop, a lot of people early on didn't even want to go after it because they didn't think there was any money there. Now, I know the relationship predated your tenure, CJ, but what are you seeing in terms of adoption in the enterprise, and why Isilon, why is it working? So I would say, first of all, EMC has a large install base, and when we go and ask them questions around, hey, what's going on in your analytics space? That's how the conversation usually starts. What are you doing in analytics? Do you have any big data specific projects? And then specifically getting into Apache Hadoop and other types of solutions that are out there.
So one thing that we consistently hear is that this is one area where IT feels the demand is coming a lot more from the business unit, and the business unit will say, whether it's marketing or sales intelligence or any other type of functional area, that they have made a decision to standardize and go with, say for example, Hortonworks. And then they'll say, CJ, we have decided to go with Hortonworks. We are a long-time EMC customer. What products do you have to make sure that our Hortonworks deployment is successful? And the answer is, because of our relationship with Rob and the team for the past two years, we have done a lot of engineering work on Isilon. It took us a long time, just to be fair. Many engineers on both sides burning a lot of midnight oil. And Isilon, as you are aware, is scale-out. It has security and resiliency features, and the most important thing that customers like is that you don't have to create another standalone infrastructure. I have all this data sitting in Isilon, I know it scales infinitely, meaning very large, and it has all the security features, replication, enterprise-class features. So let's make the Hortonworks project successful, and most importantly, the questions they want to ask of the Hadoop data, they can get answered, because they don't have to worry about infrastructure. So Rob, you guys have always made a big deal out of the depth of partnerships. You talk about that a lot with many of your partners. So you're talking engineering resources here. Somebody once said to me, John, you'll love this, right? The only two things that matter in companies are sales and engineering; the rest of it is all BS. So, kind of a side joke. But I wanted to ask you, you've got the engineering partnership that's early on and done. What about the go-to-market? How did ODP change, or did it change, the relationship there?
The goal behind ODP is to be a consolidating factor and force, so that the ecosystem in general can certify once and be assured of compatibility across the rest of the distributions, and we've already seen great evidence and support of that. That's the whole goal behind ODP. They're also assured that it's in 100% compliance with the Apache Software Foundation, and it goes back to choice, having an open platform and not having the lock-in. By standardizing on ODP, and on any of the derivative distributions that have standardized in compliance with ODP, they're assured of Apache Software Foundation and pure open compliance, as well as the ability to get to the rest of the ecosystem just by certifying once, bringing many together. So from a go-to-market standpoint, you mentioned, CJ, that it's the business lines driving this. EMC's not, or Pivotal's not, selling its own distro anymore because it's standardizing on the Hortonworks distro. So from a go-to-market standpoint, where does that come from? Who's talking to the business guys? You've got a sales force, Rob. CJ, obviously EMC has a huge sales force. Talk about how that all works. You know, the great thing about the Isilon team, as you are fully aware, is that Isilon was extremely successful in many verticals by going to the business lines. Look at the verticals, whether it's media and entertainment, whether it's manufacturing, oil and gas and others. So for Isilon, we have a specialized sales force, many people distributed around the world, and they have been talking to the customers, and the advantage here is that the data is already on Isilon and now they want to run a Hadoop project on top. It makes it just that much easier. So on my side, the specialty sellers and the SE force, and these are many, many people around the world, as you know, understand the messaging. And once we said that HDP was certified on Isilon, we just had to unleash it to our sales force and they take care of the rest.
So Isilon has always done that. And you know, we have one more product, as you are aware, which is Elastic Cloud Storage, the object platform, completely homegrown within EMC in our division. And this can scale up to an exabyte, right? And you know, with object, many, many billions of files and all of that. And it has a global namespace and so on. So that would be the next project where we will work with Rob's team to ensure that, whether the customer decides on file with Isilon or the customer wants object, we are the best-in-class platform for them. And I think we are the only vendor who can actually do that for Hortonworks and Apache Hadoop. Very much agree. I know you just recently got here, but EMC's been talking about this big survey they did around the information generation. And one of the things that popped up is that one of the top four things the information generation wants is real time. Spark, you know, Hadoop 2.0, has really changed the nature of Hadoop from batch to real time. Can you talk about that a little bit? And then CJ, I wonder if you'd comment and talk about the impact in your customer base. We're a big fan and supporter and have deeply embraced Spark. I'll start there. The whole goal, when we started the company, was to really innovate the core architecture of Hadoop and transform it from being just a batch, single-data-set, single-application platform to truly being able to do batch, interactive and real time simultaneously on a central data architecture. And YARN was the enabler of that, right? That was really the first principle behind the company strategy when we started the company: YARN was the enabler of that. And as tremendous new engines like Spark have emerged, we want to embrace those, and we want to be able to bring and enable those into YARN, certify them, and ensure that they're truly enterprise ready when we release them as part of a GA release. And we've done just that.
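Rob's point that YARN lets batch, interactive and real-time engines share one central data architecture comes down, in practice, to the cluster scheduler dividing capacity into queues. A minimal, hypothetical sketch of what that can look like in Hadoop's capacity-scheduler.xml (the queue names and percentages here are invented for illustration, not taken from any HDP default):

```xml
<!-- Hypothetical capacity-scheduler.xml fragment: three queues share one cluster. -->
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>batch,interactive,realtime</value>
  </property>
  <property>
    <!-- Bulk workloads, e.g. overnight customer-mining jobs. -->
    <name>yarn.scheduler.capacity.root.batch.capacity</name>
    <value>50</value>
  </property>
  <property>
    <!-- Ad hoc interactive queries. -->
    <name>yarn.scheduler.capacity.root.interactive.capacity</name>
    <value>30</value>
  </property>
  <property>
    <!-- Latency-sensitive streaming, e.g. a Spark fraud-detection job. -->
    <name>yarn.scheduler.capacity.root.realtime.capacity</name>
    <value>20</value>
  </property>
</configuration>
```

Applications are then submitted to a queue, and the scheduler guarantees each class of workload its share of the cluster, which is how one data lake can serve batch and real-time use cases simultaneously.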
And we've got multiple customers who've become very successful with those use cases, driving those workloads through HDP and Spark. And we're going to continue to invest heavily and deeply in ensuring that reference architecture continues to move forward. So YARN, essentially Yet Another Resource Negotiator, is the scheduler that lets you drive a lot of that stuff in more real time. And you're seeing that in your customer base as well? Yeah, so I would tell you that one of the things we are seeing a lot of is real-time analytics. Yes, I have this huge data lake, many puddles, you can name it whatever you want. Data ocean. Yes, but I want to get as much data as possible closer to my CPU complex so I can do real time. So let's take the example of fraud detection. I want fraud detection to be as real time as possible, whereas my customer mining data could take a few hours and I'm okay with that. So that's why EMC's direction was investment in products like DSSD, so you have a large working set at very high performance with microsecond latency to enable that kind of intelligence on top of what we are doing with Hadoop. CJ, talk about the Hortonworks relationship specifically around the data lake. Because, you know, that's a great marketing term. And to me it's slow, it's big. It kind of reminds me of the data warehouse market, which is being disrupted by Hadoop and new technologies. For these new use cases, near real time, and certainly real time, has to be there for self-driving cars, things of that nature where you can't miss a beat. And there's a variety of different use cases. Can you guys share how you talk to customers? Because there's no one data lake. Certainly there's the old-school business intelligence data warehouse, which is being disrupted, but there are new use cases with mobile and the cloud. What is the story? Is it flexibility? I mean, is the data lake just storage? How does the software fit in? How do you guys tell that story? It's not one use case, it's a lot of different ones.
So I'm fascinated to hear the expert's answer, but I'm going to take a shot at this. I would say yes, data lake is a term that comes from the creative teams across the world, of all companies combined. Customers don't talk like that, right? They would come and tell us, we have all this data, do you support these protocols, and can you make sure we can leverage our investment in this infrastructure? And at the end, there is some management component, which these guys have done a phenomenal job on in managing that data, and then how you can provide security and resiliency, and then, when we want to crunch numbers, can we do that fast enough? I was with a large insurance company that is evaluating Hortonworks and others. They told me, CJ, we have lines of business that come and say, we can't wait a few hours to get it. We want real time, based on mobile responses and all that. And that's how they start the conversation. It's always, what's the end game? And then it goes into, okay, I have data dispersed here, these are the protocols, and then it moves up the stack. Yeah, to your point, data lake's a marketing term. What it really is, though, is an outline of an architecture which says we can bring all our data together in a central place, but then we have to be able to combine multiple different types of use cases and specific opportunities to create value from that central data architecture. Now, where our partnership comes in is that we can leverage the power of Isilon, right? To manage that with the power of Hadoop, enterprise Hadoop, to create that reference architecture. And then we, with the power of YARN, can mix batch, interactive and real-time use cases simultaneously against that central data lake, right? And so that's where we then bring Spark real-time opportunities, or stream Internet of Things-type data sets into it, and provide fraud detection in real time.
We can mix all of those with the power of YARN on a Hadoop, an HDP, data lake, leveraging Isilon as the platform. So if you believe those use cases, and I think everyone's in agreement that this is where it's going, now you add cloud, which you have, which you own. If the customer has unlimited compute, and XtremIO continues on its torrid pace, and you mentioned Spark, you're talking about low-latency data transit, if you will, for lack of a better word, right? So that's not the data ocean or whatever you want to call it: stream, river, whatever. So software's critical. So Rob, what is the vision that you guys are working on to make the software really badass? To, one, be compatible with XtremIO high performance, be cloud ready with unlimited compute potential, and have that fit into EMC. Well, the great news is we've really delivered on a majority of that already together, and that's where our two years of engineering efforts are really paying off. We want to give the customer and the enterprise the ability, for whatever use case they have, to bring it under management across our reference architecture and then be able to deploy it across any deployment strategy, whether it be on-prem, virtualized, or in a cloud. And we don't care. We've already done the engineering work so that the same exact bits of HDP can be simultaneously deployed across any of those architectures. Oh, and by the way, on a Linux or Windows environment. And all of our engineering work and certification and optimization is already done and delivered today. Great relationship. What's next, real quick? What's next in the relationship with EMC? You're seeing a lot of emergence of new data sets, fast-moving data streams, the world of streaming, the Internet of Things, and all of the use cases and value creation of those applications and data sets that come onto the reference architecture, and we're enabling that to proceed quickly. Paul Maritz says the mainframe's in the cloud, so the Federation's behind you.
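The real-time-versus-batch split that CJ and Rob keep returning to (fraud detection scored per event, customer mining run over hours) can be illustrated with a toy sketch in plain Python. This is purely illustrative: the field names, sliding window, and spike threshold are all invented, not Hortonworks or EMC code, and a production version would run as a streaming job on YARN against the shared data lake rather than as an in-process dictionary:

```python
from collections import deque

# Toy real-time fraud check: score each transaction as it arrives,
# keeping a small sliding window of recent amounts per card.
# Thresholds and field names are illustrative only.

WINDOW = 5          # recent transactions to remember per card
SPIKE_FACTOR = 10   # flag amounts 10x the recent average

history = {}

def score_event(event):
    """Return True if the transaction looks fraudulent."""
    card, amount = event["card"], event["amount"]
    recent = history.setdefault(card, deque(maxlen=WINDOW))
    suspicious = bool(recent) and amount > SPIKE_FACTOR * (sum(recent) / len(recent))
    recent.append(amount)
    return suspicious

# Simulated stream: the last event is an obvious spike.
stream = [
    {"card": "c1", "amount": 20.0},
    {"card": "c1", "amount": 25.0},
    {"card": "c1", "amount": 22.0},
    {"card": "c1", "amount": 900.0},   # ~40x the recent average
]
flags = [score_event(e) for e in stream]
print(flags)  # [False, False, False, True]
```

The point of the sketch is the latency contract: each event is scored as it arrives, in contrast to the customer-mining style of analysis, which would sweep the whole data set in one batch pass hours later.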
Congratulations. Great to have two great senior executives in the industry here in theCUBE. Appreciate it. CJ Desai, president of EMC's Emerging Technologies Division, and Rob Bearden, CEO of Hortonworks. Thanks for coming on theCUBE. We appreciate it. We'll be right back after this short break. Live in Vegas for EMC World 2015. I'm John Furrier with Dave Vellante. We'll be right back.