From Las Vegas, it's theCUBE. Covering EMC World 2016, brought to you by EMC. Now your host, Dave Vellante.
Welcome back to Las Vegas, everybody. This is EMC World 2016. This is theCUBE. This is our seventh year at EMC World. It all started in Boston with lobster and chowda and the socks. And now we're out here in beautiful, sunny Las Vegas. I'm here with Scott Andrews, the regional vice president for channels and alliances at Hortonworks, and Keith Manthey, the CTO of analytics at EMC's Isilon division. Gentlemen, welcome to theCUBE. Let's talk big data.
Let's talk. Thank you.
So yeah, we were talking off camera, Scott. I was saying we were just at Hadoop Summit in Dublin. Great show, a lot of buzz coming off that. So Hortonworks, things are smoking.
Yep, things continue along at a really fast pace. The partnership that we have with EMC is really starting to flourish. If you look back when we started the partnership, it's about two years ago now. In March of 2015, we were officially certified as a joint solution of Isilon and Hortonworks Data Platform. Fast forward to September 2015. We're selected as part of EMC Select. So now we really start getting the ground game going from a sales perspective. And we're starting to see a lot of traction across the board from a customer perspective.
So EMC's had a long history now in big data. When the whole Hadoop meme sort of hit, 2009, 2010, it really started hitting the radar. And then you guys made an acquisition of Greenplum. And then of course, Isilon came in after that. And that was right in the wheelhouse: storage infrastructure, big data, boom. And so you've really evolved and become a major player. And the other thing is the conventional wisdom back then was, oh, EMC, NetApp, whomever, pick your storage vendor, big trouble, nobody's going to use that stuff for Hadoop and big data. And then bang, it hits the enterprise. But what about governance? What about security? What about compliance?
So that's been a real tailwind for you guys.
You talk governance, regulatory compliance. I'd also say operations and SLA. People are starting to build massive complexes. They want to make sure that they stay up. They want to make sure that when they put millions of dollars of investment of their business on a Hadoop platform, they're able to run it. They can get the staff to support it. They can scale it. They can handle the availability they need. And to your point, that is really where we're seeing continued partnership with Hortonworks driving the Hortonworks and Isilon value proposition for large organizations, as they continue to grow their Hadoop platform.
So let's talk a little bit about that partnership. I've personally talked to Shaun Connolly many times about the Hortonworks philosophy on partnering. And I think EMC, much, much bigger company, but Shaun has always underscored the depth of the partnerships that you're looking for. It's not just what we sometimes call a Barney deal. I love you. You love me. Let's do a press release and we move on. It's much, much deeper than that. Can you describe a little bit of that depth?
Yeah, absolutely. So when we partner with a strategic partner, we go deep from engineering and product management so that, one, we've got the proper integration day one, but we also have the roadmap in place in terms of where we're going. We were talking about this a little bit earlier. We've seen it where we've sort of gone through the evolution of Hadoop within our customer base. And now we're going back through that evolution with the combination of EMC Isilon and Hortonworks Data Platform, where customers are saying, I did the POC and then I did the renovate in terms of ETL offload, but now I'm really looking to scale out and do the interesting innovation use cases. And I want the security. I want the operations. I want the management. I want the governance.
And together, it's incumbent upon both organizations to drive that into the market. In turn, we need a real deep engineering partnership.
So from EMC's perspective, again, you guys really hopped onto this trend and said, no, we're not going to shy away from big data. We're going to go all in. We're going to develop solutions. How do you guys collaborate from both an engineering and a go-to-market standpoint to make that happen?
A lot of it, you have to watch where the industry's going. What are the hot trends? I mean, we see trends here about IoT. We see trends here about video surveillance. We see trends talking about, you know, data offload. And as we watch where the trends are going, then we look at, all right, what are the workflows that we need? What are the tools that we need to enable those? And, you know, first thing is, you need to have joint engineering. How does the data get into the platform? So, you know, their NiFi investment, which was Onyara and is now labeled Hortonworks DataFlow, makes a great way to get data inside of Isilon. You then also look at, you know, what are the use cases and what are the tools you're going to put on top of the environment? So we spend a lot of time on this. We start with the end user in mind and then work back to what are the tools, what is the joint engineering? How do we build together the partnership so that we know where the client wants to be? Because it doesn't really help if you go to where the people are today. It really matters how you go to where they're going to want to be in a year or two or three, because it can't just be about where the world is today. It's moving far too fast to be able to do that.
So Scott, talk about that NiFi investment that you guys made. You're a public company now, so you've been more acquisitive. Talk about the importance of NiFi and what the angle is with EMC.
Yeah, so it's incredibly important for us as an organization, right?
Because what it does is it brings us the notion of a connected data platform. So the combination of Hortonworks Data Platform, data at rest, and Hortonworks DataFlow, which is essentially NiFi technology for data in motion, right? And what it enables us to do is to be able to deliver data both securely but also with lineage from a governance perspective. So we can assure you that the data that you selected to bring in is actually the data you wanted. And we're seeing this play out with EMC right now. I think Mike Bishop from Prescient was on earlier, and he was talking about leveraging Hortonworks DataFlow with Hortonworks Data Platform on top of Isilon. So we're seeing it out in the market right now.
That was an interesting use case.
Absolutely, risk mitigation.
So, okay, so Keith, in the early days, again, Hadoop, batch platform. YARN comes along, allows us to negotiate more resources, do more things at once. Now you've got Spark coming in, other in-memory technologies, real-time streaming. How is big data evolving, and what kind of efforts are you guys doing together to accelerate that?
As you said, that's the classic use case. It's been around since 2009. Last year, what we've seen is just a complete uptake of streaming and what I'll call the hybrid architecture, which is using a lot of your batch stores to calculate an amount of data that you can then use, at real time, to decide whether they have a problem or not. We're starting to see that in various things like cybersecurity or the Prescient scenario, where they're starting to figure out, are we looking at someone's life in danger because there's a volatile situation where a traveler is? And so, where we're seeing the investment is going closer to real-time. You still need all the data that you have. The data may be cold, that may be a month's worth of data, a year's worth of data, or it may be 15 minutes, but you need to aggregate that up and then you need just a minuscule amount for that real-time.
But what we're starting to see is, how do we take all that together to make meaningful real-time, at that exact moment, decisions? And that's really where we're seeing the world move into now. It's off of batch to instantaneous. But instantaneous means lots of compute in order to get there.
So let's map that into some of the things that customers have been doing over the last several years. So you guys like to talk about data lakes. My business partner, John Furrier, hates data lakes, likes data oceans, whatever, we have to have that argument. So the concept of the data lake largely was about extending, maybe even reducing the cost of, the traditional data warehouse. Not maybe, clearly. We often joke, though, ROI was reduction on investment. You know, it did cut costs and it allowed you to store a lot more data. Okay, well that was good, and that's still growing. But you can see that as not being the most exciting part of the future. We've talked about IoT, data in motion, more real-time. So talk about how some of the applications are evolving.
Yeah, I think what we originally saw was people building sort of single-purpose-built applications on top of the cluster. And then it moved more towards this notion of a data lake where, with YARN, you're now able to run best of breed, whether it's from the Hadoop ecosystem or commercial software, on top of that data lake. I think that trend continues. But I think we also see purpose-built clusters being maintained for very specific workloads, right? Where it's, how do I run best-of-breed software on top of this specific cluster to deliver the performance or outcomes that I need?
Okay, and how does EMC approach this? So for example, I mean, you've got Pivotal, right? You guys are now sort of interesting bedfellows, Pivotal and Hortonworks, you know, ODPi, and so that's all cleared up. How does your relationship with Hortonworks, the Pivotal folks, Hortonworks, how are you guys all working together now?
Can you sort of help us squint through that?
Go ahead.
I think ODPi, as you mentioned, brings a whole lot of value. And so one of the things: I used to get a lot of questions and replies about how the pace of change within Hadoop was faster than what most companies could handle. Now with the ODPi release, Hortonworks 2.4, you're going to see more of an annual release of Hadoop. And then the supplemental or the frothy components, like your Spark, your Kafka, will change more often. And that'll cause clients to stabilize a little more. But it also allows us to build a better runway so that we know where the release is going to be. We will be able to go to market sooner with Hortonworks, to have clarity, to have certification right after it goes to market. We will be able to put all of the Pivotal components on top of it day one. And so the goal there is, with a lot of the ecosystem changing, it was sort of a catch-up game. And so now as it becomes a little more synchronized, there's less of a catch-up game. And who wins is the client, when day one, they can install everything and it works. Versus having to wait till all the components got there. And that was sort of the early days of Hadoop, where you sort of had to hope that you had the right configuration to make sure it worked. Now it's very precise. You can install, you know, work with everything you have. And that goes to the partnership that we have and sort of, you know, how do we make it easier for the client?
How about the ecosystem, Scott? Let's talk about that a little bit. Doug Cutting made the statement, I think it was on theCUBE or somewhere, that basically every time a new project comes out, you know, they got to go support it, you know, they have good engineering resources on it. Same's true for Hortonworks. Your business model is different because you're essentially selling a subscription model, right? But the ecosystem is very complex.
It's a complicated situation for a lot of customers. What do you make of that? It's not getting simpler. It just seems to be growing and growing. Can we expect that to continue, and how do you deal with all that complexity?
Yeah, I mean, I think you see it continue. Certainly when we put out our first release of the Hortonworks Data Platform, it was much smaller than what it is today, right? And I think you'll continue to see that evolve. And the other piece of it is supporting the ecosystem, right? Hadoop is clearly a platform and it's incumbent upon us to make sure that all those solutions work optimally on top of the platform, right? Because that's what customers are driving for. They want to be able to run the right solution for the right workload.
Right, so you guys, a lot of committers to the open source world. EMC's relatively new to open source. I remember when John Roese came on as the overall CTO at EMC, we went in and did a little analyst talk and he asked me, what do you think? And what advice do you give us? And I said, open source, you guys are way behind in open source. He said, you know, you're right, but that's going to change and already is changing. You just don't see it yet. Now, the fact was that at the time EMC was a consumer of open source, and then of course with Pivotal and others. But give us the update on sort of your role in the ecosystem as opposed to just a purveyor of storage solutions.
And I think it's very telling. We have an entire booth dedicated to what's called EMC code, which is EMC's open source portfolio. Be it our, you know, scale-out block, which is open source ScaleIO. We now have a free and frictionless software-defined Isilon, which is scale-out file. We also have ECS, which is scale-out object. So we continue to contribute. If you look at those, ECS and Isilon are within the last year. So, you know, the pace continues. We continue to push more software-defined.
Doesn't mean we're not going to continue to sell the hardware-defined, but what we're seeing is the world's changing. Open source is there. It's here, you know, and EMC is embracing it.
Can you guys give us some visibility, you know, show a little leg on what you're working on now together? What's the roadmap look like? What kind of guidance can you give us?
I would say one of the things I know we're seeing is, watch where NiFi, Atlas, HDP and Isilon go with some of the use cases. And when I say that, think scale. So today, if you think Hadoop, you know, if you hear a petabyte or two, you think that's huge. I think that is about to change at a radical pace.
I would agree. I think the focus is going to be around IoT and how do we integrate Hortonworks DataFlow and what does that look like moving forward? There's been a lot of interesting discussion going on.
It's early days, obviously. You know, everybody wants to instrument, you know, the windmill, so to speak, but there's connectivity issues that people are resolving, and then there's obviously, you know, data movement. That's going to be a really interesting explosion of innovation. You guys are at the heart of it. Well, listen, thanks very much for coming to theCUBE. Congratulations on the medium term. In this world of dog years, you guys have been partners for a long, long time. So wishing you the best for many dog years forward. So thanks again.
Great, thanks for having us.
All right, keep it right there, everybody. This is theCUBE, we're live from EMC World 2016 in Vegas. We'll be right back.