Live from New York, it's theCUBE, covering Big Data NYC 2015. Brought to you by Hortonworks, IBM, EMC, and Pivotal. Okay, welcome back everyone. We are live in New York City. This is theCUBE, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE. We are here as part of Big Data NYC in conjunction with Strata Hadoop. We are live in New York City, one block from the Javits Center, and we're here with two special guests: Ryan Peterson, Chief Solutions Strategist at EMC, and Tom Reilly, the CEO of Cloudera. Guys, welcome to theCUBE. Thanks, John. Great to see you. Good to be here, John. Good to be here with Ryan. Great to see you. Tom, Cloudera, obviously it's a show that's been starting the whole ecosystem. You know, we were there, present at creation seven years ago, watching Cloudera. Even when they were in stealth a few years earlier, Amr was at Accel Partners, kind of doing a stint, and Mike, Jeff Hammerbacher, all these guys came together at the time. Who the hell is Cloudera? What is Hadoop? You know, InstaGray was an early investor, I was talking with him yesterday, and we were talking about what was on the mind of those investors early on, because right now we're at that same tipping point where Hadoop was not known, became known, went through the hype cycle, and now it's here, and the show is all about value right now. So we're hearing that message. I want to get your take on theCUBE here. What does that theme mean for you guys, and what is the key message that you guys are sharing and sending and hearing at the event? Well, thank you, John, for that history. And when you look at the history, I think the early founders of Cloudera had some great foresight when they set up Hadoop World, not as a single-company event. It's an event we sponsor, but it's a community event. 
And to see the hundreds of companies here, and the most exciting ones are all the new companies that are starting and emerging here, just shows how much innovation is being done by this broad community. And if it weren't for the community, we wouldn't be creating all this amazing value in such a short amount of time. At this event, we're celebrating Hadoop being 10 years old. 10 years old. There are few industries that are having such an impact on businesses, whether it's through operational efficiency or delivering transformative applications on a platform at scale, as we're seeing with Hadoop and the community. And it's because of all these different organizations coming together, contributing to this project, that we're seeing this amazing value be delivered so quickly. It's interesting. You know, we've been here now, it's our sixth year with theCUBE, and every year it's, is Hadoop real? Is Hadoop real? Again, this year, is it real? But not only is it real and relevant, you're seeing an abstraction where value is now being created outside of Hadoop because the ecosystem has changed significantly. So I've got to get your take. We're here with EMC, a big whale; it's whale season here in the big data world, as we were saying earlier on theCUBE. The ecosystem isn't just startups anymore. It has grown in the past 10 years, a historic rise in value creation, but it's just beginning, right? So I want you to put some color around that ecosystem growth and what that means to customers, because customers want stuff faster. They want the technology faster, they want outcomes faster. Yeah, so I think, you know, someone asked me earlier, did Hadoop catch the market by surprise? That's not actually what happened. The demand for a solution such as ours has pulled this market, because what we're seeing is not only the explosion of data that we're all familiar with, but enterprises truly wanting to become information driven and wanting to get value out of all their data. 
And so you take a look at why our partnership with EMC is so important. For years, these large enterprises have been putting all their data into Isilon. And now they want to get value out of it. And that's where Hadoop comes in. With the co-development work that we've done and the tests we've done, we're saying, okay, let's bring computation to that Isilon storage. And what Isilon is, is the world's biggest data lake, right? And now we can get value out of it. Right, it's happening everywhere. Ryan, I want to ask you, we had EMC on earlier on theCUBE, and Bill Schmarzo and Aidan O'Brien were talking about doing big data. And the quote was, big data is a team sport. And, you know, we always talk in sports metaphors, the ESPN of tech, theCUBE. But more importantly, there are the integration aspects of data. What's your take on that? Because that means you guys store a lot on your drives, and everyone knows you guys have got great storage. You have a relationship with Cloudera. Talk about that relationship and the level of the relationship. And obviously it's a community, everything's transparent. So just lay out your relationship with Cloudera and expand on it. Is it kind of a thin relationship, a deep relationship? Is it development, joint sharing, is there IP? Can you just go into detail about the relationship? That is a great question, John. So we started this relationship a couple of years back and it was obviously a slow start, right? Kind of figuring out where we play together and those things. And what was really impressive to me is Cloudera didn't just say, hey, let's work with Isilon, let's figure out how to make it function, do joint testing, and let's stop there, like most of our partnerships, frankly, are. It's just, you know, test and validate, see if it works. They went to the next level and said, you know what, we want to make it easy for our customers. And this was Cloudera, Cloudera's customers, to be able to hit a button. 
That says, hey, we're using Isilon. And so inside of Cloudera Manager, there's a button for Isilon. You hit the button and it just deploys and it's ready to go and you're done. So that simplicity of usage was, I think, really the most powerful thing that Cloudera brought to the table. And so our integration is not just at a superficial level, it's a deep co-development effort, really. And we want to continue to move that relationship forward, though. So, Tom, your take on the team sport, because this talks to customer value and the word outcomes. Insights was the word out there last year, certainly: get insights out of the data. But when you start talking outcomes, that's code for, I have a big, fat check I want to write. I want value. And I don't really care how things are assembled. You guys figure out your relationships. I just want it to work. Are you finding that that's the customer orientation, that it's customer driven that way? Or is the customer just saying, you figure it out, Cloudera, EMC. EMC, you're our vendor for storage. Cloudera, you're the big data guys. Figure it out. What's the customer narrative on that? Did I get that right? And what's the view there? We're in the midst of that trend. If you were at this event three or four years ago, we were all talking about these very technical projects. We were talking about how they stitched together and why they were faster or why they were lower cost. Today, and Mike Olson said it this morning, we're no longer talking about Pig, Sqoop, and Flume. We're talking about business use cases driving real value. We're talking about churn reduction, anti-money laundering, fraud, saving lives by detecting sepsis in hospitals, connected cars. And what Mike said is, Hadoop is fading into the background. And what's coming forward are these transformative applications changing industry, changing lives, saving lives. 
Protecting data, and we could just go on and on with these powerful use cases, and those use cases now are the discussion we're having, not the technology underneath it. All right, let's talk about what the customer's view is. We heard also on theCUBE, Ryan, and I want to get your thoughts on this: creativity, creativity is driving. We saw in the keynotes the creativity of a weekend and some caffeine, and you got Guess My Age, from Azure, Microsoft on stage, you know, all these kinds of things. There are many examples, too numerous to say, but basically creativity is unleashed. New use cases are popping up. The Internet of Things is around the corner. That's certainly being hyped up, but there are new use cases. So I've got to ask you a question. Data is the competitive advantage. So there are two approaches. Do you hoard the data or do you open the data up? And this is a big dilemma, because in the old world data is power, and in the new world, sharing is power. So how do customers resolve this? Or do they care? Is it algorithms that do it? What's the deal? You know, honestly, this is something I'm really passionate about. You struck that euphoric nerve in my head. I'm thinking about how exciting this is. But yeah, I think, you know, big data allowed us to do a couple of things. One is bring all that data to one place. But secondly, with tools like Cloudera or Hadoop, start to score and classify that data to understand what you have, what security and permissions apply, what contains personally identifiable information, PII, what things would break security controls, all these different things. And now you can get to the point where you say, you know what, that data is really valuable. But unfortunately, not to me; it's really valuable to a partner of mine. Now, how do I share that information becomes, I think, the next big question, the next big step that we need to figure out. And there are a lot of things, governance and security and permissions, but. I'd love to share just one great use case. 
I'm very familiar with it. So I love the cybersecurity space because I used to run a cybersecurity company. But today, you can no longer have signatures in advance of attacks. You have to do anomaly detection, behavior analysis. If you see an anomaly, you respond to that. Well now, there's a great partner of ours called ThreatStream where companies, when they detect an anomaly, can share it with peer companies. As in, I detected this and you might be affected by the same attack. And so no longer are the vendors creating these signatures; it's these anomalies, and then they're sharing them. And that's a perfect example. We're seeing this with banks. So we have lots of conversations with banks and we say, hey, are you guys willing to share data? Because it seems like that's probably some of the most secure information we have. And you know what their answer is? Yes, to remove fraud from the environment. We would be happy to share information for that. So I've got to throw a concept out there. I just kind of made this up, but it came from another concept we talked about earlier: will there be an SLA on data? And let me caveat that. Meaning, let's just say you have data, you put it into Hadoop, it's in a data lake and you run a report, it's a system of record. You pull it out, you get stuff a day later, you know, a week later, maybe a couple of hours later. That's a different value, because you store it in low-cost storage, your storage or commodity open source, and you move compute to the data. We've been there, that was four years ago. Now you have engagement data, you have credit card transactional data. Now you have systems of intelligence, which Wikibon has been promoting heavily, where machine learning and new stuff is happening, that are going to change the game on speed-to-notification value. Meaning, if I am a human being, I only have 100 milliseconds in my head to click a button. That's recommendation engines. We've seen that stuff. 
I mean, Jeff Hammerbacher said, the brightest minds are working on ad tech, which is essentially, click on an ad at the right time. But that's a human equation. When you get into the machine equation, you're talking about huge speed advances, beyond the 100 milliseconds that humans can handle. Will those new systems of intelligence change the game on the value of the data? And will that impact the ecosystem and offer opportunities? So I think this goes right back to your creativity concept earlier. We can't imagine how our lives are going to change because data and insight can be delivered at the time of interaction, whether that's our human interaction, down to machines interacting and making decisions or altering things based on some insight. We can't think of them. Today we're doing things where someone swipes a credit card and then they get an SMS text saying, did you really want to make that purchase? It's odd. Before that purchase is completed. That is pretty powerful stuff. And we wouldn't have thought of that use case five years ago because the technology didn't exist. One thing I do know when you look at the roadmap and you look at the pace of innovation: one of the areas that's moving fastest is the speed of computation. We're moving things into memory and Intel's making these specific chips. Streaming, all this stuff. Yeah, so I don't think the SLA is going to be how fast can we produce it. I think the technology is going to outpace our ability to think of those use cases. Are you hearing customers say, will there be an SLA for data, a service level agreement for data? I mean, are they getting advanced enough? Or are they still just kind of doing reps on basic stuff right now? It certainly depends on the customer. We have some customers who say, you know, we need SLAs because we're already there. I just talked to a customer who was talking about how they have an internal marketplace of their data assets. And in that case, SLAs start to come into the conversation. 
Other people are early on in their journey and they're just kind of working out where their data is, how they get it in, and what components are needed. And those customers are less interested in what the SLA is and more in how they get value. And so it is sort of a journey, a process that these guys take to get there. One question I'd like to ask you, Tom, because Cloudera I think is the perfect poster child of a company that lives in this era of living in the future, but operating in the present. I mean, every interaction I've had going back, you know, 10 years ago with the Cloudera folks is, you guys are way ahead, because you're looking at stuff with Cloudera that's really futuristic, but not operationalized. EMC is operationally relevant in every major enterprise. So this balance between living in the future and operating in the present, as CEO, that must be challenging. We want to begin with culturally, but how do you handle that, and how does a customer bring that innovation strategy to their world? Yeah, so I think, you know, this is exactly the role and value proposition we're supposed to play as one of the companies in this movement. It goes back to a lot of this innovation that originated at Google. And building Apache Hadoop, and then building distributions and testing it, and documentation and training, is to help every other enterprise get there, right? And get there in the shortest amount of time. And so there's so much innovation happening here at Hadoop World. You look at all these new companies, and one of our roles and our responsibilities is to help curate, to help test, to help integrate, to make sure it interoperates, that it's well tested, and that there's security that goes across all these different projects. And then bring that to enterprises so they get value out of it. Ryan, I've got to ask you, I heard Mike Olson's keynote, and I always watch Mike's because it's always a tip to the future. 
You know, he kind of telegraphs a little bit of his moves, like a poker tell. And he's really talking about- I tell him not to do that. You know, but we know him too well. He's a great guy because he can't help it, he has to leave a little breadcrumb, an Easter egg out there. Mike, if you're watching, we got your tell sign. But his tell sign clearly is that the value is coming now. This is the time, rubber's hitting the road, meat on the bone. Customers want value now. We're now at the 10 year anniversary. So, hey, congratulations, pat ourselves on the back, but let's get down to business. And the business is, I have to deliver value, because the customers want stuff now. So that brings up the equation of, what is that innovation? How do you talk to customers? Do you just say, hey, we got your back. Don't worry about it. We're tight with Cloudera. What's the development plan with Cloudera? What are you guys doing to have your customers' backs? So first of all, I'll agree with that. I think that there's this concept of, like, vertical business gravity of solutions. And the concept there is that you've got telecommunications organizations that have been using big data for a while, and each one has a different use case. Well, those use cases start to get shared amongst each other and all of a sudden now, hey, there's a big opportunity around solutionizing. Is that a word? I don't know, solutionizing. Opportunities around telco, and then the same in banking, and the same starts to happen in healthcare. And I think that's a big, big movement. What we've been finding is that as customers have started to hit certain limitations of what Hadoop does today, like you talked about with SLAs, they're asking for the next generation of technology. Maybe it's Kudu, maybe it's a new application that's yet to be built, or maybe it's a new technology. 
And so one thing we've done is we've worked with Cloudera now to start building a new technology that EMC has been working on for quite some time. We can't say what it is yet, but look for it, and look for it in a little bit, in a few months. So you're writing code with Cloudera, joint software development. Joint software development. And the intent is to try and make the fastest possible Hadoop cluster on the planet, to really solve some of the latency challenges that we've seen in the past with Hadoop, to go after new workloads, things like, say, Monte Carlo simulation over the top of Hadoop, in a latency and performance application that you wouldn't normally see with a traditional architecture. So we're really trying to work with Cloudera to find that cutting-edge balance. EMC, as you said earlier, we are playing the balance of how do we innovate and at the exact same time maintain that very stable control. And it's because we have this idea of an emerging technologies group, and we kind of push those boundaries. Tom, talk about your growth strategy. Okay, you guys haven't filed an IPO yet. Some are speculating you will. I've said on theCUBE you'll never sell, because I know the founders don't want to sell. They want to build a durable company and be that next day. So that's my prediction. Cloudera will never sell. Someone on Facebook with the number can be high. Thank you for answering your question. I mean, IPO and build the sustainable company, that's the founders' vision. You guys could have sold many times. I know that. But the reality is you still have to have a growth strategy. The valuation is $4 billion. What about your ecosystem and your growth strategy relative to the product and ecosystem? Because with EMC and your other partners, you guys are doing a lot of joint development. You're doing a lot of ecosystem work on the biz dev side. Is that a fundamental part of your growth strategy? What is the Cloudera growth strategy for the company? 
Yeah, so you answered it better than I could in your question. So we are going to be a long-term enduring business. And the market wants us to be a long-term enduring business that works with the ecosystem of partners and allows for best-of-breed decisions at every level of the stack. So what we realize is that we are a platform provider, but our customers want solutions. And so that solution layer comes from our ecosystem of partners. And we work with them to deliver the complete solution. And the customers like to see the vendors come together and say, we've done the integration. That's the co-development work. We're doing co-development work to do the integration. Therefore, a customer gets value in a shorter amount of time. How many co-development deals do you have going on right now with other partners? We have a number of co-development deals. We have a very large organization called Partner Engineering that does certification of our integrations. I think our relationship with EMC is different from our broader set. This is a partnership we're very committed to. And so we want to do some very unique things. Yeah, storage is not dead. Obviously it's been over a decade, two decades; storage was supposed to be killed two years ago. But you know how it is with the storage fabric, you've got to store the data somewhere. We look at it very selfishly, right? We have no value if we don't have data. There's a lot of data in our partner here. And so we bring value to their data, which brings value to our customers. In one of our first CUBE interviews, with Joe Tucci, I said, Joe, storage is sexy, storage is sexy. And Dave was laughing. But the reality was we saw then, storage isn't going away. There'll be different layers of storage and different intelligences. So that's going to be an integral part. Final question, on the disruptive technologies. What do you guys see right now as a disruptor? Obviously Spark is the center of the conversation here at Strata Hadoop and Big Data NYC. 
So, besides Spark, what disruptive technologies do you see really taking us to the next level fast? Because the growth is significant across the board, not only for the companies you've seen, Cloudera, but the entire ecosystem. Customers want to deploy now, in production, all the time. It's a challenging question for me, as I'll say that I look at all of these individual projects that come up within Apache as somewhat disruptive to the previous version, the previous possible solution. But I don't look at them as hugely industry changing. I think it's more Hadoop-product changing. So I really think the disruptive technologies are the things around data sharing and data monetization, the application development frameworks that are popping up now to take the content that we're creating from all this big data, the value, the insights, and turn that into actionable data sets and then repeatable applications. To me, those are the things that are really changing the business. What are the disruptive enabling technologies that you see that you're excited about? Well, so people like to think that Hadoop is all about HDFS and MapReduce. In the past few years, it has evolved so quickly, right? And the most recent example is around Spark, and everyone's so excited about streaming and the machine learning capabilities. And there's going to be more and more innovation. And I think all that innovation makes this ecosystem of projects increasingly more disruptive and more capable. But one of the things that I think the industry has started to understand is that this community of projects has many years of legs to go. And we have a lot to do to make it much more stable, more manageable, and then new workloads keep coming on. And Spark is the most recent example. We are fascinated with the uptake of Spark in our customer base. What's the Cloudera position on Spark? What's your official statement on Spark? I know Mike addressed it in his keynote today. 
Are you going to continue investing in it, funding it, partnering with it? What is Cloudera's role in Spark? Because that's a real hotbed right now. Well, our role is we're very committed to it. We're very committed to the project. We were the first to introduce it into a distribution. We believe it will be the standard in this set of capabilities. We announced our One Platform Initiative, where we've taken all of our enterprise-class things, whether it's systems management, data governance, or security, and things like Kudu, which we introduced today, and RecordService, which we also introduced, and built them to work automatically with Spark. So our One Platform Initiative says that Spark is going to play a front-and-center role. We believe that future jobs won't be written in MapReduce, they'll be written in Spark. A lot of old jobs will still exist in MapReduce, but everything new is going to move to Spark. All right, final question. Put the bumper sticker on Strata Hadoop this year. What does the bumper sticker on the car read this year? You have to summarize the show in one bumper sticker. We're finally talking about business value. Okay, business value. Here with the CEO of Cloudera and Ryan from EMC, Ryan Peterson. We'll be right back with more right after this short break. Live in New York City for Big Data NYC and Strata Hadoop, we'll be right back.