Hadoop Summit 2012. I'm John Furrier with SiliconANGLE.com, here with my co-host Jeff Kelly and Omer Trajman, Vice President of Solutions at Cloudera. Welcome back to theCUBE, you've been on before. Thank you very much. It's great to be here. I think you were on the recruiting video, we did a lot of videos in the office. It's really exciting, you guys have changed the world. I mean, I remember my first interview at your Palo Alto office, when you guys had just moved there, with Amr about Hadoop, and he said, quote, I saw the future at Yahoo and wanted to change the world and wanted to bring it to the whole world. And if you look at how prolific that was at the time, and I remember when Amr, the co-founder of Cloudera, was an EIR at Accel, I was just coming off my last venture and we were running into each other and he was kind of giving me a tease, and he was so good, he wouldn't even tell me. I'm like, come on, just tell me what you're working on, I'm working on the same thing. So when he launched, it was like, wow, that's cool. But really, about a year later, the world woke up to it when you guys put on Hadoop World in New York City in 2010. We brought theCUBE; that was our first Hadoop-oriented CUBE. A lot has changed, so go back and talk about the evolution of Cloudera from 2009 to where it is today. Because Hortonworks is shipping their product for the first time tomorrow. It's going to be available publicly tomorrow. And that's great, they're starting tomorrow. Let's go back to when you guys were starting. Go back four years, yeah. When you guys shipped your first. It's been quite amazing. When you shipped your first. We shipped CDH1, I believe it was early 2009. This is actually before I joined Cloudera, so I joined early 2010, just as we were getting ready to push CDH2 out the door and start our, thank you, our CDH3 process.
18 months ago, we shipped CDH3 and that was a huge breakthrough, especially from an enterprise adoption perspective. It was the first time you had Hadoop security. It was the first time you really had a complete stack that you could drop into an enterprise IT environment and things would actually work. What we noticed, and you saw this around Hadoop World, is that when you move out of the web space, where people are fine adopting new open source technology and integrating it themselves, out in the rest of the world they need things that work with the rest of their data management infrastructure. And so that was a tremendous success. And then just a few weeks ago, we actually upped the ante and shipped CDH4. And this is now the first time that you have high availability, you have federation in the system, you have coprocessors in HBase, and we shipped both the stable, reliable MR1 and we also shipped MR2. So our customers are actually now rolling out in production on CDH4, and they get the benefit of experimenting with the next step forward and trying to figure out how to move different kinds of frameworks and solve different kinds of problems on the stack. And then we get into the whole CDH4 conversation. Geoffrey Moore was on stage, who wrote Crossing the Chasm. And it's funny, he's talking to the industry now because you guys helped create, or created, the industry, an all-new industry, not just the computer industry, the data industry. But he's being kind of generic. He said it's an early market. And the things he talked about, domain expertise, use cases, you guys had that two years ago. So what I'd like you to do is talk about that. And you guys are being humble on your website. Jeff just did a post about the big data landscape, and your reference customers are Groupon and I think a couple of other ones on there.
But I know for a fact, from being in your office, that you've got some other huge customers, like four-letter, three-letter agencies and others. So you know, the federal, financial and big markets and hyperscale. The theme here is enterprise ready. And you're out pounding the pavement. You guys have seen those use cases. So talk about the enterprise-ready experience that you guys have had over the past 24 months. Yeah, Mike Olson, our CEO, has a famous quip he's had since day one, since I joined Cloudera, certainly, which is that the way you ship reliable enterprise software is you ship unreliable enterprise software and then you fix it. And we've kind of had four years of experience doing that. So going back to when we actually released CDH3, that's when enterprises really started adopting this en masse. And we were seeing, as you pointed out, financial services, extensive use within the government, both here and elsewhere, and extensive use across the telco industry. And I think, fascinatingly, in telco you traditionally think of big data as kind of on the back end, trying to figure out and optimize the networks. But telcos today also have extensive service layer infrastructure. You go and you flip through your cable box at home. They're not just delivering bits over the wire, they're delivering higher-value content. And all of that needs to be optimized, needs to be delivered in terms of recommendations and predictive analytics and how people are going to interact with those services. That use of Hadoop extends throughout the telecommunications industry. And the more interesting thing that we're seeing now is breaking out of the traditional, tech-leading, data-centric applications into companies like Explorys, which is trying to improve health care and health quality and reduce costs.
They use HBase and are looking at data across clinical care, as well as billing data and electronic medical records, and pulling that together to help figure out that if you're in an emergency room and you have a certain rate of follow-up care, that's going to improve your overall recovery time and reduce overall costs. So they plumb that back into the system. And those are just fascinating uses of really how to impact the world at large outside of the early adopters. Talk about the dynamic. There are a lot of insiders watching, as well as a lot of people who are new, not new to open source, but new to the whole Cloudera and Hortonworks dynamic, because Hortonworks says, we're 100% open source and we'll make money on services, support and training, and we'll do some biz dev deals. Well, that's what you guys do too, right? So you're 100% open source, but you also have CDH. So, Mike Olson's been on theCUBE, so has Amr, and every time they bang their fist and say, we are 100% open source. And oh, by the way, we have CDH. So explain the dynamic and how that's used, because you guys are very committed to the community. It's pretty obvious. You commit to Apache. Yeah, I mean, straight up, our platform team, who build Apache Hadoop, they build on Apache Hadoop. When they do a check-in, they check in to Apache Hadoop. We don't have a separate repository. We build Hadoop. Now what we package and distribute is pulling everything from Apache, testing it so that it works, making sure it's integrated. So when you ingest information via Flume and then you put it into HBase, you run some processing in Pig, you publish it out via Hive, you Sqoop it via high-performance connectors to Teradata or Netezza, everything works end to end, right? It uses the same file types, it uses the same compression codecs, all the schemas translate and work. That's what it means to package a full distribution.
And we do that using entirely open-source software. That's CDH. That is CDH. So what you're saying, I want to get this packaged up into a nice little box. CDH is all open-source stuff that you guys pulled. Exactly. Not proprietary code. CDH is Apache Hadoop. That you guys just test. Exactly. And how's that different from what Hortonworks is doing? I think we've had four years of experience testing and shipping. And it seems trivial at first blush. I talked to a guy recently who was very interested in getting to the codecs. He said, you know what, I'm going to go and just download the tarball and try and build everything. And it took him a week, right? If you want to use CDH, it takes you five minutes. Why? Because we've actually invested all that time over four years building up the infrastructure to make sure that these dozen different components can build and ship and work well together. I'll say personally, we built an application that's Hadoop, HBase, schema layer, app layer, and UI layer all integrated. All CDH. Without Cloudera, it would have taken us six months. So, you know. And we give it all away for free. Yeah. Now, what do you charge for then? Okay, so. Well, that's important to us. The platform has to be 100% open source. That's a core tenet at Cloudera. Yeah, that is an absolute statement. Cloudera is saying 100% open source. So now we move on to Cloudera Manager. Right. So how does that play? So we are a business, right? That is true. We like to make money. We need to be able to fund the development. You're not a nonprofit. We're not a nonprofit. That's Apache. And so what we heard very early on, before we introduced any additional software, is that in terms of management operations, deployment, configuration, and monitoring the system, there were challenges out there. And there are solutions. People do stitch together open source packages in order to accomplish that.
But if you're managing a mission-critical system, right, so one of our customers, for example, what they're actually doing is stitching together satellite imagery in order to provide commercial applications where you can actually tell during the day how crops are performing or how different stores are performing, just by counting the cars in the parking lot. That kind of application, if their customers are depending on it in order to run their business, then the system is mission critical, right? It actually drives day-to-day revenue. So in order to do that, you actually need very sophisticated management operations. And we have a lot of experience doing that. We've basically built that Hadoop intelligence into software, right? So this is experience that we've built up over the years in actually running enterprise Hadoop systems. And we do that in a piece of software called Cloudera Manager. We package it with global 24x7 dedicated production support. So we have engineers, contributors, and committers who do nothing but provide customer operations backing to our customers around the globe. And that all gets offered as a subscription. And this is a model that Red Hat pioneered and did very well. So, you know, we've heard a lot over the last day and a half about integrating Hadoop into your current enterprise environment and leveraging existing systems. You've already invested a lot of money. Let's take that a little bit deeper and talk about integrating Cloudera Manager and the management and monitoring capabilities into your infrastructure so that, you know, it's not, here we're monitoring our IT infrastructure and then we've got our Cloudera Manager over here. How do you do that? How do you bring that into the enterprise IT monitoring and management paradigm? So one of the things that we did with Cloudera Manager is spend considerable time with our customers' IT departments.
We said, what kind of tools do you have and how do you need to get access to this data? From a data center operations perspective, how do you manage the high SLAs that you now have on your mission-critical Cloudera clusters? And so what we did is build in API-level integration so you can actually run it completely headless. You get a full REST API so that you can control the entire system. You get all of the Hadoop intelligence backing that, and you get unified monitoring and alerting out of that system. And then on top of that, we also built in features like LDAP integration. So if you already have an LDAP server or an Active Directory server, you have authorized administrators for your system. You point Cloudera Manager at that and they're now allowed to manage your Hadoop cluster. You don't have to have, you know, management of separate administrators just for your Hadoop cluster. So it's those kinds of very low-level, tight integrations into the systems our customers use that we've actually built into Cloudera Manager. So you're charging basically for the support on a subscription basis? So our end goal is, how do we help our customers profit? How do they benefit? How do their customers benefit? And we look at the platform itself as the foundational capabilities, and what we charge for is how we actually make them successful. And we make them successful via support. We make them successful via software that we deliver on site. We have a very rich knowledge base that they have access to. We have extensive solution guides that they can actually use to deploy, for example, a runbook to onboard a new administrator. And of course we complement that. Cloudera University is where we've taught over 12,000 people. We're teaching about 1,500 people a month. Sarah's been doing an amazing job of training. Sarah's been doing an incredible job at Cloudera University. And then we also enable our partners.
We have over 250 partners, so when we go and walk in and talk to our customers about what they need, we have a partner across the table who we've already worked with and integrated the software with, so they know what's going to work when they deploy it. Well, you guys clearly are number one. Hortonworks is running as fast as they can to get into number two. We're kind of parsing the strategies a little bit, but ultimately it's the same strategy. Get Apache stable. Their version of support and training is just a little bit different. You have training. You've been doing training till the cows come home since Sarah came on board, right? But the support model is just a little bit different. Cloudera Manager, that's it, right? I think it's the benefit of experience. We've been doing this for a bit longer. Today, about half the Fortune 50 are running Cloudera. That teaches you a lot about how to build and ship reliable mission-critical software. Yeah, I think I said before, there are not too many version four platforms in this big data landscape right now. I believe there's one right now. And if I heard you correctly, it's just a lock on number one today and everyone's competing for number two. Not putting words in your mouth. But it's still a very competitive landscape. Well, I mean, but you know, they could, I mean, you don't know how the market can stand. The market's growing. We absolutely have to focus on where our customers are investing. And we need to make sure that we're meeting what they need. And I'd say keep an eye out for CDHR. I'm impressed with Hortonworks' movements. Really, they dialed down the rhetoric, the anti-Cloudera rhetoric from a year ago. They got their cows in order. They have some good people over there. They're doing some good work. And so, I mean, that's clearly always been my view on their strategy. They've got to get to a close number two, because your valuation's very high right now and you guys are doing extremely well.
So congratulations. My question, Omer, is a little bit more now on the use cases. So you guys have got a lot under your belt. I know, staying close to your career, you're out on the road a lot, talking to customers. You know a lot about the marketplace. We've talked about some of the things that HP's done with acquisitions and big data. The world outside of the Hadoop ecosystem is complex. So there's the unstructured and structured environment. When you go out to your customer base, what are the kinds of deployments you guys are now working on? I mean, yeah, proofs of concept, you guys have moved out of that phase. There's more production. We heard from Hortonworks that Yahoo's doing some HBase stuff in the milliseconds with Messenger. That's a precursor to what we think others will do. What are you seeing out there in terms of meaty deployments, meaty solutions? I mean, we're seeing fascinating stuff across the board. I'll touch a little bit more on the telco services. So Navteq, a group within Nokia, provides very rich map data and point-of-interest (POI) data. They've actually been running that on HBase on Cloudera for a few years now. So that's a production application that touches all of us day to day. If you look at Mozilla, you're using the Firefox browser, and occasionally it crashes, with different plugins, different complexities, and they want to keep the bar up in terms of quality. Every time it crashes, that goes into a Cloudera system so that they can analyze it and improve the product. We actually had at our launch one of our great customers, a company called Opower. And what they're actually focused on is trying to gather all the smart meter data in real time, pull all that data in, load it up into Hadoop, load it up into HBase, and then be able to inform both the utilities, in terms of how they respond to different dynamics and demand, and the consumer, so that they can get smarter about their energy.
So these are some incredible updates. What's the coolest thing you've seen over the past 12 months? I can't tell you that. In fact, I can't tell you that in very many ways. Okay, so it's government related, so we know that. Omer, thanks for coming on theCUBE. We appreciate your commentary. Just one final sound bite for the folks out there who want to know about what's going on at Cloudera, and some of the successes you've had. You want to just take the open mic directly to the audience and tell them what's going on at Cloudera? Oh, absolutely. I mean, we're thrilled about the recent release of CDH4, Cloudera Manager 4, and Cloudera Enterprise's new version. I think it's something you should all take a look at and try. It's got some amazing new features in it. Okay, we'll be right back with our next guest from Cloudera, Eli Collins, who runs the releases over there. We're going to find out how long it's going to be until CDH5, which I hear he's working on. Found that out last night. So we'll be right back with Eli right after the short break.