Okay, we're back live in New York City for Hadoop World 2011. I'm John Furrier, founder of SiliconANGLE.com, and we have a special walk-in guest, Amr Awadallah, the VP of Engineering and co-founder of Cloudera, who's going to be on at 2:30 Eastern time on theCUBE to go more in depth. But since we saw him in the hallway and we had a quick spot, we wanted to grab him in here. This is theCUBE, our flagship telecast where we go out to the event and talk to the smartest people, and I'm here with my co-host.
I'm Dave Vellante, Wikibon.org. Amr, welcome back, you're a longtime CUBE alum. So, appreciate you coming back on and doing a quick drive-by here.
Thanks for the nice welcome.
So, you know, we go talk to the smartest people in the room. You're one of the smartest guys that I know, and we've been friends for years, and it was my tweet, heard around the world by you, to find space, and we've been sharing the office space at Cloudera for almost a year now, and we're going to be trying to find space because you're expanding so fast, we have to get a new home.
Sorry about that.
I wanted to really thank you personally here, live, because you've enabled SiliconANGLE and Wikibon; we figured it out early because of you. I mean, we had our nose sniffing around the big data area before it was called Big Data, but when we met and talked, we'd been tracking the social web, and really it's exploded in an amazing way, and I'm just really thankful because I've had a front row seat in the trenches with you guys, and it's been amazing, so I want to thank you.
You're welcome. It was great to have you on board.
And so, you've been evangelizing in the trenches with Yahoo, you were an EIR at Accel Partners, which is announcing the $100 million fund, which is all great news today, but you've been the real spark at Cloudera as one of the co-founders. One of them, but one of the main sparks as a co-founder. A lot's changed.
Jeff Hammerbacher, my co-founder, from Facebook.
I mean, we both, we said this before, like we saw the future. Like in our companies, we saw the future of where everybody's going to go next.
And now Jeff's going to be on as well. He's now taking this whole data science thing to heart, building out a team, and we're going to drill down on that with him. What do you think about all this? I mean, like right now, how do you feel personally, emotionally, and looking at the marketplace? Share with us where you are.
Yeah, I'm very emotional today, actually. Lots of good news, you heard about the funding news.
Yes, it's $100 million for startups.
But no, the $40 million for Cloudera.
Oh yeah, that was yesterday's moment. Yeah, yeah, yeah. Today's more money.
Actually, the news was supposed to come out today. Came out a bit earlier, certainly. But yeah, I'm very, very emotional because of that. It's a testament from very big-name investors of how well we're doing and a recognition of how big this wave really is. Also, the $100 million fund from Accel, that's also a huge testament. And hopefully lots of new innovations or startups will come out of that. So I'm very emotional about that, but also overwhelmed by the size of this event and how many people are really gravitating towards this technology, which just shows how much work we still have to do going forward. So it's very, very overwhelming.
You guys have a great team. I've been scared, I've been scared. Michael is a great CEO, I'm on stage there. Great guy, we love Mike. Just really, he's geeky and he's pragmatic. He's a great strategist, and you've got Kirk, who's the operator. But he showed a slide at his keynote that showed the evolution of Hadoop.
Yes.
The core Hadoop, and then he showed it year by year, and now we've got those columns extending and you've got new components coming out. Take us through that progression. Just go back a few years and walk us through why this is going so fast and what the community's doing. And what happened in 2008?
Talk about that a little bit.
2008 was when we started. So I mean, first, in 2008 when we started, nobody was believing us back then that, hey, this thing is going to be big. We had the belief because we saw it happen firsthand, but many folks were dismissive: no, no, no, this big data thing is a fad and nobody will care about it. And lo and behold, today that's obviously proving not to be the case. In terms of the maturity of the platform, you're absolutely right. I mean, the slide that Mike showed says that only 30% of the contributions happening today are in the Hadoop core layer, and the overall kind of vision there is very similar to the operating system, right? Except what this really is, it's a data operating system, right? It's how to operate on larger amounts of data in a big data center. So it's like an operating system for many machines, as opposed to Linux, which is an operating system for a single machine, right? So Hadoop, when it came out, Hadoop is only the kernel. It's only the inner layers. If you look at any operating system like Windows or Linux and so on, the core functionality is two things: storing files and running applications on top of these files. That's what Windows does, that's what Linux does, that's what Hadoop does at the heart. But then to really get an operating system to work you need many ancillary components around it that really make it functional. You need libraries, you need applications, you need integration, IO devices, et cetera, et cetera. And that's really what's happening in the Hadoop world. So it started with the core OS layer, which is Hadoop HDFS for storage and MapReduce for computation, but now all of these other things are showing up around that core kernel to really make it a fully functional, extensible data operating system.
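[Editor's note: the "MapReduce for computation" half of that kernel can be pictured with a small sketch. This is an illustration added to the transcript, not something shown in the interview. It is plain Python running in one process, it uses no Hadoop APIs, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical. On a real cluster the input would live in HDFS and the map and reduce steps would run in parallel across many machines.]

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key, the job a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce step: collapse the list of counts for one word into a total."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

[The point of the split is that each map call sees only one line and each reduce call sees only one key, which is what lets a framework spread the work over many nodes.]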
I wish we had a little replay button, but let's just put the pause on that because this is kind of an important point, and for the folks out there, there are a lot of different analogs, metaphors people use in this business. So it's the Linux, I want to be, it's just like Red Hat, right? We kind of use that term. And the business model, talk a little bit about that. You just mentioned, you know, not like Linux. Just unpack that a little bit deeper for us. What's the difference? You mentioned Linux. Can you replay what you just said? That was really compelling.
So I was actually talking about the similarity, and then I can talk about the difference. The similarity is that the heart of Hadoop is a system for storing files, which is HDFS, and a system for running applications on top of these files, which is MapReduce. The heart of Linux is the same thing: a system for storing files, which is ext4, and a system for scheduling applications on top of these files. That's the same heart as Windows and so on. So that's the similarity.
Yes, I got it.
The difference is, Linux is made to run on a single node, right? And Windows is made to run on a single node. Hadoop is really made to run on many, many nodes. So Hadoop is about taking a rack of servers, or a data center of servers, and having them look like one big massive mainframe built out of commodity hardware that can store arbitrary amounts of data and run any type of component.
Hence the new components, like the Hives of the world.
So now these new components are coming up, like Hive, for example. Hive makes it easier to write queries for Hadoop. It's a SQL language for writing queries on top of Hadoop, so you don't have to go and write them in MapReduce, which we call the assembly language of Hadoop. So if you write it in MapReduce, you will get the most flexibility, you will get the most performance, but only if you know what you're doing.
It's very similar to when you do machine code. If you write machine code or assembly, you will be able to do anything, but you can also shoot yourself in the foot and crash the whole system, right? So same thing with MapReduce, right? When you use Hive, Hive abstracts it out for you, so you write SQL, and then Hive takes care of doing all of the plumbing work to get that compiled into MapReduce for you. So that's Hive. HBase, for example, is a very nice system that augments Hadoop, makes it low latency, and makes it support update, insert, and delete transactions, which HDFS does not support out of the box. So it's more like a database, it's more like MySQL, and the analogy of MySQL to Linux is very similar to HBase to HDFS.
And what's your take, with your founder's hat on now, on the business model similarities and differences with Red Hat?
Yeah, so actually, they are different. I mean, the similarity stops at open source. We are both open source, right? In the sense that the core system is open source, it's available out there, you can look at the source code, you can validate it and so on. The difference is with Red Hat, Red Hat actually has a license on their bits. So there's the source code and then there's the bits. So when Red Hat compiles the source code into bits, these bits, you cannot deploy them without having a Red Hat license. With us, it's very different. We have the source code, which is all in Apache. We compile the source code into a bunch of bits, which is our distribution called CDH. These bits are 100% open source, 100% free. You can deploy them, use them, you don't have to pay us anything. The only reason why you would come back and pay us is for Cloudera Enterprise, which is really for when you become operational and mission critical. Cloudera Enterprise gives you two things. First, it gives you a proprietary management suite that we built and it's very unique to us.
Nobody in the market has anything close to what we have right now. That makes it easier for you to deploy, configure, monitor, provision, do capacity planning, security management, et cetera, for Hadoop. Nobody else has anything close to what we have right now for that management suite.
That is unique to Cloudera and not part of Apache open source.
Yes, it's not part of Apache open source. You only get that as a subscriber to Cloudera. We do have a free version of that that's available for download, and it can run up to 50 nodes, just for you to get up and running quickly. And it's really very simple. It has a very simple installer. You should be able to go fire off that software and say, install Hadoop, this is where my servers are, and we'll take care of everything else for you. It's like having these installers, you know, when Windows came out in the beginning and you had this nice progress bar and you could install applications very easily. Imagine that now for a cluster of servers, right? That's really what this is. The other reason why people subscribe to Cloudera Enterprise, in addition to getting this management suite, is getting our support services, right? And support is necessary for any software, even if it's free, even for hardware. If I give you a free airplane right now, just come and give it to you, here you go, here is an airplane, right? You can run this airplane, make money from passengers. You still need somebody to maintain the airplane for you, right? You can still go and hire your own mechanics, maybe.
We'd have a tweetup, Amr. Ha, ha, ha, ha, ha, ha.
You can hire your own mechanics to maintain that airplane, but we tell you, if you subscribe with us as the mechanics for your airplane, the support you will get with us will be way better than anything else, and the economics of it also will be way better than having your own staff doing the maintenance for that airplane.
Okay, final question now, we've got one minute because we slid you in real quick. For folks watching, Amr's going to come back at 2:30, that's Eastern time, and we'll have a more in-depth conversation. But just share with the folks watching your view of what's going on in Apache. You know, there's all this kind of weird FUD being thrown around, oh, Cloudera is not this and that, and you guys are clearly the leader. We talked with Kirk about that, and we don't need to go into that, but just share with us what's going on. I mean, what's the real deal happening with Apache, the code, and you have a unique offering?
I mean, the real deal, and I advise people to go look at this blog post that our CEO, Mike Olson, wrote called The Community Effect, and the real deal is there's a very big, healthy community developing the source code for Hadoop, the core system, which is HDFS and MapReduce, and all the components around that core system. We at Cloudera employ a very large engineering organization, and in fact our engineering organization is bigger than many of these other companies in this space, and that's just our engineering. If you look at the whole company itself, it's much, much bigger than any of these other players. So we do a lot of contributions to the core system and to the projects around it. However, we are part of the community, and we're definitely doing this with the community. It's not just a Cloudera thing, for the core platform. So that's the real deal.
All right, yeah, so here we are. Amr, co-founder, congratulations, great funding, $100 million from Accel Partners, who invested in you guys, congratulations. You're part of the community, we all know that, just kind of clarifying that on the record, and you have a unique differentiator, the management suite and the enterprise stuff, good stuff.
And the experience.
Experience, yeah.
I think a huge differentiation we have is that we have been doing this for three years, ahead of everybody else. We have the experience across all the industries that matter. So when you come to us, we know how to do this in the finance industry, in the retail industry, in the health industry, in the government, so that's something also that's important.
So just for the audience out there, Amr's coming back at 2:30, and we're going to go deeper into these. The highly decorated Amr, I would check this out. Is the general at Cloudera. Is there any color that it doesn't have?
Thanks very much. There's more, actually.
He's in the uniform too, the Cloudera logo on his shirt. There you go. Expecting some of those for us too, someday. Great, great to see you again. Love Amr, great friend.