Back to theCUBE. Thank you, John. Good to be here. So, HBase is taking the world by storm. I had a chance to talk with Hasso Plattner, the co-founder of SAP, last week at Sapphire, where they introduced HANA. I asked him, what are you guys doing with HBase? He kind of didn't really have a clue what was going on with HBase. He says, oh, my guys are on that. But that's really kind of the role here. HBase is quickly emerging in the open source community as the database on top of Hadoop, and applications are running, operations are running. So tell the folks out there what HBase is. You kicked off the conference. This is a technical conference. Explain to the folks what the HBase conference is. Well, be glad to do that. In Hasso's defense, we think that Hadoop generally, and HBase in particular, really are great complements to what he's built with HANA and, frankly, to what we've seen the relational database industry produce in terms of online transaction processing systems for years and years. So I can absolutely confirm that there's broad interest in integrating with HBase among the established vendors. You know, we've been big fans of Hadoop for four years, since inception of the business. Actually, my co-founders go back earlier than that. They were among the people who created the platform back in the consumer internet. Hadoop always got banged around a little bit, though, because it didn't provide sort of interactive-speed, consistent data delivery. You want fast record storage and retrieval if you're doing big data management. Hadoop is very good at relatively high-latency but also very deep analytics on data. Sometimes when you've done that analysis, you just want to fetch the results really quick. HBase makes that possible. It runs on top of Hadoop. So it takes advantage of all the scale-out that Hadoop offers, lives on top of HDFS. So data that you put in HBase is available for analysis using the full force of MapReduce.
But it's able to deliver records to users really at web speed, and it's scaled up to enormous deployments. Facebook, in the opening keynote today, announced that their entire messaging system runs on top of HBase. Billions of messages per day flow into and out of that system. Absolutely astonishing. I found the Facebook presentation to be quite informative and relevant, and it was quite surprising that they were sharing with the community their best practices and also specific use cases of how they roll out applications on top of HBase. Well, it's one of the unfair advantages that open source projects like Hadoop, like HBase, have: the users actually contribute to the platform. So Facebook is not merely a big consumer of the technology but also a pretty significant contributor. The HBase community spans a whole bunch of different organizations. StumbleUpon, Cloudera certainly is in there, many, many others. But it's able to be driven forward by the users with the specific problems, like Facebook, who know what their workloads are and what services they need. It's grown very, very quickly. In fact, one of the things I said in my keynote is, from a standing start a couple of years ago when we first noticed that HBase was getting adopted by Cloudera customers, it's become the way that our largest customers use Hadoop, and largest in two senses. The very largest data volumes typically are centered in HBase these days. And also the largest investment in the platform. Also, a good sign for HBase. Well, you know, we've been playing with HBase with our little data project at SiliconANGLE, and we would not have been able to do the things that we're doing without HBase, so it's absolutely true. Within the alpha geek circles, HBase is quickly rising to be that solution. But what you were just saying earlier about Hadoop, so you guys had a different perspective.
Hadoop was being adopted by the web-scale companies like Yahoo, and Amr once said on theCUBE, they saw the future, and Cloudera was then built. It was great for batch. But with all the interest in real-time business, with analytics being a focus of a lot of the business and big data, insights are really around real-time. What's your view of how real-time is impacting Hadoop and the Hadoop ecosystem, and HBase in particular? Well, you know, you make a good point. I made it glancingly earlier. People have knocked Hadoop a lot because it is this batch-mode platform, right? But in fact, that's not a fundamental limitation of the platform. That's just a missing feature. It's a bug, right? And over time, you should expect the community, broadly, to fill in those holes, to make the platform more useful and more consumable. Real-time data delivery, where you need large numbers of people to be able to get access to individual records at interactive or better speeds, absolutely is growing in importance in our installed base. Pairing the powerful analytics, the data processing power, with the ability to deliver results to an enormous number of people at web speed is going to be transformative, I absolutely agree. Let's talk a little bit about Cloudera. At the first Hadoop World in New York, you had NTT on, and I know you've since done a business deal with them in Japan. What's going on with the company right now? Because the market's growing in a very large way right now, not just within the small industry that was once Hadoop. You had Hortonworks come in, and there was a little bit of Hortonworks versus Cloudera, but now the market's growing so big. Talk about the market that you're in and some of the successes you've had over the past year. Well, I'm absolutely glad to hear that and appreciate the chance. You know, when we began Cloudera, we were the only voice in the wilderness. No one else embraced it. Nobody else was talking about the importance of big data to enterprise.
Fast forward four years to where we are now. Sure, Hortonworks is active in the market, and I will say doing great work along with us and many others on driving that Hadoop platform forward. There's more investment from venture capitalists in startups like ours. There is more direct participation by the big vendors. So IBM with BigInsights, Oracle with their Big Data Appliance, EMC with their offerings as well. You're seeing a real focus on big data as a business problem, not merely by wild-eyed visionaries in Silicon Valley, but by some of the biggest enterprise companies, some of the biggest business-focused vendors on the planet. And that's because customers are demanding the platform. They recognize that they have big data problems and they need a big data solution. We're seeing growth on every metric. Total size of the clusters that our existing customers are running, pipeline of new deals that are flowing in, even the size of the deployments, the investment that our customers are making in that infrastructure, all sharply up and to the right. I would say right now that the market is growing much faster than competitive pressure is emerging. Certainly, our last year was tremendous and we're looking forward to an even bigger 2012 on the back of that growth. Okay, well, congratulations. It's always been great to watch you guys grow, and being the founders of this space in a commercial way and then working with the community to keep that balance of the force, if you will, has been fantastic. Let's talk about Hadoop World this past year. You talked about the big data fund. We're going to have Ping Li on from Accel to talk more about that. But there was a real emphasis of almost a sigh of, not victory, but from your standpoint, satisfaction. Like we've gotten to a point now where a platform and applications was the focus. What is the current update from your standpoint on this application focus?
I know this is the fund with Accel, that's separate, but what's going on in the market relative to applications on top of Hadoop and HBase? Well, it's exactly the right place to be paying attention right now. So for the initial couple of years that we were in business, we were trying to convince people that this platform was important. We talked a lot about features. We talked a lot about scale. We talked a lot about performance. I think that battle's won. Hadoop is the anointed winner in the big data platform space. The question really is, how do you use it? Big data matters, sure. What are you going to do with it? How are you going to get at it? How are you going to look at it? What kind of applications are going to make it available? We're seeing now not just the big data fund investment that Accel and other VCs are making, but also large companies. I mentioned IBM a little while ago. Its BigInsights offering is very powerful, very interesting. I was just last week, pardon me, at Informatica World in Las Vegas. Informatica has announced version 9.5 of PowerCenter, PowerExchange, supporting Hadoop natively. So you can integrate data from Hadoop with all of your other infrastructure. That kind of integration, that kind of application support for big data, makes the platform accessible to users who really never had access to big data before. And I think you're going to see more of that from the established vendors. Certainly you'll see this year, over the course of the summer, but especially at Strata Hadoop World in the fall, a lot of very exciting companies with new analytic tools launch and show off what they've been working on in the lab. Pretty good stuff. Talk about what's happening here at the HBase conference. I'll say you guys seeded and invested in Hadoop World, and now it's being run by O'Reilly. Again, you're taking the ball in your hands here by creating this technical conference.
What's happening here on the ground, inside the halls here? So I'll give you a couple of quick anecdotes and then I'll dive into the actual content. When we ran the first Hadoop World in New York City, we had about 500 people show up. We decided that HBase, just this one component of the stack that we distribute now, was getting interesting enough that we wanted to do a conference for it. We budgeted 500 heads. We got space for 500 people, sold out weeks ago. We scrambled, we managed to work with the venue here to add an additional 100 heads, instantly sold out. 600 people showed up for what you could think of as a pretty deep technical conference. I bet that we're gonna do this again and I bet that we're gonna see some pretty substantial growth, just as we did with Hadoop World. Look, we concentrated not on vendor pitches but on real stories by real users of real use cases. Talking about what they're doing with HBase, applications that they're running on top of it, their experience operating the system in production. And for the developers, what we need to do next, what features are missing, what enhancements we ought to make. More than 100 submissions came in. I thought that the program committee did a great job narrowing that down, but even so, we had to add an additional track. There was so much great content. So the one problem with this event is that you're not gonna be able to see it all. It's exactly right. It's too much demand. That's great. So the interest is exploding everywhere. As for the profile, you have a history in the database market. You sold your last company to Oracle. So you've kind of seen this movie before in the database days. It's kind of different now. What is happening? Because you've got a lot of younger engineers coming in, computer science students. It's not your old-school systems guys, although there's a lot of that going on.
We talked about this last time we met, but there's a new school of computer science folks coming into the industry here. What are you seeing as the profile, the makeup of the kinds of computer scientists and/or developers in the space? And is it more software? Is it more DevOps? What's your view on that, given your experience? The line between software development and operating systems in production absolutely is blurring. And especially at some of the very biggest companies in the world, DevOps is a legitimate profession now. You'll be a programmer who, by the way, carries a pager, makes the systems run. Not everywhere, but the fact that we are now building software that can be operated by the people who wrote it means that the feedback loop, what features are missing, what do we need to do to make this more manageable, is much, much tighter than before. Systems are getting better faster than they ever have. And it's a good thing, because the scale of the problem and the importance of the data analysis, getting your hands, getting your head wrapped around that data, is more important than ever before. But I think that feedback loop being so tight, the fact that we're able to work in the open source community, take advantage of the innovative work of the entire planet rather than of a single company, is all very, very promising. So a real important question I want to ask you is, do you have a computer science degree? I have two. I got a bachelor's and a master's, Go Bears, from the University of California. You can prove that. I can prove that. I have a computer science degree as well. So we just want to get that on the record, make sure that we've got the computer science credentials nailed down, unlike some other CEOs out there in the Valley these days. Okay, a real final question about the operating system, which is really a good thread.
So with this DevOps movement, with tools like HBase, where we have real-time analytics, real operational encoding, there is an opportunity for developers and entrepreneurs. We have operating systems, it's complicated, and there's a need for abstraction layers. What do you see as areas, if you were going to do a startup or if you want to talk to the entrepreneurs out there, where are the areas to really innovate, with some space to be creative and execute? Well, I will say, as I have said now for pretty close to a year, building applications that digest, that analyze, that process big data is where the money's going to be for the next five, six years. If you've got a great idea about how to apply machine learning or natural language processing or statistical analyses or other analytical techniques to big data, and you write on this platform, through these APIs, using HBase and other systems and abstracting them for users, I think you'd be able to make a bunch of money. I absolutely think that data expertise is going to matter, and there's this new discipline called data science. It's a blend of programming chops, computer science ability, some knowledge of mathematics and statistics, but also just a deep understanding of the business problems that confront your organization. That's a profession where I think we need great people and where there are some pretty great salaries. Let me ask a final question here. I know you've got to run, I really appreciate your time. Let's talk about Facebook. Obviously the big IPO. Mark Zuckerberg got married; I was actually walking around the block and saw all the things going on with my kids. You know, one additional accomplishment not many people know about: just over the past week, they also went to IPv6. So that's four things that Mark did in a week that were pretty remarkable. I mean, he's hacking Silicon Valley, he's hacking Wall Street, he's hacking the network. But Facebook really is one of those companies. It's a watershed moment.
My commentary is that it's a lot like Netscape, in the sense that the browser was this watershed moment. But yet it's Google-like in the sense that they have a lot of Google employees. Facebook is here presenting their ops, dev, DevOps and/or Hadoop and HBase philosophy with the community. So they're openly sharing that. But what is it about Facebook that people don't understand? Because a lot of people are talking about Facebook like they don't have a revenue model, and we've been saying on theCUBE and at SiliconANGLE that it really is a data business. Can you share your insight, just your personal perspective, for the folks out there on the East Coast, maybe New York, trying to grasp Facebook's possibilities? Well, you know, one thing I'll say is, you look at the S-1 filings: in fact, 3.7 billion in revenues. They do have a revenue model. The question I think some investors have asked is, does that revenue justify the valuation they've gotten? That decision's going to be made by pros in the market. I'm not an expert on that stuff. I'll say, I love the company. I use it all the time. I am an avid consumer of social networking services. So I post my updates, I keep in touch with my kids. I let my friends know what's going on. I enjoy being on the Facebook site. But if you're at Facebook, I don't think you think of yourself as a social networking company. I think you think of yourself as a data analysis and data capture company. And listening to Karthik's keynote today, talking in detail about how much information flows through HBase, more than a billion messages a day, billions and billions of operations per day, 250 terabytes of new data a month. Building the infrastructure to make that happen is a remarkable accomplishment. Michael Olson, the CEO of Cloudera. Thanks for coming on theCUBE. Appreciate it. Great to see you. We'll be right back after this break.