Here with the two stars of the show, or at least two of the many stars: Mike Olson, CEO of Cloudera, and Ping Li of Accel Partners. You guys were up on stage today. Big announcement, $100 million fund, very exciting. Congratulations to both of you on that and congratulations on the third Hadoop World. Yeah, you bet. Thanks very much. It's great to be on theCUBE and to have a chance to catch up with you guys again. It was great last year. This show, as you can tell, is a bigger deal by far. The number of companies, the number of people, the momentum, the energy in the room, just fabulous. We are thrilled. Mike, you were on stage giving a great keynote, a lot of good buzz on Twitter. We were watching it, taking some photos. You kind of walked down memory lane, where we've come from and what's going forward. You mentioned that three years ago it was speeds and feeds, the past three years were about getting all the code going, and now it's growing like crazy. And going forward, it's about business and the tools and the platform. Can you just share that view quickly? Yeah, no, I think that's exactly right. You know, when the platform was brand new, evangelizing the platform was important. How does it work? Why is it different? What can it do for you? But we've seen adoption now, not just in the web space where Hadoop was born, but in financial services and telco and big old-guard brick-and-mortar retail, in a bunch of vertical markets where business problems matter. The users we see rolling this software out now are focused on driving revenue, on understanding customers, on solving real mission-critical problems for themselves. It's now about analytics. It's about insight you can garner from the data. It's no longer about is Hadoop cool? That question I think is settled. Hadoop's pretty cool. Hadoop is now transforming the way that people can operate their businesses and those applications are going to drive the next round of adoption.
You guys just closed $40 million in fresh financing. Congratulations. Thanks very much. We had the lead investor, Frank Artale, here on theCUBE earlier, right before you guys came on, and he was so excited and he said, quote, I love the platform play of Cloudera, and platforms are good business, but ultimately you're judged by what sits on top of the platform. So I'd like to bring Ping into the conversation with you here, because you guys announced on stage the big news this morning, which was that Accel Partners is doing a $100 million big data fund. And people are seeing these kinds of purpose-built funds. We saw Kleiner Perkins with the iPhone fund, very successful, and the iPad fund and the social fund, or whatever it is called. So, one, is it because there's a whole new set of ventures being created on top of the platform? And talk about the fund if you could. Mike, if you could talk about the stuff that's going to be on this platform that you're seeing that's new venture creation oriented, and Ping, you can then talk about the fund specifically. Yeah, I'll be glad to talk about some of the innovation we see happening, but actually let me let Ping talk a little bit about Accel's motivation, what it saw happening in the market that made this a sensible, a no-brainer thing to do. Sure, and as you know, John, Accel has always been very thesis-driven in terms of the sectors and categories we want to spend a lot of time in. So the big data fund, although it's a new fund, is not a new area of focus for Accel. Obviously our investment in Cloudera predates this fund by quite a number of years, and we have a lot of other companies, Couchbase and others, that are innovating up and down the big data stack. And the fund was really an opportunity to capture and stimulate innovation on top of the Hadoop platform and other big data platforms.
It's very rare to have a platform company. One comes around every decade or so, whether it's the iPhone, Facebook, Google, or VMware, and these platform companies really create a bow wave of innovation behind them and around them, right? And I think that's what a lot of the big data fund is about. Through our involvement in Cloudera, we have seen how Hadoop has been deployed at enterprises far and wide and for applications that we probably didn't realize were possible before Cloudera got going. So I think this is to further that and complete that cycle with more applications built on top of Cloudera and actually underneath Hadoop as well. We asked Frank, why did it take so long? Because it's kind of obvious, but you're first. Give us some insight there. You know, I think it takes a while for these technology platforms to solidify and become fertile enough and mature enough for entrepreneurs to really realize that there's value to be built on top of them. And I think it takes time for startups to kind of emerge. And I think just in the last six months, I've seen an increased level of activity in the ecosystem. That's kind of what drove it. You never want to be too far ahead of the curve. You kind of want to time it when the entrepreneurs are ready to go. Just a couple quick questions on the fund. Is it a new fund? Is it like a separate fund with limited partners, or is it part of another fund? It's very much going to be managed by the general partners at Accel and it's going to be part of our existing funds. As you know, we have multiple billions of dollars in capital worldwide and this is really a global initiative. So our Europe, China and India partners are working closely with the US partners. So we'll invest out of those funds, but really dedicated to forming a coherent portfolio of companies around big data. Is Cloudera part of the fund? Are they going to be advisors? Is there a set of advisors?
Cloudera predates this fund but is obviously along that thread of big data that we've been lucky to be on. QlikView is another company we took public last year that we invested in years ago but is also part of that thread. Fusion-io is another example. But these are going to be for new companies that have yet to be funded. So essentially whoever's in the Accel family is an advisor. That's pretty much what you're saying. That, and also we brought together some industry advisors as well. I think, you know, not just industry but also data scientists, open source experts like Doug Cutting. We have a professor from Stanford Computer Science, Jeff Heer, who's a thought leader in data visualization. So we've had a lot of external resources to help really drive innovation, really articulate where the opportunities are. Hilary Mason, who's a CUBE alum. Yeah. Oh, those names on the slide you presented. Yeah, Hilary Mason at Bitly, so. Yeah, that was a good list of folks up there. Yeah, I'll say from Cloudera's point of view, we're tremendously excited about this fund. We believe that Hadoop adoption is going to be driven by applications, analytics, visualization tools. In order to do what I told you guys earlier, in order to build business value, we really need innovation in the applications. Now that the platform has arrived, we need apps that make it accessible. So while this isn't a Cloudera initiative, it's one that Cloudera is enthusiastic about. I'm thrilled that Accel has done this. I think that the venture community generally realizes the potential of big data. I think that it's only now that we're starting to see interesting application, analytic, and business tool investment opportunities come into existence. And that's because the platform had to achieve some ubiquity outside the web space before that could happen. Yeah, we were talking to Kirk earlier about the competitive stretch.
Obviously competition has entered the space. You guys had been untouched for years and then all of a sudden there it is, and that's extremely positive validation. And we talked about, okay, what do you guys do and how does he spend that $40 million to compete? And we kind of brought up the analogy of Google, right? Google was number one in search, and how much more market share can you grow unless you add more users? And their whole goal was to add more users. So he was pointing out that by adding more Hadoop users, Cloudera's role will increase. So that seems to be the competitive strategy, and this fund plays into that. Is that correct? Yeah, absolutely. And I actually think the rise of the data scientist is probably one of the most amazing transformations that has been happening over the last couple of years, and people like Hilary Mason and Jeff Hammerbacher are really pioneering a whole set of thinking around how to use big data for new applications. And I think it's going to be more than just data analytics and data visualization. I think collaboration and CRM are traditional software categories that are going to be reinvented on top of Hadoop. So I think it's exciting. What kind of investments are you looking at, up and down the stack, across the board? Anything in particular hot? I mean, I think the charter is really anything in the ecosystem. So we are obviously looking at things that are more infrastructure, like cloud storage and areas around automation and data center technologies that can make the data platforms more effective, but also the applications on top, and that's going to be as broad as a project management tool, a collaboration tool, or a mobile application that uses all these technologies. I think we're only going to start to see new opportunities as big data becomes ingrained in our thinking.
The explosion of data from mobile devices, from sensors, from systems in the world, machines talking to machines, has really driven the need for storage, capture, and analysis. The question is, once you've got that data, what kind of questions can you answer? To Ping's point, data scientists can help us answer those questions. On the other hand, the expertise, the insight of those data scientists needs to be rendered into tools and applications, into the next generation of infrastructure that ordinary mortals can use. I don't think I'll ever be a data scientist. I love hanging out with them, but I want them to reduce their expertise to software that I can use to run my business, to run my life. Mike, you're technical, though. I mean, it's cool to hang out with data scientists because they're smart, it's always cool to hang out with smart people, but they're really changing the game, and we heard that as Kirk was saying, you know, it's disruptive when it's hard to find people to hire because they don't exist, unlike the DBA market, which you're familiar with. But what do you envision the enablement above the platform to be? I mean, technically speaking, what are the big opportunities for startups and entrepreneurs? Is there any glaring white space that you see? Because Cloudera is an operating system, essentially a platform, a lot of commodity machines, and you've got to tie all that together. What are the white space opportunities that you see? If you've got a deep understanding, if you, in your gut, understand a data type, a data source for some particular vertical market, from some particular device or class of devices, if you know where that stuff comes from, what it can tell you, that knowledge, I think, applied to this platform, to big data in general, rendered into useful applications, is gonna be the big opportunity. And I can't say specifically what those are. In financial services, do you think there are some of those opportunities? In healthcare?
Absolutely, but really the question is, can we get people with taste and judgment about data productive with this new generation of platform technology? My bet is there's a lot of money to be made, not just by the investors, but by the entrepreneurs and the people who get those products and apply them to their business. That's a great point. I mean, Ping, I'll ask you a question to riff off Mike there. In the venture business, there's an old saying: oh, that's a feature, that's not a company. I mean, what he's basically saying, and what we're seeing in the marketplace, is that someone who's actually uniquely aware of data, a feature, if you will, can build a company, because you can actually use the speed, the competitive advantage. You can actually make a lot of money. You don't have to have the whole platform built if there's a turnkey platform. Does that change the nature of the investment? I mean, that whole thing is kind of a cliche, it's a feature, not a company. But you're seeing these lifestyle businesses that are getting some traction, and they could always pivot and expand. I mean, a kid who understands big data could create a reservation system, possibly in airlines, and start an airline, and who knows, right? I mean, this is kind of new thinking. It's a mindset. Yeah, I think it's a great point, because a lot of the interesting software application companies feel like features when they first get going. It's something that's very lightweight, easy to use, gets adopted in the consumer's or enterprise's hands without a big sales force. And before you know it, everyone's using it and it becomes a product slash platform, right? And I think a lot of the consumerization of IT that we are investing behind is really about building products that are consumable and therefore a lot of times look like a feature, right?
And it seems like it sneaks into the enterprise before people actually realize they have to buy it. It's viral, right? It's viral. I mean, you know, what is Dropbox? When it got going, it felt like a feature, right? But now it's your filer. FTP client. Now it's your FTP, now it's your filer, right? It's now your storage infrastructure for the cloud. So, you know, I think that's going to be a wave of software innovation that, you know, we have to sell. That's amazing though. But this changes the whole venture game, right? It's like, you know, don't discount these little new innovation ideas that spark on the scene. No, you know, a lot of our investments these days are behind companies that didn't raise any venture capital but have gotten to a certain scale and are now looking for growth capital. It's because they're able to get going without a lot of heavyweight, you know, development in order to develop the initial insertion feature. I don't want to minimize the challenge in starting and running a business. Certainly it's a challenge I've lived repeatedly. I do think it's easier to start a company, for a bunch of infrastructural reasons, today than ever before. But in the big data space, I will also say the ability to work with data at scale, to basically deploy these tools, to get it on a platform that makes it manageable, is brand new. And what that means is even some relatively simple ideas are gonna be transformative. I think you're gonna see some really exciting new companies start now, not with deep, deep insights about how to do processing, but by using brute force across more data than ever before. I mean, Peter Norvig at Google famously said, we don't have better computer science, we don't have better algorithms than you guys, we just have more data. And I think you're gonna see important innovations be driven by just scale. Simple algorithms applied to absolutely everything you can eat. I think it's pretty exciting.
Data is the new development kit. I wrote a post three years ago on that. I think that's now come true. Just to switch gears, Mike, you have a lot of experience. You sold your last company to Oracle. We had some guests on earlier on theCUBE. We were talking about the relational database market, how there are some parallels to how that developed, very disruptive. Early on, some of the core table stakes stuff was developing, the minimum features, and then the replication, all that stuff came in. What do you see right now relative to that? Is it a good analogy, and what can you draw on from your past experience to apply to the Hadoop market around the evolution and the next steps? I'm fortunate. So when I was a young programmer in the 1980s, very early 1990s, I got to watch the development of the relational market play out firsthand, not as a company leader, but as a participant in the market. And as you say, there was this broadly useful horizontal platform that could solve a whole bunch of different problems with, let's be honest, in the early days, not that many apps. You had to kind of roll your own. As soon as a core group of vendors, IBM, Sybase, Ingres back in the day, later Oracle, invested in making it consumable, manageable, monitorable, they created an ecosystem of partners, of other vendors that built value on top, and look at the size of that market today. I think we're in exactly that spot with big data in general, with Hadoop in particular. The Apache Hadoop community has created something tremendous. The opportunity for newly funded businesses, for established companies to move into this space, to deliver that innovation on top, and to participate in that ecosystem, to be one of the companies that makes value, that makes a living, I think it's gonna drive this market to be every bit as big as the relational market. I will say that I don't see them as competitive in particular right now.
The applications that relational systems have solved for 30 years grew up alongside those systems; they co-evolved. So it's unlikely that Hadoop would outperform those on old-guard workloads, but big data, complex data, sophisticated analytics, Hadoop's game-changing there. So to follow up on that, I mean, everybody loves to put Hadoop into an analytics bucket. Are you guys forecasting that that is gonna change and evolve into maybe even transaction-oriented workloads, and how do you see that shaping up? I think there's plenty of homework for us to do on the Hadoop platform, making it more scalable, more reliable, more consumable today. I don't think we need to be eating anybody else's lunch for the time being, and I think the greenfield opportunity of big data is so enormous that there's plenty of room for innovation. Over the course of five years, who knows what happens? In general, technology fractures, explodes into lots of variability for a short while, and then begins to coalesce into a single platform again. I mean, it's like you had Sybase, Oracle, you had all the early ones, Informix. Yeah, well, and there were network databases, and I mean there were lots of different data structures. The relational market, over time, offered the critical features to its customer base, but it did take some time, and in the meantime, lots of innovative thinking, lots of new value creation. So right now I think we're in that explosion of Darwinian variability. And over time, we'll see who wins. Certainly it's a leader. And I think the relational database won the war in many cases because the application developers embraced them and they completed the value proposition to the user. And I think that's really what is exciting if I look forward in the next couple of years. I mean, there's gonna be a role for people who are building applications that don't bring up Hadoop because it's assumed to be a part of the platform.
And I think that will be when it's become truly mainstream. Yeah, I think that's right. Nobody bought Sybase or Ingres back in the day because it was easy to manage. Likewise, nobody buys Cloudera Enterprise because it's easy to manage. They want the Hadoop platform for analytics. Yeah, we need to make it easy to manage, but that's merely to enable the application to deliver value or do its job. I mean, it needed to work at solving obvious problems that could never have been solved before. Okay, quick question. Then I have the Linux question, so you can start noodling on Linux and Hadoop, that big comparison. We're gonna drill down on that. But my quick question is, Mike and Ping, what surprised you between last year and this year? Is there anything that's jumped out at you besides all the trend data? Other than the fact that Oracle's here. And actually, Oracle owns the promoted tweet keyword, HW2011. I'm sorry, yeah. Thank you, Oracle, that was awesome. Yeah, we didn't get any money out of that. So Oracle, sponsor SiliconANGLE, we need the sponsorship. No, seriously, I mean, outside of some of the trend data, what are a couple of things that really surprised you this year, where you said, wow, I didn't really think that was gonna happen? We always believed at Cloudera that Hadoop was a big deal and that the rest of the market would recognize that. It has been breathtaking how quickly that realization has dawned. Large established vendors, Oracle and others, have made announcements about Hadoop adjacency, or in some cases, Hadoop platforms, Hadoop analytic offerings, Hadoop applications. And not just emerging startups, but big established vendors. That pace, that explosion of interest at scale has been wonderful, a little surprising, but wonderful. For me, the only thing I would add is it's been completely energizing to see how it's not a Web 2.0 phenomenon.
Hadoop has always been tied to Web 2.0. It's about Google, it's about Yahoo, it's about Facebook, and... Doesn't apply to us. It doesn't apply to us. And now, when I talk to enterprise CIOs and IT folks outside the Valley, I don't have to explain Hadoop anymore. They almost say, yeah, we know Hadoop. Give us a data scientist so we can build these four applications. That has been an incredible shift in the last 12 months. It's not a Valley thing anymore. It's mainstream. Well, I'm surprised. Last year, we were joking, Mike, we said, you know, big data revolution, you know? And we really kind of felt that, but again, I agree with you, it was a massive surprise, and we're psyched to be in the front row to see it. But when we talk to people out there, they like to put things in little buckets. Oh, it's like Linux, you're the Red Hat of this. When it's actually not shaping up like that, though maybe there are some comparisons. And Amr was breaking this down before; he's going to come back at 2:30 to break that down further. Can you elaborate more? Some technical similarities, but differences. Business model similarities, but differences. Sure, so let me begin by saying one striking similarity to me is between the Hadoop platform and the Linux platform. At the heart of what we call Hadoop is the Apache Hadoop project. It is surrounded by complementary open source projects for data loading and for query and for coordination and so on. All of it from the Apache Software Foundation or Apache licensed, all of it integrated. That collection of software together is consumable. Just like the Linux kernel from kernel.org all by itself is of no use. You need Red Hat, you need Debian or someone else to surround it with the apps and with the tools that make it useful. So that's a very striking similarity. Hadoop as a platform is broader than just the Apache Hadoop project. It's a bunch of complementary projects. Our business model is different from Red Hat's.
We're concentrating on management, monitoring, and administration, and not on a pure services and support subscription business. We do have a diverse product line that's got proprietary stuff and open source stuff. But we're committed to the open source platform, to making sure that analytics and data storage at the heart of the platform are open source, and that's because it's the best way to collaborate and it's the best way to drive adoption. And frankly, it's what customers want. They don't want to be locked into a single vendor for the long term. Maybe another way that we've looked at it is, you know, Linux was really about commoditization. It was taking existing operating systems and commoditizing them to save money and create an ecosystem around that. That's not what Hadoop's about. Hadoop, to me at least, is about value creation. It's about enabling new applications to solve new problems and help enterprises make more money. Or analyze their data in a way that can develop new features. So I think it's a very different kind of paradigm shift in terms of what they're trying to achieve. And therefore the business models and, I think, the ensuing ecosystem will be quite different as well. So you're saying people weren't getting a sustainable competitive advantage from Linux. They were maybe cutting costs and lowering TCO. People are getting a sustainable advantage from data. That seems to be the new source of competitive advantage. I think that's right. And in fact I would say, if you look at open source historically, with MySQL and Postgres, we totally understood relational databases. We just built some free ones. Linux: operating systems were not rocket science. We just built a free one. JBoss: look, middleware was well understood. We just built a free one. Hadoop, we're not knocking anything off. This is an innovative platform. It is chasing nothing at all. It's creating this new capability, this new market. It's a pretty exciting place to be.
That's awesome. Well said. I think that was really well put. That's exactly what it's doing. It's creating lift to kind of break the glass ceiling that the web hit us with. It brings more to it. And that impacts everyone. We were talking last night at dinner. It's like, it's not just the Web 2.0 companies. The web is everybody. It's mobile. The cloud is going to be a big part of that. The more connected you are, the more data there is. Okay guys, we really appreciate your time. I know you're super busy. Mike, congratulations on the funding. Ping, congratulations on the $100 million fund. Congratulations on all the success. It's been great to be a part of it. I want to thank you personally for the collaboration this year and for your office this past year and a half. We've been in your office. We're moving on to find some new space. We appreciate that you guys are growing out. Growing so fast, and it's been great. You've enabled us to have an opportunity to be part of it and we're excited, thanks. Well, we appreciate the chance to be here, to be on theCUBE, and also I want to thank everybody who managed to get to Hadoop World this year. We look forward to seeing you next year. It's going to be a great show again, I'm sure. Thanks.