Big data is big business. Good day everyone, this is Dave Vellante coming to you from Wikibon's headquarters in beautiful Marlboro, Massachusetts. We are live on theCUBE, SiliconANGLE Wikibon's flagship video product, where we bring you all the news and action from the world of enterprise technology. Now today, we're here to talk about big data and we have some big news. Wikibon has just released a groundbreaking study quantifying the size of the big data market. Now to our knowledge, this is the first study that actually forecasts the size of the big data market with actual vendor market shares. Now, I wanna go through some of the findings of the study. First, the worldwide big data market is estimated at $5 billion in 2011 and that's predicted to grow to more than $50 billion by 2017. That's a whopping 58% compound annual growth rate. Now according to Wikibon, IBM leads all players with more than a billion dollars in big data revenue, while HP's Vertica division leads what we call pure play innovators with a 28% share of revenue generated by these smaller, more focused big data companies. Hadoop platforms are powering many big data innovations and driving adoption in emerging application segments. Now services accounts for the biggest chunk of big data, 44% of the spending, followed by hardware at 31% and software at 25%. Next generation data warehouses are upending traditional enterprise markets with massively parallel, columnar analytic databases that deliver faster load times and near real time analytic capabilities. Now I'm pleased to have Jeff Kelly on the line with me. Jeff was the primary author of the study and Jeff is also Wikibon's lead big data analyst. Are you there Jeff?
I am, thanks Dave for having me.
You're welcome. Now Jeff, I have to ask you, five billion, that's a big number as a starting point and I'm sure a lot of people are surprised by this figure. How is it that big data, such a new and emerging market, is so large?
Oh well Dave, it really comes down to the definition of big data. We decided to cast a pretty wide net and include spending on products and services from both traditional vendors and some of the big data pure plays that you mentioned.
Okay, so what is Wikibon's definition of big data, Jeff?
Well, we keep it pretty simple, Dave. Wikibon defines big data to include data sets whose size and type make them impractical to process and analyze with traditional database technologies and related tools. So as a result of that definition, our figures include technologies, tools and services designed to address this shortcoming. So this includes Hadoop distribution software, the related Apache Hadoop sub-projects and related hardware. It includes the next generation data warehouses and the related hardware there, and data integration tools and platforms as applied to big data. Then of course you've got big data analytics platforms and applications and data visualization, again as applied to big data, as well as support, training and professional services in the big data space.
So for example, would you include Oracle Exadata in these figures?
Yes, that's right, we would.
All of Exadata?
No, not all, but a reasonable size portion actually.
Okay, so how did you determine what that portion would be?
Right, so as with all the vendors we ranked, we used a weighted combination of factors to determine the percentage of Exadata revenue that we qualify as big data. Those factors include of course media reports around Exadata deployments. We talked to a lot of members of the Wikibon community, and we also used extrapolations based on Oracle's Exadata pipeline. Ultimately, in the case of Exadata, those deployments that qualify as big data did so mainly due to the volumes of data involved rather than the type of data involved, meaning unstructured data.
Okay, let's go to some of the graphics that we have from the report.
The first one that we want to show you is the forecast of the market. Now I realize it's a little small, a little hard to read, but the important part of this chart is that the shape of the curve is what we call an ogive curve or an S-curve. Some of you may be familiar with that terminology. Now the way an S-curve works is on the horizontal axis you've got effort and on the vertical axis you've got return. And so when you're in the flat part of the S-curve, which is kind of where we are now, you've got to do a lot more heavy lifting to get a lot of value. Now as you start to get to the steeper part of that S-curve, a little bit of effort gives you much, much greater value. And so we're predicting, if I understand this properly, Jeff, that the steep part of that S-curve is really going to hit within the next couple of years. Is that right?
Absolutely.
Yeah, so now what you see of course is that's really where big data goes mainstream, and then you see this really strong forecast that shows up to $50 billion by 2017, which is a significant growth market, Jeff.
Oh absolutely, we think big data has huge potential. Especially once we hit that mainstream adoption, that is, moving beyond the web vertical market that is revolutionizing big data, we think it's really going to just skyrocket.
Now let's take a look at the bar chart, which shows some of the pure plays. Jeff, you chose to segment out the so-called pure plays. So talk a little bit about that, why you did that, and talk about some of the vendors that are leading there.
Right, well we decided it was a good idea to break out all the pure play vendors in the big data space because these are the vendors that are really doing a lot of the innovation, most of the innovation in fact, in the big data space. So we broke them out, and these vendors include Hadoop distribution and software vendors.
As you can see, Cloudera is probably the leading Hadoop distribution vendor in terms of revenue, with $18 million at this point. Vendors like these are really working very hard to make Hadoop a more solid, stable platform capable of something closer to real-time analytics, although there's a lot further to go there. So they're improving security and, as I mentioned, stability, and increasing performance, and kind of bringing Hadoop to that point where enterprises are comfortable deploying it in their IT environments.
Now, Dave, if you wouldn't mind putting up the pie chart with the pure play vendor share. Now the pure play vendors accounted for about 300 million, Jeff, and you had Vertica leading with just over $80 million in revenue, about 28% of that pure play. But HP acquired Vertica. Why did you choose to put the likes of Vertica and Greenplum and Aster Data into the so-called pure plays?
Well, we decided to do that because the acquisitions are fairly recent and to this point, these vendors really haven't been, I wanna use the word, we use the word polluted in the report.
That's a good word, okay. They haven't been polluted by their acquirers, is what you're saying.
Exactly, so they're still being allowed to operate fairly autonomously, and we think that's a good idea because, I mean, these are the vendors that were really driving innovation in the data warehousing space and we don't see any reason to get in their way. So this is another segment of the pure play market, the massively parallel, in most cases columnar, analytic databases that focus more on structured data, but large volumes of structured data coming in, loading that data very quickly and making it available for real-time analytics.
Okay, that's good.
Now, Dave, if you wouldn't mind bringing up the factory revenue by hardware, software and services. I wanna share with people that the services revenue is the biggest piece of the pie and, of course, IBM is one of the leaders there; they're the largest services company on the planet. And then you see services has a 44% share, hardware 31%, and the remaining chunk is in software. Why isn't software larger, Jeff?
Well, part of the reason for that is, of course, that Hadoop is an open source framework. Most of the big data pure plays that are commercializing Hadoop are still giving away the software for free. That is, most of it's Apache compatible, free to download software. They are monetizing Hadoop around the services aspect, in terms of managing Hadoop and training, at this point. Most of the software is free, so you're not getting revenue directly from the software.
Yeah, so you'll get it from aftermarket services, like you said, training and certification and other services. Kind of the Red Hat model, if you will. Now, Dave, can you go to the big data vendor table? Jeff, you've quantified shares for more than 40, I think it's even more than 50. You've got a category called others and I understand there's like five or 10 in there, so it's more than 50 vendors that you've categorized here, haven't you?
Absolutely, and we think it's a large and growing market. I mean, there are new big data startups popping up almost every day, it seems, and for good reason. As we talked about earlier, we think this is gonna be a huge market with the potential to impact pretty much all vertical industries. So yeah, it's a pretty big market at this point. We've got all the major enterprise software and hardware companies really getting involved, as well as, as we've mentioned, all the pure play vendors, and we think that number is in fact gonna grow in the short term.
In the long term, it will probably shrink when consolidation occurs, as it usually does in a hot IT sector such as this. We mention in the report that we think it might be somewhat similar to what happened in the business intelligence market, if you recall, a few years ago, where a lot of IT whales acquired some of the BI pure plays. So we think that will probably happen in the long term. In the short term, with all the venture capital available from Accel Partners' Big Data Fund, for example, and other sources, we do expect to see a lot more startups hitting the market in the next year.
Okay, thank you, Jeff. Now, as far as the report goes, how can people get a copy? Do they have to pay for it?
Nope, the report is not for sale. People can just go right to wikibon.org, search for the Big Data Market report, and find it right there.
Awesome. We'd like to keep things free and open here at Wikibon. Please check out the report. Go to Wikibon, and in the search bar you can just look for Big Data Market or Big Data Market revenues and it'll pop up there. Jeff, thank you very much for coming on. When did you start this research?
We've been doing this research, I'd say, for the last six to nine months, really. Pretty much since I joined Wikibon, we've been looking into this type of report and just talking to as many members of the Wikibon community as we can, including vendors and partners, end users, customers, VCs, you name it, we've been talking to them. So this has been quite a long effort to put together this report.
Excellent, well congratulations on getting out such a great piece of work, and I know you've got a busy schedule ahead of you. Of course, we're going to be at Strata next week. Watch for us there at the big data show. We're partnering up with O'Reilly Media, and of course our partner SiliconANGLE will have theCUBE there. So look for continuous coverage. We'll be there on Tuesday and Wednesday and Thursday. We've got the big team going out there.
Jeff, nice job. Thanks for coming on theCUBE and thanks everybody for watching and we will see you next time. Bye for now.
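For anyone who wants to work through the headline numbers from this segment, here is a minimal sketch of the standard compound annual growth rate calculation, applied to the roughly $5 billion and $50 billion figures quoted at the top of the show. The function name `cagr` and the variable names are illustrative only and do not come from the Wikibon report. The number of compounding years is also an assumption, since the segment does not spell it out, so the sketch prints both a five-year and a six-year span. It finishes with the pure play figure mentioned for Vertica, 28% of about $300 million.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.5 means 50% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes quoted in the segment, in billions of US dollars.
START_BILLIONS = 5.0
END_BILLIONS = 50.0

# Assumption: the compounding period is not stated in the segment,
# so both a five-year and a six-year span are computed.
five_year = cagr(START_BILLIONS, END_BILLIONS, 5)
six_year = cagr(START_BILLIONS, END_BILLIONS, 6)

print(f"5-year CAGR: {five_year:.1%}")
print(f"6-year CAGR: {six_year:.1%}")

# Pure play check: 28% of roughly $300 million, expressed in millions.
vertica_millions = 0.28 * 300
print(f"28% of pure play revenue: about ${vertica_millions:.0f} million")
```

Under these assumptions, the five-year span comes out at roughly 58% a year and the six-year span at roughly 47%, while 28% of $300 million is about $84 million, in line with the "just over $80 million" mentioned for Vertica.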