Live from San Jose, it's theCUBE, presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners.
Good morning, everybody, and welcome to Big Data SV. My name is Dave Vellante, and this is our 10th Big Data event. We started in New York City. We've done five now, and this will be our fifth in Silicon Valley. We've done five in New York City. And we started SiliconANGLE and Wikibon covering the Big Data space in 2010. We did our first Hadoop World, which was actually the second Hadoop World in New York City. In 2011, we put out the industry's first Big Data report, and it caught the industry by fire. It was the hot topic. The concept of Hadoop was profound in that the idea was to take five megabytes of code and bring it to a petabyte of data, metaphorically, if you will, because moving data around was so problematic. And that concept really took hold. We asked questions at the time: who will be the Red Hat of Big Data? Is this going to be a winner-take-all market? Will this trend, this Big Data trend, solve the problems that decision support and business intelligence couldn't solve? We're going to talk about that today and throughout the week. We've just released Wikibon's Big Data market study and Big Data market shares and key findings. I'm here with Peter Burris, who heads up the Wikibon research organization, and George Gilbert, who leads our Big Data research. Gentlemen, welcome to theCUBE. Good to see you guys. So, we have this open source marketplace. It's been plagued by complexity, competition. The cloud really changed things. Peter, you've been studying this for a while. You just dropped that awesome report on wikibon.com. What did you find? What were the key trends that you saw in that report? Lay it out for us.
Well, the most important trend is that users are starting to drive what happens in the Big Data universe.
For many years, it was the individuals that were primarily responsible for creating a lot of these open source tools. And in the process of creating these open source tools, they solved each other's problems as opposed to solving user problems. Users then found themselves, or enterprises found themselves, building out clusters, deploying Hadoop, really focusing a lot on the infrastructure, which had its pluses and minuses. But what we see happening in the marketplace today really is a bifurcation in the Big Data space, where we're seeing a continuing focus on the infrastructure elements. And we'll spend a fair amount of time talking about what that means from a hardware, database, and related technology standpoint. And then there's a much greater focus, based on user and enterprise experience, on how to turn this into applications that actually have a consequential impact on the business: machine learning, AI, how the pipelines work, how the personnel work, integrating business, changing the way business thinks about the role the data's going to play. And that bifurcation is going to carry forward over the next few years as we gain more experience. And the entire industry is going to go through a process of restructuring itself to serve both sides of those needs.
Great, so, George, I want to ask you. So this is not a winner-take-all market. There is no Red Hat of Big Data. Certainly it's not Cloudera. Hortonworks kind of threw a wrench, maybe, into some of those plans, trying to play the long game with a pure open source play. The return on investment of Big Data oftentimes turned out to be a reduction in the denominator, a reduction of investments, if you will, lowering spending relative to traditional data warehouses. I ask you, you've been following this business for a long time: did the Big Data promise fail to live up to expectations?
There are multiple layers to that question and to the answer.
I would say that the sort of "let's offload some data warehousing processing" was the application that IT could attack to justify their experimentation with Big Data technologies, which remain notoriously complicated to provision and to manage on-prem. But as Peter was saying, to get more value out of this investment, we're now bumping up against the complexity of all the data science pipelines, whereas before we were bumping up against the complexity of administering these Hadoop clusters. So now we've got the data there. It's kind of hard to manage, but now we have to learn how to apply that using much more sophisticated techniques. It's interesting that you say the denominator shrinks, because on the cost of operation, as you move to the cloud, there are many more options and they're managed much better. So that cost comes down as people have more cloud options. The last point I would make is I do think packaged applications, whether they're from the big guys or a lot of vertically focused ones, or even semi-custom apps from folks like IBM or Accenture, those are going to be what drives mainstream deployment to reach hundreds of millions of users of this technology.
So I would just observe that, in my view, this whole Big Data trend wasn't a failure. We observed early on that the folks that were going to make the most money in Big Data were the practitioners, not the vendors. So we made a correct call there. In many respects, I look at this as, you know, when you paint, you've got to prep. I feel like these last eight years have been the preparatory phases, scraping and getting things ready, getting your house in order. And now, Peter, we're setting up for the digital business era, and the digital business era is about data, it's about applying machine intelligence. It's certainly taking advantage of cloud economics.
Do you buy that premise that we are now in a position, many companies anyway, or some companies, to actually effect digital transformation?
Well, the whole concept of digital transformation starts with the idea of data, and our observation here at theCUBE and Wikibon ultimately is that the difference between a business and a digital business is that a digital business uses data as an asset, and that has enormous implications on operations, how you engage customers, how you institutionalize work, what your relationships are with technology companies, et cetera. So that core concept of using your data differently and creating value is absolutely essential to this notion of Big Data and all the various things we're talking about, because Big Data is the process by which you create business value out of data. That's ultimately what we're trying to do with all this stuff. So to George's point, if we think about where we've been and where we're going, in many respects, fundamentally, we're just kind of following an almost normal adoption process. So if we go back 10 years to Yahoo, Google and some of the tech companies that initiated a lot of this motion, they had very specific types of problems that they wanted to solve. They had enormous volumes of data that they wanted to use to solve that problem, and they created technology to do so. Where we kind of got hung up is in the diffusion out of that certainly very challenging and very rich set of problems that Facebook and Yahoo and everybody else had. As they tried to diffuse that technology into other industries, we got caught up in the bumps, and we had more failures, and we didn't get the returns we wanted. So now what's happening is a lot of that domain expertise is coming back in.
We're starting to say, now we know how to solve the problem, we have an approach to how we're going to solve the problem, and the technology is being snapped into place to solve business problems, as opposed to technology being snapped into place to solve the technology problems of Big Data.
So we're here talking to Peter Burris and George Gilbert, two analysts at Wikibon. We're here at the Forager in San Jose. It's at 420 First Street, and theCUBE has a week long, half a week long anyway, set of activities going on. We've got an event going on this evening. I think it starts at six o'clock, so come by. We've got a breakfast briefing tomorrow where the Wikibon analysts are laying out the recent market studies. We just dropped two market studies on Wikibon. One is the overall market size and the other goes into market shares. I want to touch on those briefly. We're looking at about a $35 billion market growing to $100 billion over the next 10 years. As we observed early on, open source software had an effect. Where most businesses, most industries start off, software's a big component of it. Because of open source, the software revenues were muted in this business, but they're really starting to pick up now. It was a heavily services-oriented business and still is, about 40%, right? And then software comprises about 30% and hardware about 29%. You guys see that changing over time, correct?
Well, yeah, and in many respects, again, this is following almost a natural evolution that's made more interesting by the fact that these are very complex problems and new types of business problems, but certainly George has done a lot of research on this. Ultimately, what every company that operates in this space should be thinking about is how is the industry in aggregate going to get to 100 to 200 million users in the next decade?
Where a user is not someone who's, you know, playing with the data or looking at Tableau, but a user is fundamentally someone who's using an application or making a decision that's informed by data that's made possible by these tools. And that's not something that's going to happen at a very, very low hardware, cluster, database level. It's going to happen elsewhere. And one of the big trends we see is that there's going to be a lot of new packaged applications entering into the marketplace that consume these tools and make them viable for businesses to actually use.
Well, George, in 2012, Mike Olson declared the year of the Big Data applications. That never happened. Really, the action in software has been around database and software infrastructure. But what do you see in terms of the evolution of that software business?
Well, continuing on the theme of the bifurcation. So it was interesting to hear Peter talk about how the infrastructure that the big tech companies and internet companies developed as a byproduct of building their own services, that stuff didn't work for mainstream enterprise. It didn't even work for most of the sophisticated enterprises. So on the infrastructure side, what we're seeing now is a convergence where we're putting those pieces together in a way where they fit together easily enough so admins, mere mortal admins, and developers can work with them.
With the cloud being the ultimate convergence.
Yes, yes. But then, I would also say, it's the applications that will really take it mainstream, because even when we fit the platform stuff together, it's not going to be enough to go mainstream.
Okay, and we've got to wrap, but I just wanted to touch on some of the market share stuff that you guys just produced, and we'll be presenting this data tomorrow morning, Thursday morning, here at the Forager. It's at 420 First Street in San Jose.
Not surprisingly, IBM came out as the leader because of the large services component. They got about 8% of that.
Well, they play in all parts.
They play in all, but services, they dominate. So IBM, then Splunk, actually, who never used the term Big Data during their ascendancy. They didn't tie into that meme, but they are a Big Data company.
And an example of a packaged application company leading a charge.
Absolutely. Both. In-apps. In-apps, right. Dell, Oracle, and now if you look at this, that's the overall. If you look at the software top 10, Splunk comes out on top, then Oracle, then IBM, and we'll be getting into that tomorrow morning at the breakfast. Peter Burris, George Gilbert, thanks so much for setting this up. Thanks for watching. We've got wall-to-wall coverage here. This is day one of Big Data SV from San Jose. You're watching theCUBE. We'll be right back.