And that really is what it's all about. Live from the Fairmont Hotel in San Jose, California, it's theCUBE at Big Data SV 2015. Hey, welcome back to day three of Big Data SV. Live here on theCUBE, we're in San Jose. I'm Jeff Kelly with Wikibon. I'm joined by my co-host Jeff Frick. Welcome, Jeff. Thanks, Jeff. Another big day here, third day going strong in San Jose. We've had an exciting first two days. I think we've probably had 20 plus interviews, 20 plus guests, ranging from some of the vendor executives we've had on to practitioners from Halliburton and Simply Hired. We had our great panel discussion with VCs on Wednesday night, where I also presented some Wikibon research, and of course our party. So it's been a great couple of days and we're going strong here in day three, Jeff. What are your impressions over the first couple of days? It's amazing how far this market has moved. I think one of the great lines that Bill Schmarzo from EMC had yesterday, right, is that no one's asking about Hadoop anymore, whether it's a technology that we should incorporate. Yes, it works, people are using it, and it really feels like a rapid adoption into the production side. And we've had a number of people talking about real, simple, specific use cases in hospitals and grocery stores and oil and gas. So it's really interesting to see the evolution of this show and the evolution of the conversation, from the technology conversation to the business case and the use case application. I was at one of the events last night. There were a number of events going on, obviously it's a lot of scene, and I think the next great conversation is going to be about the ethics. We were at a show the other day and someone brought up: if you run an algorithm on a data set, are you responsible ethically for the outcome of that data? And of course the example everyone uses over and over and over again is the Target example.
Just because you have that data, should you use it? Are you responsible for the output of the algorithm if you're the one that actually executed it? But I think what gets really interesting with Hadoop is that the whole value is that you don't necessarily know what the algorithm's going to be today or tomorrow, but you can grab all the data and hold it. So what kind of responsibility will that entail? So I think that's interesting. The Internet of Things was the answer to the question last night, really, in terms of self-driving cars, and we're hearing about the Apple car, and the Google cars are driving all over the place. Then that really ramps up the security and the access question if somebody takes over your car while you're rolling down the road. So I think that's great. And I think your conversation on the money, I think that was a great conversation. Uber announced another billion dollars in funding yesterday. Just massive amounts of capital coming into this space. What are they using it for? And really, how are they going to put that money to work? Yeah, I agree with a lot of what you just said. I think some of the themes I'm seeing here, definitely, as you said: the question is no longer what is Hadoop, how does it work kind of thing. It's moving on to how do we use it and how do we harden it for the enterprise? Addressing issues like governance, making it easier to build applications on top of it. That's what the whole announcement around the Open Data Platform is about, which was the big buzz this week, probably the biggest news announcement. The Open Data Platform is this new industry consortium that's focused on basically hardening the Hadoop core and allowing enterprise vendors to build applications on top of that standard core so you can move applications from one vendor to the next.
And really the goal, of course, is to spur adoption in the enterprise, to soothe some of the enterprise's fears that they could potentially get locked into one vendor or another, or that there's going to be a lot of heavy lifting and manual work they're going to have to do to build applications. It's going to open up, hopefully, more application development from others in the vendor community, whether it's the big guys, the startups or the enterprises themselves. So that's one of the themes we've been following, and I think it was a pretty important announcement. In fact, we'll have Pivotal coming on this morning. Sunny Madra is going to come on, he's their director of data products. We'll talk about the ODP and we'll talk about some other announcements they've made. Of course, on the other side of that equation, the big industry player that was not part of that consortium was Cloudera. So we're going to have on Tom Reilly, who's the CEO of Cloudera, today. We'll get his point of view on it. Of course, on the ODP side, it was really led by Pivotal as well as Hortonworks, IBM and some others. So we'll work through that today. We'll try to suss out some of that controversy, if you will. Some of the other things we talked about: you mentioned data science applications. We had two great data scientists on yesterday. And they echoed what you said, that they're starting to see it's less about the container, the storage, the processing, and more about how do we use all this data we can now process in Hadoop, how do we actually use that to build some applications that drive some business value, and beyond some of the more commercial uses. I mean, Bill Schmarzo mentioned some things around hospital usage, and path to surgery was one of the examples he gave, using big data analytics to hopefully intervene quicker if you can predict that someone's likely to have a negative outcome. What can we do to stop that? Analytics underpins that.
That's obviously a big social value as well as monetary value to a hospital. But then, of course, we had on Chris Poulin from Patterns and Predictions to wrap up the show yesterday. He led the Durkheim Project, really using social media data to help predict the likelihood of things like suicide risk and other mental health challenges among returning veterans. So it's not just about putting ads in front of people. There's some societal value you can really drive with some of these big data technologies. So great day yesterday. We're looking forward to another good day today. Should be fun. Yeah, it was great. And Bill Schmarzo's interview, if you haven't watched it, be sure to watch it. It's up on the website, siliconangle.tv. But I think it's really important because people too often get wrapped up in the use case where you walk by the Starbucks and it gives you an offer for your favorite latte. And the thing I thought was also interesting about Bill is he's talking about doing prescriptive action now. I mean, they're trying to get to prescriptive stuff now. This is not futures. And the use case he gives about potential infections that happen in hospitals, it's about saving lives. And I think the other thing that he brought up, which is always a fun topic of debate, is real time, right? What is real time? Or how do you define real time? And still I think my favorite description of real time is "in time to do something about it." And that's very different based on the use case. It's different based on the data that you need to make that use case. And again, I think Bill reinforced that, and Bill is out in the field all the time. Every week, like Tuesday to Thursday, he travels like an eccentric guy. Not quite as much as Ray Wang, hi Ray, but he travels quite a bit. I also just want to give a shout out to the sponsors. We have a great community here. We had a terrific party the other night.
We've had a lot of great guests on over the course of the couple of days and we'll have more on today. But this would not be possible without sponsorship. We operate on a sponsorship model. We've got a bunch of guys in here. If you could see behind the camera, we've got gear, we've got crew. We travel, we want to go out to the events and get you the signal. So we couldn't do that without our sponsors. So EMC, thank you very much, our headline sponsor for this event. We've got a number of other great sponsors including IBM, Pentaho, H2O, InfoObjects, Platfora, Syncsort, a great longtime sponsor, Tresata, with Abhi Mehta, always a fantastic guest. Informatica is on. WANdisco, Teradata, and new sponsors BMC and Pivotal, who are new to our big data shows, which is terrific; happy to have them on. And of course, Hortonworks is a longtime sponsor. I mentioned Platfora and Pentaho. It's a great list. Again, we're really thankful, and thank you to our sponsors for enabling us to get the gear, to get the crew, to get the guys in to come out to these shows and really give you some insight. Let you hear directly, first person, from the tech athletes, from the people that create the technology, the people that are building the companies to deliver the technology, and most importantly, really, the practitioners that are leveraging this technology to do good. And as Jeff said in New York, really the theme of that big data conversation is that the practitioners are going to be the ones that get the tremendous value. And we keep hearing over and over about 10x, 100x improvements in processes. It's a great thing. Yeah, absolutely. You mentioned earlier, where's all this money going? Because there's been quite a bit of money raised on the vendor side in this market. We talked a little bit about that in my presentation on Wednesday night. And I think the point I was trying to make there is that this is a difficult market.
The open source market is interesting because there's a lot of innovation that happens very quickly. But actually packaging that into consumable products for the enterprise can take some time. It can actually be harder when you've got so much innovation happening so quickly. So that's one reason I think you've seen a lot of money raised. It's going to take some time to build the business. And of course, as a startup, it's difficult to build out those channels into the enterprise. I mean, that takes time. Of course the other thing is marketing. The big whales are into this. You mentioned some of our sponsors are some of those big companies, EMC and IBM and others. So you see some of the startups raising a lot of money because they've got to battle those big whales in terms of marketing budgets, and those marketing machines are cranking away. So the point, I guess, is that there's a lot of activity happening in this market. You've got big companies, you've got startups. We think we're going to see some consolidation, some acquisitions happening. We've already seen that kicking off. So yeah, it's a fun market to cover as an analyst and here on theCUBE. And hopefully it's fun for you guys at home to watch. Yeah, and I think we have another huge wave coming. I mean, it's just starting to get here, and that's the Internet of Things, which is a whole other set of data, right? When you think about the Internet of Things and you think about all the applications you could potentially build, it dovetails nicely with what we were saying about big data applications. Generally, the Internet of Things is an area where you're going to build applications around that data. And I think that's where you're going to see a lot of the value happen. So I would keep an eye on companies that are focused on that space. And again, it relates to what we saw with the Open Data Platform.
And that is, all those companies that are part of that organization want to say, hey, let's just standardize this Hadoop core so we can start building these applications. GE is a member of the organization. So they're obviously very heavy into the industrial internet. We've done some research with them over the last couple of years here at Wikibon. We know the potential value that can be delivered from the Internet of Things is just huge. And that's about building applications that actually do something with all that data coming off machines, from sensors, et cetera. So yeah, I think Internet of Things applications, that's where you're going to see the next wave, the next phase of big data. And that's what we're moving into, I think. So what's your take, Jeff? You've done a lot of work on the rate of growth, on how this whole market is developing based on some of your early projections. I mean, we do this big data show twice a year. It's one of these things where you don't want to wait a whole year. Things are moving so, so quickly. How is it tracking to your projections? Are you making some changes? Are there some big surprises on the upside or the downside? What's your take relative to what you thought was happening? And then, knowing what you know today, what's it going to look like going forward? I think, if there's anything that's surprising, it's that it's actually proving even more challenging than we thought to build some of these new businesses in the Hadoop ecosystem. As I mentioned, the innovation's there. There's no question. Hadoop and NoSQL are some of the foundational technologies in this big data movement, and you're seeing a ton of innovation in that space.
But because it's open source, because the innovation's happening so quickly and there's so much competition, it's interesting to see the different players and how they're competing against one another. And they're competing on price in some cases. I mean, open source changes the whole revenue model and how you're going to monetize this stuff. So that's interesting. And I think that might be slowing some of the growth. So we saw Hortonworks come out with their numbers when they went public back in December. Some people were a little surprised that their numbers were not higher in terms of revenue. But I think it just speaks to the challenges of building a business in an open source ecosystem and the idea that you need to be patient, as I mentioned earlier, which is why you're seeing so much money raised. Because it's going to take time. Because this is not the standard way enterprise software is typically sold or consumed. I mean, open source is quickly becoming the new reality, and it's not just a nice to have, it's increasingly a prerequisite for a lot of enterprises. But the way those enterprises buy software is still rooted in the old proprietary world. That's just how they operate. So that's changing over time. But that's one of the reasons it's taken a little bit longer to ramp up in terms of the size of that slice of the market. But in terms of the overall market, I mean, it's growing at a pretty hefty clip. We sized the market at the end of 2014, for calendar year 2014, at over $25 billion. And we see that growing exponentially over the next several years. And in fact, keep an eye out, we'll have our latest big data market forecast coming out shortly, in just a few weeks actually, on Wikibon. So check that out. But yeah, the market's just growing and growing.
And one of the things that's going to be interesting as we do some of our research at Wikibon is that we'll be introducing some different forecasts, because as the market matures, you're now seeing different segments emerge. And it changes the way you have to look at the market when applications start coming in. And when practitioners actually start driving more value, more revenue, and they're who we think are going to be the big winners, you know, we will adjust some of the forecasts to take that into account too. Right. The other really interesting play that is new to enterprise software, and it came up, I think maybe Frank brought it up on the VC panel, Frank from Ignition, is that now with open source, the actual consumers of the technology can run the innovation and can run the development and can roll out new things. And that's very new. And the other interesting dynamic we're seeing is that, traditionally, startups run the innovation, big whales wait till they get big enough and there's enough customer adoption, and then they absorb them to bring that to market, because we all know nobody ever got fired for buying IBM. You know, that's still a factor in enterprise software. But now you've got these partnerships. We've had a number of people on theCUBE this week really in a partnership mode. You've got the open source organization around a project, and then you have the consortia, which is the old way that things got done. So what's your take on how those three ways of working together are changing the dynamics in the enterprise software space? It's definitely a different approach. You know, as I said earlier, open source is now not really a nice to have. It's a prerequisite. It's the way things are developed in the enterprise software business these days. But you're right, in terms of how we see this evolving, will there be acquisitions?
It is challenging for startups: they're great at innovation, there's no question about that, but where they're not always so great is rolling that out to a larger enterprise audience. So that's one of the challenges they have. So if you walk the floor at Hadoop World, there are lots of vendors, there are dozens and dozens of startups out there. And some of those will have some nice exits. One or two, I think, might even be around in five or 10 years as a standalone, independent company. But I think you're increasingly going to see the big guys play a role, where you're going to see some of that acquisition because they have the experience in terms of rolling this out to the enterprise, and they have the channels and they have the sales and distribution, et cetera. You know, but that said, you mentioned innovation, and the wild card in all this is open source. So you know, in previous markets, you see the startups come in, they do a lot of innovation, and then, as you mentioned, the big guys come in when the market starts to mature a little bit and they see an opportunity, they pick off the winners, and then the rest kind of die off. But the challenge there is, a lot of the time the innovation stops at that point. The wild card in this market, which is why I'm really excited, is that you've got the open source component. So even as you do see some acquisitions, I don't think that's going to impact the innovation much at all, because you've got the open source component, which is not just, as you mentioned, vendors; practitioners are actually developing a lot of the technology themselves. So it's the big companies, the big web companies, of course, Facebook and Google, and increasingly there's another wave of those kinds of companies, like Netflix, for example, which is open sourcing some of the tools that they're building.
So the innovation, I think, is going to continue, because a lot of those companies don't buy from the big enterprise vendors; they want to build it themselves because they want to move fast and it's so core to their business that it's worth them investing in developing those tools themselves. But the interesting thing is, the way they monetize those tools is not the tools themselves, it's how they use them. So they don't mind open sourcing them. It's a real right of head. Our advantage is how we actually use this. We're not giving that away, but we'll open source the tools, and then as that cycle continues, startups will emerge to commercialize those tools, and the innovation, I think, will continue in this market, which is why it's exciting and why I think it's very different from some of the past markets we've seen. Yeah, and the other thing I think you could see, and Jagane from WANdisco brought it up, is that in the open source world, these individual contributors are rock stars in their own world. They're passionate about what they do, they're passionate about the software they develop, the problems they solve, and you get in these worlds and these guys are really excited about developing the software to do good, developing the software to solve problems. It's not really a commercial motivation for the individual contributors on a lot of these open source projects. So the way that the big companies have to figure out how to live within that world, support those guys' mission, give them the feedback that they need, and then still figure out how they are going to wrap their additional products, services and extensions around the open source, I think it's pretty fun to watch and we're pretty excited. So we've got a great day today. Again, we've got Sunny Madra from Pivotal coming up next. Mani Chhabra from Cloudwick is coming on. John Cardente from EMC, we'd like to have him on. He's been on a number of times.
Kiyoto Tamura from Treasure Data, which is an interesting company. They're doing some good things. Brett from WANdisco, Ryan Peterson from EMC, and again, Tom Reilly from Cloudera. Excited to have him on. I think it'll be his first time on theCUBE. We've had Mike Olson on lots of times, and Amr, but this will be the first time for Tom. Michelle Chambers from RapidMiner, and she's doing a lot of stuff about women in tech, which of course is a passion of ours, and women of big data. Excited to get her on for an update. And I think we'll get Clint Sharp on as well from Splunk. So we've got another full day, Jeff. I don't know, you ready for the full day? I'm ready, let's do it, three days in a row. All right, keep it going. All right, great. So that's Jeff Kelly. I'm Jeff Frick, we're here at Big Data SV, which is running concurrent with Strata Conf in San Jose, California. Thanks for watching. We'll be back with our next guest after this very short break.