Live from Dublin, Ireland, it's theCUBE, covering Hadoop Summit Europe 2016, brought to you by Hortonworks. Now your hosts, John Furrier and Dave Vellante. Hello, everyone, and welcome to theCUBE, a special presentation for our European adventure, our European vacation, our European Cube. This is the kickoff of the spring tour for SiliconANGLE Media's theCUBE, where we go out to the events and extract the signal from the noise. This is a European edition for Hadoop Summit 2016 in Dublin, Ireland. And I'm excited to be here with my co-host and co-CEO of SiliconANGLE Media, Dave Vellante of wikibon.com, chief analyst, and again, co-CEO of SiliconANGLE Media. Again, this is our European tour. If you're watching and you're running events in Europe, contact us, we're doing a full expansion into Europe. You can ping Greg Stewart on our Cube team, or Jeff Frick, if you're interested in more events. I see a lot all over the world. Dave, big data, Hadoop Summit. We've covered every single Hadoop event since the inception of Hadoop World. And since Hortonworks spun out of Yahoo, we've been covering the space from day one, present at creation for this big data revolution, starting with the Hadoop ecosystem, expanding out into the entire world now. Big data isn't just about Hadoop anymore, although obviously with Hadoop Summit and Hortonworks, it's the core open source community that's driving all the innovation. Your thoughts here in Europe: smaller show than the USA, but again, the requirements are different. Data governance front and center, security front and center, carrier grade, enterprise grade, data connect platform. These are the conversations. This is the presentation. Your thoughts on the keynote this morning. Well, we're here in Dublin. I learned this morning that Dublin and San Jose are sister cities, but Herb Cunitz told us that's not why we're here. We're here for the Guinness. I'm all for that. And you know, good crowd.
I mean, a lot of doers here, a lot of people who are actually using Hadoop, a lot of Hadoop practitioners. Probably about a third of the audience has at least some kind of Hadoop going on in production. Only about 10% of the people here were just here to learn because they really weren't doing anything with Hadoop. So we're starting to see that maturity. We talked about this at our research meeting the other day at SiliconANGLE Media. Hadoop's sort of reaching the adolescent phase. The Hadoop market's now $23 billion. It's growing at a 14% compound annual growth rate through 2026, when it hits almost $100 billion. So it's a nice segment that's growing. The really interesting thing there, John, is most of the market is still not software. Marc Andreessen says software is eating the world. It's sort of eating the big data world, but it's all free. It's free. It's open source. And so professional services is still the dominant take. We expect by 2020 that's going to change as applications start to come to the fore. Packaged applications have been few and far between, frankly, in this ecosystem, John, and we're waiting for more of that. So that leads us to the conversation about developers, kind of your wheelhouse, and what you're seeing in the developer world. Well, obviously, Dave, we saw it at Big Data Week, where we had our Big Data Silicon Valley event. Big Data Week is where Big Data Silicon Valley and Big Data New York run in conjunction with Strata; we do our event alongside O'Reilly's event. We saw that in the States, where the cloud was a big driver in all this. And if you're talking about the cloud and developers, you cannot not talk about Amazon Web Services. Amazon Web Services is absolutely crushing it. Now forecast to be about $10 billion, according to Jeff Bezos. And we predicted that on theCUBE years ago. You and I had that conversation and we were kind of laughed at, but look, we were right again. theCUBE is always right.
But if you're a developer, you've got to look at Amazon. What is the cloud presence for you? And that to me is where everything's coming together. The confluence of what is in the cloud is going to thrive really rapidly. The innovation and acceleration of all the Big Data, native applications on Apache, all the way through the business intelligence, homegrown, purpose-built appliances, and software for the big companies. The cloud is where the action is. And again, you got Redshift, you got Kinesis, Amazon's moving in with services in these areas. This has been a success for Amazon Web Services. Now in the Big Data world, who can keep up? And again, I've always said this on theCUBE. It's like a NASCAR race. If you watch NASCAR, you know, everyone stays in the middle of the pack. Someone gets out front and they drop in and draft. And then someone slingshots around at the end. Of course, people hit the wall and crash and burn. But the bottom line is, this is what's going on now. You're going to see a whole other level of racing going on, and this acceleration of the cloud will drive the Big Data applications. And I see the developers really, really getting into their wheelhouse with integration and certainly the power of the cloud. This will determine the winners and losers of this next generation. Mile marker, if you will, for the Big Data world. And again, Hadoop is a critical ecosystem. Not the only game in town from a feature standpoint, but the ecosystem of developers will intersect with commercial cloud and commercial vendors. So Rob Bearden this morning, in his keynote, was really emphasizing business value, business impact, whereas for years, of course, we've geeked out on YARN and more recently Spark and obviously all the animals in the zoo. So it's interesting to note he was giving some examples of customers, John, that wanted a 360 degree view of their business. He gave an example of a retail customer, another example of an insurance company.
I tweeted out that those same companies wanted a 360 degree view of their business back in Y2K and they thought their EDW was going to help achieve that. So I want to better understand why people in this community feel as though Hadoop and Big Data will live up to that promise of a 360 degree view of the customer and find out if people are actually achieving that. What are your thoughts? Well, first of all, the 360 degree view of the customer is complete bullshit in the old world. Okay, I don't think that ever happened. Data warehousing and business intelligence just never was up to the challenge. Not just because it was a technology architecture issue. They didn't have a lot of data. They had limited data, and they would use that data and fence it in, run reports against it, but they missed so much data. Now we're living in an always connected world. You heard Mark Zuckerberg yesterday at Facebook talking about connecting the world. We are living in an era today where everything's measurable. So the notion of getting new data isn't mutually exclusive with the data warehouse and business intelligence old guard like Teradata and Informatica. We heard from them at the last event we were at. They will have a nice little coexistence, but it's all this new data. The connected data from IoT, the big data that's coming out of the Hadoop ecosystem, this is the focus. And I think Hortonworks CEO Rob Bearden nailed it when he said connecting the data is a core value proposition at the end of the day. You got the data, you got to connect, you got to run with existing systems and you got to produce business value. Again, back to the cloud. And when you bring in the cloud or on-prem, enterprise security and governance is huge. Obviously in Europe, that's a big conversation. Obviously different countries. All the different data from IoT now completely complicates the mix. I mean, imagine you're in your car driving from France to Germany with telematics and big data.
Who manages the data? I mean, it's a nightmare of a scenario on the deployment side. But if you crack the code on that: huge value, absolutely huge value. Well, you've always been down on the term data lake. I mean, essentially a data lake is this big static bunch of data that's sitting in a lake that becomes kind of swampy, slimy. So now we heard a lot today, a lot of talk about data in motion, bringing life to that data. Like an ocean. Kind of like an ocean. Kind of like a data ocean, if you want. You know I love data ocean. Data lake, I hate the term. I hate the term data lake, I think it sucks. Mainly because it doesn't represent what's going on. It represents a pool of data. It doesn't give the dimension of, like, currents, temperature, different dimensions of data with IoT, fast data, slow data, tides, rip tides, all that stuff, which is a better metaphor in my opinion. Well, but it actually in some respects does underscore what people are doing with data today. They are just kind of throwing it in a bucket or a lake and sort of just trying to figure it out, as opposed to what you're prescribing, which is this much more dynamic environment. Well, the ecosystem drives what's in an ocean. Take Monterey Peninsula, or Cape Cod, your area, where we grew up. You have different ecosystems in those areas that are feeding off of the different environments. That's very much like the data market. You got Alchemy, you got all kinds of interesting big data dynamics. So the ecosystem of the application is subject to the environment that it's in. And so I think with the data, you can't just say data lake and pile it up. Yeah, certainly you got to have a data lake, pile the data somewhere and apply to it. But in a business value context, it's not driving that business value. So the other thing you mentioned was IoT. And we heard a little bit about IoT this morning, and we've been studying that at SiliconANGLE on the Wikibon research side.
Edge computing, there's a lot there. We talked about this a couple of segments ago at HPE Discover in London. We were talking about some of the basic things that have to happen for IoT to take place. You pointed out, well, if you want to instrument the windmill, you better have connectivity there or it's really not going to be that useful. The other piece we're seeing is edge computing. You've seen Amazon make some announcements there. Microsoft was up talking about edge computing today. Certainly Rob Bearden was talking about bridging all these data silos. You're not going to move the data. It's going to be done with computing at the edge. So a lot of infrastructure has to be put in place for companies to realize business value from that notion of the Internet of Things. I mean, you mentioned the silos. The silos in the enterprise really limit your view of the entire data set. That's really been a key challenge for everyone. We've heard that in the keynote. We've been hearing it on theCUBE. That's why integration is a very, very big part of the new normal and the enterprise success formula for startups and existing companies. You got to integrate in. So whether you're running a big MPP database or a Vertica or a big engineered system, purpose-built hardware and software, from commercial software to open source, they got to work together. The glue layer in all this is really, really important. The second thing, Dave, I want to highlight is that, again, to reiterate what we say on theCUBE all the time, the importance of community really is huge. The ecosystems that are developing around the software developers, and now with cloud accelerating the advancement of applications from a deployment standpoint, it really is about which community you're playing in. And with a great community, you'll see great, great advances in innovation. No community, there'll be no traction, in my opinion. Community has to be factored into all aspects of the developer ecosystem.
You know, the other thing I want to bring up is some other data points. We talk about disruption all the time, but for years I've talked about how the rich get richer. When you look at the big data space, we just released Wikibon's latest forecast and market shares on big data. I think it's our sixth year in a row doing that. If you look at the list of companies that are dominating that business, if you look at the top 10, there's only two companies that one would consider big data pure plays that cracked that top 10. Splunk and Palantir, you know, the very secretive company. That's it, those two. The rest is IBM, SAP, Oracle, HPE, Dell, Teradata, Microsoft. I mean, the big whales. So it's going to be interesting to see if these large legacy companies that failed to deliver on the promise of a 360 degree view with the enterprise data warehouse can live up to that promise. Or are we going to see new entrants? I mean, this industry's been an oligopoly now for a while. Amazon has started challenging that. But it's going to be really interesting to see if these players can step up and deliver on that promise. Yeah, and that's going to be ultimately data driven. That's going to come down to how open you are with the data, how you're handling your application development, and we'll see what happens. So, super exciting. Again, at the end of the day, it's actionable intelligence, breaking down those silos, having real impact with data for the applications, and whoever gets that done first. We can talk about Splunk. They are one early example. They put the data to work immediately, and that's going to continue to be the benchmark as we look at the scoreboard, Dave, of who's out there. So, we're going to be breaking it down here live. Here in Dublin, we're in Europe for theCUBE's special presentation of Hadoop Summit 2016. Follow the hashtag #HS16Dublin. Go to crowdchat.net/HS16Dublin. Join the conversation, be on the record.
We'll be following all the trending stories, following all the influencers. We'll bring that commentary. You have any questions? Just go to CrowdChat, put the questions in there, and we'll bring them on theCUBE. Thanks for watching. We'll be right back with more here live in Dublin after this short break.