theCUBE at Hadoop Summit 2014 is brought to you by anchor sponsor Hortonworks: We do Hadoop. And headline sponsor WANdisco: We make Hadoop invincible.
Welcome back. You're watching theCUBE. We're live here at Hadoop Summit. I'm Jeff Kelly with Wikibon. We're wrapping up day two of three days of coverage here at the conference. I'm joined by my colleague at Wikibon, Stu Miniman. Welcome to theCUBE, Stu.
Hey, Jeff. Glad I could stop by. Of course, there's always lots of activity going on in the Valley, so it's exciting to stop by Hadoop Summit, my first time.
Well, yeah. So you've been here for all of about 30, 45 minutes. What are your impressions of the show?
Well, at Wikibon we always like real-time analysis, Jeff. So I was scouring the show floor, really trying to understand updates on virtualization, like what VMware's doing with their Big Data Extensions, with 2.0 coming out. I talked to a bunch of the infrastructure players, looking at cloud, and let's talk about that, because it's having a huge impact on big data infrastructure.
It's a good point. We haven't talked too much about cloud today. We were conducting a survey at Wikibon, as you know, around big data deployment models, among other things. And one of the interesting things that we got back was that 56% of practitioners who've deployed big data in some way, some form or another, are using the public cloud. 26% are planning to in the next six months. So that's around 80% of practitioners. Talk a little bit about, from your perspective, where big data fits in the larger cloud discussion. Is it a good fit?
So, absolutely, we think the intersection of cloud and big data has huge potential. As Wikibon Chief Analyst Dave Vellante says, big data actually gives the cloud something to do. So for certain environments, especially if I need to have a massive deployment rapidly, the cloud is great for that.
The challenge, of course, is always that balance of what am I going to rent versus what am I going to buy. For small, temporary deployments, clouds make great sense. But from most people I've talked to, cloud is not the primary deployment model today. People are buying bare-metal machines, or they're looking at infrastructure, and most of the environments are relatively small, kind of 30 to 50 nodes, not the thousands of nodes that guys like Yahoo and some of the others here would be doing.
Well, I think that cloud has a big role to play in big data as this market matures. There are some factors limiting the growth of something like Hadoop in the cloud, around privacy concerns, regulations, things like that. But when you think about some of the benefits of cloud and how they could be applied to big data, to me, it's the elasticity. That is one challenge when you're going on-premises with a traditional deployment for Hadoop. You still have to buy and provision the hardware, and it's difficult to scale up quickly if that's your model. If you're using a cloud environment, you can scale up, scale down.
Yeah, so actually, to touch on that for a second: in the infrastructure space, we don't want to have to worry about buying the machines and doing it. So if I want to move fast, if we put it in virtual environments, that's going to speed our time to deployment. Virtualization has spawned lots of applications, and VMware is making progress with what they call Big Data Extensions, originally Project Serengeti. People look at two things when they're looking at virtualizing. Number one is, I'm worried about performance.
And that's been something that VMware has gone application by application on, through mission-critical apps, and they're fighting with big data to prove that, even virtualized, they can provide that performance. And secondly, architecturally, because when most people think VMware, they think kind of traditional infrastructure. Not to besmirch the typical server and storage guys, but my price per node was more expensive than what Jeff Hammerbacher and the Hadoop guys wanted to create. However, there are players creating a new structure, many of them leveraging virtualization, that give great price points and leverage things like flash. On the Wikibon site, we've talked about an emerging architecture called Server SAN that could potentially fit this need in the future. One of those Server SAN companies, Nutanix, happens to be here, and they say that when you've got 30 to 50 nodes, that fits great into one of their environments. But it's not the same solution, and the channel that sells that type of solution is not the same one that knows Hadoop. So skill set is a major impediment to getting into this environment. Because, Jeff, as you know, the infrastructure guy and the guy that did BI and is now looking at Hadoop, those are totally different worlds. And one of the things I love is when you and I get to talk to these guys and help them. In order to get the most out of both technologies, there has to be better communication, and we have to break down these silos, both the data silos and the way people operate internally.
It'll be really interesting. Like I said, I haven't heard a lot of cloud discussion at this show, and that's somewhat understandable. We're focused on Hadoop, and the vast majority of those deployments are on-premises, in traditional scale-out, direct-attached storage environments.
It's understandable, but it will be interesting. VMware is here, and some others, and they have a plan moving forward to spur adoption in the big data space. You talk about performance: there are concerns about performance in traditional environments. You add the virtualization aspect and that just makes it more complex, but the value proposition is there if they can get the performance that people are going to require. We're talking about going real-time here. We're hearing from customers that are on the ground, that have built applications that are processing data and getting real-time insights, so performance is increasingly important.
Jeff, we've seen that the cloud guys are really looking at performance even more. When we were at AWS re:Invent last year, they were really increasing the use of flash, giving instances that can deliver much better performance. I stopped by and talked to Rackspace and Microsoft, and they've got a big push to try to get people to leverage their public cloud environments for that. Definitely lots of upward potential for cloud and big data.
It was a great day today. We talked about a lot of interesting use cases. I think the big topics, the big takeaways today, were all around enterprise-grade capabilities. Security was top of mind in the last couple of days. Certainly high availability, things like performance, backup and recovery. I think the mood here, as I've said today, is that there's the excitement you would expect from an ecosystem like this, but there's a level of seriousness that maybe wasn't here in the past. I think that's because of some of the topics we're talking about.
Another great day on theCUBE. Thank you for watching. We're going to wrap up day two here. We'll be back tomorrow for another day of coverage. Make sure you tune in. We'll be starting around 10:30 a.m. Pacific time. Catch us then and we'll have another full day of coverage. Thank you.