 Live from the San Jose Convention Center, extracting the signal from the noise, it's theCUBE, covering Hadoop Summit 2015, brought to you by headline sponsor Hortonworks, and by EMC, Pivotal, IBM, Pentaho, Teradata, Syncsort, and by Attunity. And now your hosts, John Furrier and George Gilbert. Okay, welcome back everyone, live here in Silicon Valley. Day three of Hadoop Summit 2015. This is theCUBE. I'm John Furrier with George Gilbert, our big data analyst at Wikibon.com. Our next guest is Eric Schmidt, with Google, not the real Eric Schmidt. Eric Schmidt, we had on at Big Data SV, just a few months back, product manager at Google. Welcome back to theCUBE. Good to see you. You get that all the time, right? Not the real one, the fake one. My mom says that I am a real one, but not him, so yeah. You're the real Eric Schmidt, as far as we're concerned; the other Eric Schmidt hasn't been on theCUBE yet. We'd tear him apart. No, I've known Eric Schmidt for a while, he's a good guy. So, seriously, let's get back down to business here. What's going on? We only have about 10 minutes, quick before George. I know George wants to drill down, but what's going on at the show this year? What's happening this year? What are the key themes from your perspective? What's the vibe? What's the hallway conversation? What are the sessions like? What is the conversation happening here? I kind of see two things bubbling up. On our side we're seeing ourselves, as well as other cloud providers, making good strides towards delivering on fully managed services for different types of disciplines for data processing, whether it's queuing technology, whether it's high throughput databases, whether it's scalable data processing systems. And that was the focus of the talk that I gave today: hey, we made really, really good progress on Dataflow, but at the same time, BigQuery continues to advance, Pub/Sub, Bigtable, et cetera. 
And then, the rest of the non-managed space and the Hadoop ecosystem just keeps exploding. Like I walked onto the floor and I'm like, wow. So what breakthroughs are happening? I'm not falling out of my chair on the analytics side, so I want to ask you a question on this. Is the tide going out, are things a little bit sideways on the analytics side as the wave of cloud comes over the top? Because we had a couple of people on theCUBE say, including myself, that DevOps has been waiting for this killer app, so cloud seems to be the powering engine, meaning analytics is doing its thing, is chugging along, but yet now the cloud and the DevOps ethos for app developers seems to be coming on strong as an undercurrent in this conversation. Yeah, and there are really two areas of innovation. One, it's driving down time to answer. Because yeah, analytics continues to move forward, but you're kind of gated by how fast and at what quality you can process data. So your time to answer isn't driving down, and your analytics are getting more and more pressured, especially as business continues to grow. And there are more application developers doing apps in the cloud who don't want to write more software. They really want the DevOps to kind of move along faster so they don't have to write more software. Well, that plus the amount of talent that companies are hiring. Like five years ago, the concept of a data scientist inside of a company was just like that. That was a very unique boutique idea. Now you have multiple data scientists and they're just data analysts, but they have a data processing background, a really, really solid mathematical background. So maturity of the market's happening. Yes, so that's maturing and then the underlying infrastructure. And I have this Venn diagram, basically data processing and the cloud. 
You know, these two worlds have kind of sat like this and they're starting to coalesce where they need to. And it really is coming down to removing as much DevOps out of the equation as you can so you can drive down cost, and bringing in as much innovation so that you can drive down time to answer. That's a great description. So, okay, let's key in on the managed services, because one of the themes that we keep coming back to, and you said you walk out on the floor and see a lot of innovation, but the other side of the coin is that there's a fair amount of complexity. Apache's a great governance model for individual projects, but with Hadoop, some people say they're choking on the complexity of administering it and developing with all the different tools. What is Google doing to substitute or to present an alternative for that? Yeah, our strategy is twofold. So a core investment around the Hadoop ecosystem is to ensure that Hadoop runs as best as it possibly can on top of Google Cloud infrastructure and where we can extend the developer operations model to help reduce cost. And so we have a whole team that's focused on, in essence, managed Hadoop workloads. That's a sister team of mine. The other investment that we're making is developing and delivering on a fully managed data processing platform that has Hadoop-like qualities. So people think like, well, I need to do MapReduce. I need to have some type of stream-based engine. I need to have a programming model for it. These concepts are being codified into a fully managed environment that enables a developer to focus solely on development. You're no longer focused on the complexity of like, okay, I need to deploy the Hadoop cluster. I need to manage it. I need to understand. Ops are gone. Yeah, ops are nearly gone. You still have the concept of understanding your data and its input and the volatility of it, et cetera, so that you can make changes as a developer, but the majority of that DevOps is gone. 
So tell us, for the customers who are struggling at the boundary of the complexity of the traditional Hadoop software, what are the component services? How do they fit together? The component service that they would see is really one: an endpoint where they would submit a job and say, I would like you to execute the computation that is inside of the graph that I developed with the SDK that we provide them, and we handle the rest of it. So we handle the deployments, the lifetime management of those resources, the layering of their code onto all of the workers inside of that cluster, the scheduling of the work, the monitoring of it, the self-healing of it, the auto-scaling of it, redistributing hot keys. In a case where you may have a cluster and you have some key that's stuck and the whole job's slowing down, we manage that for them. You mean like even if they do the data modeling wrong and if they key their data so that it all backs up on... Well, that's one of the nice benefits of the system: you don't have to key your data upfront in this fully managed environment. So what's the name of this service? The name of the service is Cloud Dataflow. That specific feature is what we call dynamic work optimization. But this whole thing, Cloud Dataflow, you can go in and say, here are all the transformations I want to do, I want to build this pipeline, you go take care of it. Yeah. Eric, I've got to ask you, we've got a couple minutes left. I want to just get your take on what's going on with Google Cloud, Cloud Platform, with thoughts on things like containers, Kubernetes, some of the coolness going on there. How's it all coming together? What's your outlook? I mean, certainly at the OpenStack Summit, we saw a lot of activity in the open source community around large scale, and you guys are already operating at that level. Yeah, and to be clear, that's not my sweet spot. There's a whole other team that's focused on containers. It's okay, you can speak to them. 
But I can tell you this, the work that we're doing around our managed service is highly containerized. So we've also benefited from adopting a container mindset and the concept of managing virtual clusters. And it's definitely something that we're getting strong demand for from customers, whether they're bringing, like say, hey, I have an existing set of pipelines, an existing set of transforms, and oh, by the way, I have a whole set of containers that I want to integrate into my overall data processing model. So we're fully on board with that type of architectural approach. What's next for you right now? What are you working on? We're going to go generally available as soon as possible. So the last time I was here, I think we were in alpha, we went to Strata and Brussels, we went beta, and we are working as hard as we can to get generally available. All right, Eric Schmidt with Google here inside theCUBE, sharing what's going on with Google here at Hadoop Summit 2015. We'll be right back after this short break. Thanks.