Announcer: Live from the Julia Morgan Ballroom in San Francisco, extracting the signal from the noise, it's theCUBE, covering Structure 2015. Now, your host, George Gilbert.

George Gilbert: This is George Gilbert. We're at the iconic Structure 2015 conference in downtown San Francisco. We are privileged to have with us Derek Collison, founder and CEO of Apcera, with quite an illustrious past at VMware, Google, and TIBCO. Derek, good to have you.

Derek Collison: Yeah, thank you. I always love coming on here.

George: So let's start at the very, very big-picture level. The state of the industry right now, in terms of analytic platforms, is Hadoop 2.0, where we have HDFS and YARN. But beyond the likes of Goldman Sachs and Twitter and LinkedIn, a lot of other people are having trouble swallowing the complexity, both on the admin side and the development side. What might that begin to look like as we go to the next generation? And I don't mean to constrain that by saying it has to be 22 Apache projects. What are some of the alternatives we might see?

Derek: Yeah, it's a great question. I was fortunate, in the time I spent at Google, to see a lot of what is now manifesting itself in the enterprise ecosystem: the initial MapReduce, let's do logs, let's do everything else, then let's slap SQL in front of it, now let's put more of it in memory. You see all of these parallels in the industry, and eventually Google said, we're going to do a do-over. Where I think we are right now, there's this notion of whatever Hadoop 3.0 becomes: the evolution of Spark, the automation of setting these things up. One click, you spin this thing up, maybe using containers and orchestration, whatever that is, very simply, and you tear it down 10 minutes later.
I think what's even more interesting, though, is that while we still struggle with figuring out how to get the data into these systems, making sense of the data, and spinning up the backend systems, there's something coming up in our rear-view mirror very quickly that's going to pass us before anyone knows what's going on: machine learning is eventually going to get so good, so fast, and I mean literally in less than 24 months, that no one's going to be talking about big data in two years. They're going to be talking about force-feeding data, like a fire hose, straight into this thing that can reason about it, understand it, and then spit out patterns, predictions, correlation, causation that we could never have understood on our own. And because these technology cycles are compressing so hard, and our brains are built linearly, we can't see this. I do believe the notion of Hadoop 3.0 will simply be: we're not even going to bother with that. We're just going to plug our data into the Google Brain project, or the thing coming out of Amazon, or IBM's Watson, or whatever that is. We don't have to operate it or worry about it; we just pump data into it and we get amazing amounts of value out of it. And I truly believe that's going to happen faster than people think.

George: That's a rather profound insight, and profoundly unsettling.

Derek: Well, look, a lot of people who know me know I love Tesla cars; I've been a big Tesla fan ever since day one. And the amazing thing about where they are with Autopilot is all of the stories that came out saying, oh, it's doing the wrong thing, and all of the same people coming back two days later saying it's already learned not to do that.

George: They're doing that already.

Derek: They're already doing that. The car is learning not to dive off at the exit, to slow down around a curve.
Derek: And if you look at the amount of sophistication and investment that Google is making, they open-sourced TensorFlow because they want that to continue; it's all about the data. So if they run a service, the service is amazingly powerful, but it needs you, as a consumer or as an enterprise, to put data into it. And for me, if that's proven out, it sounds a lot easier than asking the IT department to set up a Hadoop 3.0 cluster, get it up and running, and figure out how to pump all the data into it. It's like, no, just point it over there.

George: Okay, so if that's either 3.0 or 4.0, let me relate an example I heard that's on that path. I talked to Tom Reilly from Cloudera the other day, and he said, we're not going to want to keep the secret sauce, in terms of machine learning, that we or our partners provide to a communications service provider. For instance, we'll give them the recipe and the ingredients to do fraud prediction, but the prevention part, what they build with us, they get to keep; they keep the recipe and the data. And I guess what you're saying is, it's almost like there's not a lot of hand-tooling to build that last part. Would that be a fair way of saying it?

Derek: I think the interesting thing is that there are still tooling opportunities, still capital-investment opportunities, for companies that have invested in big data. But let's be honest: with fraud detection, we can't have false positives. I get called all the time; blocked, blocked, blocked, and you have to call and okay something. That's just not a good customer experience. They already have the algorithms. What they need is a better understanding of patterns, causation, correlation, to say, no, it's okay: when you swipe your credit card after you got off the plane in London, it's okay, we know it's you, we know you're there. And where do we think that's going to come from?
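Derek's false-positive example (the card swipe after a flight to London) is essentially about adding a context signal to a decision. A minimal sketch, with hypothetical field names, of how one extra signal changes the outcome:

```python
def flag_transaction(txn, profile):
    """Toy fraud check, for illustration only: contrast a naive
    distance-from-home rule with one that also consults a context
    signal the system already has (known travel)."""
    far_from_home = txn["city"] != profile["home_city"]
    known_travel = txn["city"] in profile.get("recent_travel", [])
    # Naive rule: anything away from home looks suspicious (many false positives).
    naive_flag = far_from_home
    # Context-aware rule: travel the system already knows about is fine.
    context_flag = far_from_home and not known_travel
    return naive_flag, context_flag

# The London scenario from the conversation.
txn = {"city": "London", "amount": 42.0}
profile = {"home_city": "San Francisco", "recent_travel": ["London"]}
naive, contextual = flag_transaction(txn, profile)
```

Real systems learn these correlations rather than hard-coding them; the point is only that the extra feed, the travel signal, is what removes the false positive.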
George: Okay, so along those lines, my impression is that we will always be adding additional data feeds, or analytic data feeds, to improve the context of those decisions. And that's a manual process. I mean, you can't tell Watson, look at all the data feeds that are out in the world and figure out which ones are relevant for improving the false-positive ratio.

Derek: Yeah, absolutely. Right now the big problem is how you model the data correctly to put into these systems, whether it's big data or some of the machine-learning stuff. What I'm saying, and I probably am wrong, is that the ability for them to auto-figure that stuff out is coming faster than we think. And so-

George: Auto-figure out, like, the new data streams to-

Derek: What to do with them, automatically. So again, it was a very constrained problem, just hinting at what's going on, but the Google Brain project identifying cats all on its own should be a light-bulb moment for people: wait a minute, I didn't have to teach it what a cat looked like. Ninety-eight percent of our learning, even as children, is not supervised. It's not us going to school and being taught, no, two plus two is four, not five. It's literally all unsupervised. And so again, it might sound outlandish, and I probably am wrong, but my gut tells me this wave of computing, and where we're going, is coming up so fast that nobody's going to predict how fast it blows by us.

George: Okay, so now let's bump that up a layer. How does that change the applications that get built? Do applications get built by machines, or do people now use those predictions as a sort of composable gallery to build the applications?

Derek: I think it's the latter. The applications become a composable gallery, and we've seen that even without machine learning: if you want to build something faster, build less and assemble the pieces, right?
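The "identifying cats all on its own" point above is about unsupervised learning: finding structure in data with no labels ever provided. A minimal sketch of the idea, using plain k-means clustering on toy 2-D data (the algorithm is standard; the data and the deterministic initialization here are contrived for the demo):

```python
def kmeans(points, k, iters=10):
    """Bare-bones k-means: groups points purely by proximity.
    No labels are given; the grouping is discovered, which is the
    essence of the unsupervised learning being described."""
    # Deterministic init for the demo: spread starting centers across the data.
    centers = [points[(i * (len(points) - 1)) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: (p[0] - centers[i][0]) ** 2
                                        + (p[1] - centers[i][1]) ** 2)
            clusters[nearest].append(p)
        # Move each center to the mean of its assigned points.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centers, clusters

# Two well-separated blobs of points; the algorithm finds them unaided.
data = [(0.1 * i, 0.1 * i) for i in range(5)] \
     + [(10 + 0.1 * i, 10 + 0.1 * i) for i in range(5)]
centers, clusters = kmeans(data, k=2)
```

With real data, initialization and the choice of k matter a great deal; the takeaway is only that the two groups fall out without anyone labeling a single point.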
Derek: So all the services that Amazon offers, that's amazingly powerful to an enterprise trying to go to the cloud, because they don't have to build and stand up a database, a key-value store, storage, whatever that is. I think that trend continues. I think there is value in the applications, and business value to the developers, but I agree, it's just composable services. And I think the cloud wars are being waged on the services-ecosystem front: it's data services, including big data and machine learning, and then human-machine interfacing. Whoever comes up with the best class of services, I think, is going to win a lot of the-

George: Elaborate on the human-machine interfacing?

Derek: Cortana, Alexa, Siri, things like that.

George: Got it, got it. And so where's Apcera's role? Is it in pulling these together and putting the rules around how to deploy them and consume them within the guidelines of a particular organization?

Derek: Yeah, exactly. As everything becomes more and more complex, there's the notion of how you trust all of these pieces that are moving around. Let's say that in two, three, four years we do get this notion of pumping data into a machine-learning algorithm and getting amazing amounts of insight for our business back. Who's allowed to access that? Who's allowed to write new applications that actually consume this and make better decisions? This world of microservices, where Docker and others on the orchestration side are driving the decomposition of software systems, that's great, but it increases complexity and it increases risk. Who wants to take on that risk of what a workload's made up of, who it's talking to, what it's allowed to do? Those need to be driven into the platform in this new landscape in an even more impressive fashion, in my opinion.

George: Okay. All right, Derek, we have to stop there, but as always, it was profoundly insightful. It's good to see you again. So this is George Gilbert.
We are at Structure 2015. Another amazing guest, Derek Collison of Apcera, and we'll be back in a couple of minutes.
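The trust questions Derek raises near the end, what a workload is made up of, who it's talking to, what it's allowed to do, amount to policy enforced by the platform rather than by each application. A minimal deny-by-default sketch (workload and service names are hypothetical; a real platform would enforce this declaratively, outside application code):

```python
# Hypothetical policy table: which workload may talk to which service.
POLICY = {
    "billing-frontend": {"payments-api", "ml-insights"},
    "batch-reporter": {"ml-insights"},
}

def allowed(workload: str, service: str) -> bool:
    """Deny by default; permit only explicitly granted workload-to-service edges."""
    return service in POLICY.get(workload, set())
```

The design choice worth noting is the default: an unknown workload, or an unlisted edge, is refused rather than permitted, which is what keeps risk bounded as the number of moving pieces grows.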