Live from the Julia Morgan Ballroom in San Francisco, extracting the signal from the noise: it's theCUBE, covering Structure 2015. Now your host, George Gilbert.

This is George Gilbert. We're back at the Julia Morgan Ballroom in downtown San Francisco at the iconic Structure 2015 conference, and we have a special guest with us today: Joseph Sirosh, Corporate VP of Machine Learning and Data, and all the interesting things that go along with that. Joseph, welcome.

Glad to be here, George.

So, you gave a very intriguing talk a little earlier today. I want to step back a little and set some context around it. Hadoop, as we all know, is the shiny new toy, and everyone has fear of missing out, FOMO. But it's also taking something like two years to get from proof of concept to production, and many teams are choking on the complexity. Can Microsoft help with that?

Yeah, great question. Simplifying that complexity is exactly what we're trying to do with the cloud. In the cloud, we can offer Hadoop as a service, fully managed. You don't have to worry about standing up clusters, managing them, handling version control and patching, and keeping up with the whole pace of innovation. We do that and offer a stable environment. You can provision Hadoop clusters in minutes, put applications on top of them, and run. That stability is an essential part of making it usable.

Now, we can also take a more futuristic step and make a lot of the Hadoop-like functionality itself a service. Azure Data Lake, for example, is an exabyte-scale file system in the cloud, but you talk to it through a Hadoop file system API, the WebHDFS API. You don't have to worry about setting up a cluster or managing it; it's just a service in the cloud to which you send big data.

In other words, it's not merely a managed service, not someone automating the administration for you. This is designed from the ground up to run as a service.
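To make the point concrete, here is a minimal sketch of what "just talk to the WebHDFS API" looks like from an application, with no cluster in sight. The account name and bearer token below are hypothetical placeholders, not real credentials or endpoints.

```python
# Sketch: listing files in a cloud data lake over the WebHDFS REST API.
# The store URL and token are hypothetical placeholders.
import json
import urllib.request

STORE = "https://example-account.azuredatalakestore.net"  # hypothetical account
TOKEN = "<oauth-bearer-token>"                             # hypothetical credential

def webhdfs_url(path, op):
    """Build a WebHDFS v1 REST URL for the given path and operation."""
    return f"{STORE}/webhdfs/v1/{path.lstrip('/')}?op={op}"

def list_status(path):
    """LISTSTATUS: enumerate the files in a directory, no cluster required."""
    req = urllib.request.Request(
        webhdfs_url(path, "LISTSTATUS"),
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["FileStatuses"]["FileStatus"]
```

Because the interface is plain WebHDFS over HTTP, any tool that already speaks the Hadoop file system API can point at the service without standing up or administering a cluster.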
And it's a multi-tenant service, an API that you talk to. Any application that needs big data and huge bandwidth can just talk to that WebHDFS API and build on top of it. Very simple. You don't worry about Hadoop cluster management just to get at big data.

Then we also have a compute layer called Azure Data Lake Analytics, which is a YARN++ layer, and you don't have to worry about a lot of that either. Whatever analytics engine you want to run at scale, you can run on top of that resource manager. And we have a language we call U-SQL, which takes off from T-SQL. It's a SQL-like language, but it runs at scale there, and you can do all kinds of analytics on the big data you have in the store. It just simplifies the whole thing.

Oh, is this based on PolyBase?

No, it's based on something we used to have internally called Cosmos, which was built inside Microsoft before Hadoop. We had invented a SQL-like language there for big data, with extensions that allowed machine learning and so on.

Okay, okay. So that's the analytics layer on the data lake. And machine learning, I assume, is your wheelhouse; how do you bring it to bear? In other words, how does it get delivered? How does it get consumed?

Two ways. If you're a developer and you want to build a custom machine learning app, it should be as simple as creating a visual data workflow that sets up an experiment, runs a model, and then publishes a web service API out of it. The data is all managed in the data lake in the cloud; you're just setting up the data transformations, and it just runs. It's actually very easy to do. And once you have the machine learning model developed, it runs behind an API, cloud-hosted, that can easily be connected to applications. It's very simple for an application to become intelligent.
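What "connecting an application to the published API" might look like is sketched below. The scoring endpoint, API key, and input schema are all hypothetical placeholders; the point is only that consuming the model is a single authenticated HTTP POST.

```python
# Sketch: calling a cloud-hosted machine learning model through its
# published web service API. Endpoint, key, and schema are hypothetical.
import json
import urllib.request

ENDPOINT = "https://example.azureml.net/score"  # hypothetical scoring URL
API_KEY = "<api-key>"                           # hypothetical credential

def build_request(rows, columns):
    """Package tabular input as a JSON scoring request."""
    body = {"Inputs": {"input1": {"ColumnNames": columns, "Values": rows}}}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )

# The application becomes "intelligent" with one call (not run here):
# with urllib.request.urlopen(build_request([[42, 0.7]], ["tenure", "usage"])) as r:
#     scores = json.load(r)
```

The heavy lifting (hosting, scaling, versioning the model) stays on the service side; the application only holds a URL and a key.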
So the administrative overhead of embedding it, updating it, and calling it from within the application is non-disruptive?

It's non-disruptive; it's taken away. All that heavy lifting is removed; the cloud manages it. You don't have to worry about backward compatibility: once you've built the API, it stays, and all the management becomes easier. And we put all of these different pieces, by the way, into one suite that's fully managed in the cloud: big data, information management for bringing data in, machine learning, dashboards. We call it the Cortana Analytics Suite. It's a full collection of services from one vendor that work seamlessly together: designed, built, tested, integrated, and operated as one. Unlike the 22 projects that ship with Hadoop, created by different people, speaking different interfaces, where integrating and managing them, the patching, all of that, is so hard for an enterprise. And stability over a five-year period, when you want an application to be stable, that's hard, right? This takes all of that away; Microsoft helps you with all of it.

Do you see customers, when they try to get into production, hitting a complexity wall on both the admin and development sides, and then moving in your direction?

Yeah, actually. It's dramatically simpler than managing the open source ecosystem and products from so many different vendors. The thing about the big data ecosystem today is the complexity of the 50 or 100 different things being innovated upon: you don't know what to pick, what to connect up, how to actually build an application, or how to manage it over a five- or ten-year period. It's very much simpler in the cloud. So that's one area. The second thing I talked about today, by the way, George, is the API economy.
I spoke about that today as well. I asked people: how many of you are wearing a tailored shirt today? Very few do. And what happened over the last 40 years? When mass manufacturing of clothing came into being, an incredible selection of ready-made clothes became available in department stores. You didn't have to go get things tailored. Machine learning today is like tailoring 40 years ago: you custom-gather the data, you custom-assemble the Hadoop software.

So let me generalize from that. In systems of record, pre-ERP, a lot of that was custom built, and then we agreed on the practices and they were standardized. But let me take it a step further: that meant we did not differentiate business processes, and therefore competitive advantage, based on which apps we deployed. So, gathering what you're saying about the API economy, at least as delivered by Microsoft, it won't be about differentiation for the customer?

Well, what you will see is lots of developers using the mass manufacturing capabilities of the cloud to create lots of different intelligent APIs: fraud detection, face recognition, optical character recognition, predictive maintenance, all types of intelligence, and offering them in a marketplace. People will be able to build intelligent SaaS applications by composing them, and by composing them they will create lots of differentiation. Your notebook may become intelligent in one way; your elevators may become intelligent; your farms may become intelligent; hospitals may become more intelligent. Those applications are what will differentiate.

Okay, so, last question. I talked to IBM and Cloudera about their visions of differentiation based on machine learning. Cloudera believes that something like churn for a communications service provider is vanilla.
But how to prevent churn would be customer-specific, based on their data and their algorithms; the solution provider would deliver that and would not share it across customers. IBM believes the data belongs to the customer, but the recipe, the secret sauce of the machine learning, they would share across customers. How do you see that?

So let me go back to the API economy. I think there will be a fundamental collection of components, okay? For forecasting, for fraud detection, for predictive maintenance: a number of templates that will be available. But if you want to do predictive maintenance for your application, you will take that API and use it with your own data, and maybe with data that comes from external sources, weather forecasts or other things, blended in the cloud. Some of it will be public data; some of it will be your own data. But that API will help you build the application you want. That's the way I see a lot of this.

So some of the differentiation comes from adding private or non-shared data, and therefore probably non-shared analytics alongside it?

It's like cooking. You'll have a recipe, and it's like a bread maker: think of the bread maker we have in the cloud as your API, and then you bring your own dough to put into it. You have a lot of room to innovate, and you can make all kinds of bread with a bread maker. But a bread maker is specifically for making bread, not for cooking stew, right? There will be lots of different kinds of dishes you can make in the cloud, and there will be specific things for each. That's how I think about it.

Okay, so it's not completely cookie cutter?

It's not completely cookie cutter. It's based on standards. You'll have bread makers and toasters, all kinds of things available to you.
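The pattern Sirosh describes, a shared template API fed with your own blended data, can be sketched in a few lines. Everything here is a hypothetical illustration: the field names and the join are stand-ins for whatever a real predictive-maintenance template would accept.

```python
# Sketch of the "bread maker plus your own dough" pattern: a shared
# template API, fed with private telemetry blended with public data.
# All names and fields are hypothetical placeholders.
def blend_features(private_rows, public_weather):
    """Join each machine's private sensor reading with the public
    temperature forecast for its site, producing one feature row."""
    return [
        {**row, "forecast_temp_c": public_weather[row["site"]]}
        for row in private_rows
    ]

# Private telemetry only you hold:
sensors = [
    {"machine_id": "m1", "site": "SFO", "vibration": 0.82},
    {"machine_id": "m2", "site": "SEA", "vibration": 0.31},
]
# Public data anyone can fetch:
weather = {"SFO": 18.0, "SEA": 12.5}

features = blend_features(sensors, weather)
# `features` is what you would send to the shared template API; the
# differentiation lives in the blended data, not in the API itself.
```

The template (the API) is the same for every customer; the competitive edge comes from the private data mixed into it, which matches the distinction drawn in the Cloudera and IBM comparison above.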
And with those, you can build unique recipes of all types.

Okay, yeah. Joseph, we're going to have to leave it at that. This is George Gilbert; we're at the Julia Morgan Ballroom in downtown San Francisco at the iconic Structure 2015 event, and we'll be right back in a few minutes. Thank you.