Live from Midtown Manhattan, theCUBE's live coverage of Big Data NYC, a SiliconANGLE Wikibon production. Made possible by Hortonworks, we do Hadoop, and WANdisco, Hadoop made invincible. And now your co-hosts, John Furrier and Dave Vellante.
Hi everybody, we're back, this is Dave Vellante. We're live here in the Big Apple at the Warwick Hotel on Sixth Ave, right across the street from the Hilton where Hadoop World is going on. This is theCUBE, we go out to the events. This is our fourth year at Hadoop World. We extract the signal from the noise, bringing you the best guests that are at these events. Herain Oberoi is here, he's the director of product management for BI at Microsoft, and Bob Page is VP of products at Hortonworks. These guys are big partners, and we've had a number of guests from each of your companies on theCUBE before. Welcome.
Thanks for having us.
Good to see you guys. So Herain, let's start with you. Microsoft, you're in every market, you can't name a market where you're not, even hardware now.
Yep.
So what's the bumper sticker on Microsoft's big data play?
Sure, so the big thing for us is, we've always been about simplifying things, bringing things to the masses, and so the big theme for us this morning was how do you bring big data to a billion people? And one of the big things we're looking at is, A, how do you simplify things within the enterprise? And B, really, how do you get the insights, ultimately, from the data to the individuals that need them? And so a lot of the investments we're making, for example, in Excel, the investments we're making in natural language search, in addition to the investments we're making in Hadoop, are ways in which we want to connect the dots between the people and the data to bring this to the masses.
Yeah, so a lot of the sandbox activities that you see in Hadoop are just very narrow, and in typical Microsoft fashion, you bring a lot of different tools to the problem. How's that going?
What's the reaction been in your customer base?
Sure, yeah, and there are really two pieces to the story. One is the work we've been doing with Hortonworks, of course, which is how do we simplify Hadoop and bring Hadoop to the enterprise, and that includes the investments we're making in security and the investments we're making in manageability. Of course, a lot of the work we've done together has been porting it to Windows. And then the other part of the equation is about bringing it to the users, which is the investments we're making on the BI side of things with Excel. And so we've seen some great reactions to the integration work we've done with Excel, and we've got some demos, we've got a booth back at the conference, and everyone who walks in and sees big data showing up in Excel in a really rich 3D visualization looks at that and says, wow, I didn't know Excel could do that, so we're getting some great reactions.
So you mentioned the Hortonworks partnership. You guys decided years ago not to do your own Hadoop distribution. What was that conversation like back then? Was there a debate about that, or was it a no-brainer?
Well, initially it was a little bit of a debate, but it got to a certain point where Hadoop did become the de facto way to store and process distributed data. And at that point, which was about two years ago, it really was a no-brainer, and we made a bet and we partnered up with Hortonworks, and here we are two years later announcing the GA of HDInsight on Azure, so it's a beautiful thing.
So that gave you guys a big tailwind, because it sort of removed that last big question mark in the marketplace, like, okay, what's Microsoft going to do? And then once they joined forces with Hortonworks, it seemed like the roads were paved, you know, the snowstorm, right, and the plows came through, and then boom, things really started to take off. Is that a correct perception, or I wonder if you could take us back?
I like your analogy.
It brings to mind many other possible analogies, where in some ways, you had Hortonworks, which was kind of the engine, but you didn't have some of the infrastructure, maybe the front plow you needed to clear out some of the snow. But together we're able to do that. And we're now at a place where all the core projects and most of the associated projects are not just built on Linux and then ported to Windows; they're built for both simultaneously. And so we saw that just recently with Hadoop 2, the core of which shipped to GA a couple of weeks ago. And then in the last couple of weeks, we saw Pig and HBase and a number of others that are now not just Linux-native ports to Microsoft Windows, but are Linux and Windows simultaneously.
So Hortonworks is obviously very proud of its contributions to Apache, maybe give us a quick update on that. And I'm very interested in Microsoft's role there. How do you guys collaborate together? Maybe start with you, Bob.
Well, I mean, our commitment is as strong as it ever has been. Everything we do is 100% open source. Every piece of code we write goes back to the Apache Software Foundation. We don't own the IP there. We want to make sure the community owns it. That's the only way that we know of that's going to create this ecosystem for Hadoop, this platform that is going to de-risk any of the investments that businesses will make. And you won't have multiple Hadoops. You'll have one Hadoop and that'll be the one that Apache has. So we're dedicated to that. It's going quite well.
And Herain, talk about Microsoft's stance on open source.
Sure, yeah. I mean, this was a relatively simple decision for us. We had a common goal here, which was bringing Hadoop to Windows. And when we looked at what's the best way to bring Hadoop to Windows, it was to partner with Hortonworks and then contribute back into the Apache open source community. So ultimately, we did want to have Apache Hadoop for Windows.
I mean, there was a whole half of the market that didn't have access to Apache Hadoop, and we wanted to enable that. And that then allows us to bring other things to life, like the Excel integration and natural language search and things like that that we talked about. And so a lot of the work we did hand in hand, I mean, really putting engineers together for a long period of time, was to help port Hadoop to Windows. And then as we do that, we're making Hadoop better through other projects like Stinger and Tez, and making Hive more performant. All those things make Hadoop better, and that benefits Microsoft, but it benefits everyone as well.
You guys have massive resources, a huge product portfolio. So what do you do? I mean, you have massive resources, but they're not unlimited. So how do you decide where to put the resources? You look at your portfolio of products and say, okay, where's the biggest potential? How can we help the community the most? And then what? Do you actually provide committers, contributors? What role do you actually play?
I mean, of course, we do want as many Microsoft employees as possible to also be committers. That's a process that takes time, and the more time you invest, the more committers you end up getting. And so that is part of the reason why we did partner up with Hortonworks, because Hortonworks had been doing this and understood the process really well. And so when we started two years ago, we were learning the process from Hortonworks. Now it's very much a hand-in-hand process where we're all contributing to the same code base, and then it's just a function of which individuals are contributing more and committing and getting the specific contributions passed by the community.
So let's talk about HDInsight and Excel. And I presume this is how you're going to get to a billion users. So big data for a billion users. Where'd that come from? What role does Excel play? What role does HDInsight play?
Sure, yeah.
So if you think about just the sum of Office users on the planet, there are over a billion of them. Ultimately, the value of big data isn't just what you can do in Hadoop. It's the insights you get from the data. And those insights shouldn't just be limited to a small group of data scientists. We really want those insights to be available to a much broader set of end users. And so that's the work we're doing: make Excel a first-class BI tool, then integrate Excel with Hadoop and simplify the movement of Hadoop data into Excel, as well as the BI infrastructure behind it. The more we simplify that, the more we believe we can start to make this accessible.
And what role does Hortonworks play in that whole ecosystem? How does the integration work? What role do you play specifically in that integration?
Well, we have regular engineering meetings, roadmap meetings, et cetera. So we're meeting regularly, we're working together, we're coding together, working out what each of us is hearing from our enterprise customers and how we best meet their needs. So it's a true partnership, not just a go-to-market, let's put some marketing dollars behind it, but we've got folks flying back and forth and writing code and doing reviews, and it's a full-on partnership. The resulting effort really is that HDP on Windows and then HDInsight on Azure give you that one-two punch in terms of deploying Hadoop on premises and then deploying Hadoop in the cloud. And regardless of what kind of deployment model you have, whether it's one or both, a hybrid, being able to use Excel and BI tools against either one in a consistent way becomes a lot simpler when you have people working together like that.
So everybody's familiar with Excel, right? So it's a comfortable reference model. You mentioned before trying to make it more BI-robust. At the same time, you have a base of users that is very comfortable with it.
I mean, I'm comfortable with Excel. It scares me a little bit. I say, okay, I'm not sure. I do want to get into the big data space, but I'm not sure I'm ready for BI.
Excel's one of those beautiful tools where it lets you go as far as you want to go. So depending on what your level of comfort is and how far you want to go, I mean, you can start small and do easy things, but as you progress, that progression should be simpler. One of the things we'll start to do, and you'll see us do more of, is provide cloud services as well, like Office 365 and what we're calling Power BI for Office 365, to light up Excel. A great example of that is if I'm in Excel and I want to search for how many attendees visited Strata last month, I could actually do a search in the context of Excel, and if that data happens to be public and we index it in Bing search, then that data shows up in the context of Excel and you just pull it in and it'll figure out how to put it in rows and columns for you. So examples like that, where you can take a simple idea like search, connect that to external data, and integrate that into Excel so someone like you who's using it in context can work with it, are things we want to do to make it more accessible.
So let me make sure I understand this. So you're sort of automating the Excel schema, if you will, using whatever, a semi-structured query, pulling in data and then organizing it in a way that at least gets me started.
Yeah, and we're using the power of the cloud in Office 365 to light up what you can do in Excel beyond just some of the traditional experiences you might have had. So as Office 365 becomes more popular, as Excel starts to connect better to the cloud engines and services, the types of things you can do will become easier and more powerful without a whole lot of effort on the part of the user.
With a collaboration component, obviously, included as well. Excellent. So what's going on over at the show at Hadoop World?
It's great, we've got a booth, we've got a great visualization showcase, we're showing some of our great new tools, like some of the 3D geospatial visualizations that you can do in Excel, if you didn't know that. So we're showing some of those things. And then of course we're talking about HDInsight and the GA and the work we've done there in terms of security, manageability, and then just the integration with our BI tools. And so we had a session earlier this morning where we showed a customer, the city of Barcelona, that's using big data to make city services better. And so we're doing some real-world showcases as well. So lots of exciting things going on.
How about the Hortonworks booth? What do you guys have going on over there?
Well, we're showing HDP 2 and we're talking a lot about HDP 2, which is based on Hadoop 2. So that's our YARN-based distribution of Hadoop. We firmly believe it's going to change the world, but we're not stopping there. We've got Tez coming, and we think that's going to be as big a change in Hadoop as YARN is going to be. So we're talking about that. We've got several speakers and sessions going on where we're training folks and letting them know what we're doing. Again, we're doing everything in the open, so come by the booth and check it out.
Yeah, we had Arun on earlier, talking YARN. He said, I feel like I've been in a cave for two years and just came out.
Yep. Sun's still rising, but things have changed a little bit.
All right, gentlemen, thanks very much for coming on theCUBE. We really appreciate the insights. Congratulations on the partnership and the momentum going forward, and good luck getting to that billion.
Great, thank you.
Keep right here, everybody. We'll be right back with our next guest. This is Dave Vellante. We're live in New York City. This is theCUBE.