Live from New York City, it's theCUBE at Big Data NYC 2014. Brought to you by headline sponsor WANdisco, with support from EMC, MarkLogic and Teradata. Now, here is your host, Dave Vellante.

Welcome back to Big Data NYC, everybody. This is Dave Vellante and this is theCUBE. theCUBE is our live mobile studio. We go out to the events. We extract the signal from the noise. Herb Cunitz is here. He's the president of Hortonworks. And Aaron Kelley, he's a general manager over at Microsoft. Gentlemen, welcome to theCUBE.

Thanks, Dave.

Thank you, Dave. It's good to be here.

Thank you for coming. You guys were just down at Hadoop World plus Strata, or Strata plus Hadoop World, however they say it now, and there's a lot of action going on. I don't know what the official count is. I don't know if you guys know, but there's a lot of people. They had to move out of the Hilton down to the Javits. It's probably not quite full. Javits is a big location. Herb, what's the vibe down there?

The vibe is great down there. If you look at the number of people who are there, as well as the number of partners who are there, it's an increase from last year. A lot of enthusiasm, a lot of excitement. I'd say what's interesting this time, even compared to past years, is that there's a lot of new users coming through who are new to what they're trying to do with Hadoop and just trying to learn and figure out where the market is and how they get started.

The average age of the Hadoop World attendee, I understand, is sort of creeping up, closer to me. But what does that mean? I mean, Aaron, your customers in the enterprise, the guys that you've been doing business with for decades, they're hopping on board to this big data meme generally and specifically looking at Hadoop. How are you helping them through that?

Yeah, well, they're looking at it as very complementary to what we're already doing.
So as much as big data is growing and the world of Hadoop is growing very rapidly, we're seeing our traditional relational business grow just as fast. SQL Server is our best measure of that, and we added a billion dollars of new revenue last year to our $6 billion business. And that growth has been driven by an explosion of data, and then more and more of those enterprise customers are saying, I've got my traditional relational data that's very important. I want to combine it now with these new data streams and find new insights. And so what we're working with them on is how do you make it easy to combine these two data sets to create new insights? And that's where we're seeing a lot of them.

Do you have a sense as to how much of that has a flywheel effect of the sort of new and the traditional coming together versus just Microsoft gaining share in the enterprise?

Well, I think it's a little of both, but there's certainly a flywheel effect where, if you look at typical deployment patterns, people are putting a lot of, say, log file data in Hadoop. They want to combine that now with customer records. So one very interesting scenario is Progressive Insurance, with the little devices in the cars that track your driving behavior. That's data they put into Hadoop. They want to combine that now with your policy information and see who are the best drivers at the lowest price, who are the worst drivers at the highest price, and make those comparisons. And being able to do a single query over both of those is very, very important to them to drive their business forward. And so that's a flywheel where we're selling more traditional data warehouse alongside Hadoop.

Yeah, at the same time, the customers that we talk to, the big enterprise data warehouse guys or the practitioners, are saying, all right, we see this new thing coming. We're spending money on it. We're doing some R&D on it. We're trying to figure out where to place our bets going forward.
We just had Sean on with some folks from Teradata, good partner of yours. So you're right there. You're seeing what's happening there. What is happening there? What are you telling customers? Where should they place their bets? What are they doing with their existing enterprise data warehouses? And what are they putting into Hadoop?

Mm-hmm. So it's an interesting question, because what you see in the market now is customers want to get started with Hadoop for one of two reasons. Either one, they have net new analytic applications they want to build to take advantage of the new types of data that they weren't traditionally storing or managing, and they're looking to say, how do I manage that in Hadoop? But that very quickly comes to, as Aaron said, how do I go tie that with my existing information? And can I make the sum of those two greater than the parts? And how do I start to leverage my existing tools that give me insight into that data, but now get access to those new data types in Hadoop? That's one pattern. The second is, absolutely, from a cost and economic perspective, there is some of that going on, to say, are there some workloads which are in the data warehouse today which probably shouldn't have been there, right? ETL is your traditional one, that maybe can be run more cost effectively. It's not a wholesale shift. It's more to say, where should the data be run most effectively and most cost effectively? And how do I combine those two assets for the best benefit of the company?

And Aaron, Microsoft's approach to the enterprise data warehouse has been different than what I was just talking about. Teradata is an appliance. You guys are a pure software company. You're letting the ecosystem of the hardware partners sort of have that piece of the value chain. But I wonder if you could sort of give us your perspective on that whole discussion.

Well, you know what? We've actually expanded our portfolio in the last year around that.
So traditionally we've sold software, and that's been our traditional approach. But even this year we released an appliance called APS, the Analytics Platform System, that we like to describe as big data in a box. So it actually has a Hadoop region as well as an in-memory column store, all in one chassis. And so we've also jumped into the appliance game because we think there are customers that are very interested in a targeted solution that can kind of work very quickly out of the box. So we've gone beyond just traditional software. We have the appliance, and of course we have a very thriving cloud business as well.

Well, I mean, Satya's like turning the Microsoft ship toward cloud, and I think it's obviously exactly the right play. But the other piece of the question I want to ask you is about open source. It's this wild card that is out there. What's your take on open source? Obviously you're partnering with a company that's probably the most open source company out there that one can think of. And so what's your attitude, your posture on open source? What's your view of the whole trend?

It's really become a critical part of our strategy. Again, if I take a look at our partnership with Hortonworks as kind of the leading example within Microsoft of how we're working with open source, we saw so much momentum around Hadoop three years ago, when we signed the deal and started working together, that we realized that, hey, this is going to be a really important component of anyone's data platform. It's not going to be the only component, but a critical one. And so not only are we partnering with Hortonworks to bring that to Windows and to Azure, but we're actually contributing a lot of Microsoft IP into the Hadoop projects themselves. Our query engine for SQL Server and column store is part of the Tez and Stinger projects. So those are examples of where we actually want to contribute back, because ultimately we're looking at the entire solution.
So if you think of where Microsoft is today with Excel and the end user, we want to give every end user out there the ability to query and interact with not only traditional data sources but these new data sources. And to do that, we partner with Hortonworks to make that happen, because that connection between the new data lakes that are being built in Hadoop and my relational stores is very interesting, but ultimately I've got to deliver that to an end user who's making a decision every day. And if I can make that simple and easy for them to do, then we think it really moves Microsoft's business forward, because we're contributing a lot more value to the end customer.

Well, so you talk about the users and the business users, and all my data's in Excel. So where does Excel fit into this whole thing?

No, it's a critical piece. There's about a billion users of Excel worldwide, and so we really want to bring Excel forward. The latest versions really have advanced BI tools now that we didn't have in the past, and so we're moving customers forward to that. They have their data in Excel today that they can do great analysis on. And Excel is also a gateway for users to basically bring together this analysis that's been done in Hadoop and with a relational store, and that's the window for that end user. So it's a familiar environment. It's widely deployed, and it's something we're very, very excited about.

Herb, I wonder if we could talk about partnerships a little bit.

Sure.

I was saying earlier, so many companies, so many partnerships being formed. It's hard sometimes to squint through and see the ones that have substance versus the ones that are just sort of, you know, ink on a press release. What makes the partnership between Microsoft and Hortonworks substantive? Can you describe that?

So it's an interesting way of looking at it, because when I think of partnerships, I think of them really in four categories.
Is somebody just a named partner, which candidly is just a logo on a website where we're saying we're partnered? Is somebody certified, where we have, you know, rigorous certification to make sure our two products work together? Do we have a go-to-market, where we've built out how we're going to go into the market together through resell or other things? And then lastly is joint engineering. And really I think of those almost in that order, from what I call least to most strategic. And that's why we're so excited about our partnership with Microsoft: it's all four of those elements. Take everything we do in joint engineering, even what we just did recently in our announcement on HDP 2.2. You know, it may sound simple, but how do you take data and leverage something like Apache Falcon to be able to back data up into the cloud and Azure? And how can you make that seamless from the user perspective? So they say, hey, this is great, I can now back up and drop this into Azure and it's very easy. That's all joint engineering, how we work together to make it easy. And to your question earlier to Aaron, it's how we deliver that in open source, where Microsoft contributes, we contribute jointly, and it's for the good of the community and candidly the good of all the customers that get to take advantage of that.

So engineering being the most strategic, I would imagine that once you've decided to get there, it's one of the easier ones, from the standpoint that with engineers, you give them a problem to solve and they go solve it. There's politics in engineering, for sure, but I would think the go-to-market stuff and the marketing is some of the really hard stuff. Tell me if I'm wrong, and if I'm not, how do you get through that? How do you make that frictionless?

Yeah, so I guess what I would say is I totally agree with what Herb is saying around the sort of four levels of a partnership.
And the reason why it's worked so well is because we're so aligned on our vision for the future and where we're making investments. As I said before, three years ago when we looked at this market, we said, this is really going to grow, there's a lot of momentum. We don't want to do something proprietary, we want to do something that embraces the community. And because Hortonworks is so aligned with that, it's just a very natural connection for us. So when you have the product strategies aligned, it makes it a lot easier to drive sales and marketing, because they are complementary. And so we have go-to-markets where our sellers will go in and pitch a big data project, maybe sell SQL Server, Power BI, Azure, and the Hortonworks team can come in as well and add greater detail and expertise around what's happening in Hadoop. But in fact, the partnership is so deep on the sales and marketing side that when someone purchases HDInsight in Azure, the account reps for Hortonworks are compensated for that. So we've solved those problems so that you don't have conflicts in the field where we're both trying to sell the same products. And that all comes from a very strategic long-term view that says, here's the direction Hortonworks is going, here's the direction we're going, it's very complementary. And that's why we can have that deeper layer.

You mind if I add something to that, Dave? Because as I think about everything that Aaron just said, and you look at a partnership in open source and all the things we're going to do in a go-to-market, I think a really important piece is enabling both parties to work together in a way that the customer can acquire and purchase whatever they need in the way that they're going to consume it.
So an important example of that would be, if it's Hadoop and it's working with Microsoft, you want to make sure that if a customer wants to buy this through an appliance, if they want to run it on Windows, if they want to run it on premise, if they want to run it in an infrastructure as a service, or they want to run it in a curated PaaS and run it in Azure, they've always got the option to go run that. And that way it makes it easy from an enablement perspective for the field to say, Mr. Customer, what's the way you want to consume this? I have all those options available to you and I can make it simple. And they're not shoehorning the customer one way or the other; they provide a choice. And that's a lot of what we've worked on together to make sure that's seamless.

I have a question that's kind of esoteric, but I think about Microsoft, and I think some people are calling this the emergence of a digital fabric, where companies like Microsoft, Google, Apple, and Amazon are sort of building out this digital fabric and companies like Uber are taking advantage of it. And so you think about the cloud as the infrastructure-as-a-service piece, and you think about transactional applications and social and collaborative applications, and now data. And it seems like customers who can ride on top of that digital fabric are creating a lot of value. Again, I use Uber as an example, there's Netflix, there's many, many others. Where does data fit into that digital fabric?

You know, that's interesting, because if you think of data, right, years ago, the way many applications were structured, you had your application with your data store supporting it underneath. And you started to have different silos of data. Then came the emergence of the data warehouse as a way to start to do the analytics and other things around that. And now you hear about the data lake, right, as a way of starting to centralize that data.
What that is really moving towards, and what you see customers starting to push for, is how do I get a data dial tone? How do I have all the data in one place, where I can get access to it in the way I want to, consume it through the tools that I'm used to and familiar with, and go consume that data when I need it, while putting it in one place as a way to manage and store it? That data dial tone is what you're seeing more and more customers moving towards.

Yeah, I mean, I would just augment that. Data is at the core, that's ultimately what you care most about. And so how you access it or how you view it becomes less and less important. The data is really what matters. And so the management of the data, the accessibility of the data, the ability to query the data and get answers from that data so you can make better decisions is really the core of it.

Yeah, it really seems like that's where the attention of the buyers is and should be: taking advantage of that data and turning it into insights. I mean, that sounds sort of trite, but it's true. And then building business models on top of that versus worrying about things like non-differentiated IT management. And you guys are solving that problem with your cloud, and you've built out an application infrastructure that is pretty much commercial off the shelf that I can now take advantage of as a buyer and focus on building a business on top of data. Are there examples of companies that are doing that, that are not the obvious ones, that you can share?

Well, you know what, we're seeing that even the, I'll call them traditional enterprises are taking advantage of this cloud dynamic and the explosion of data to move their business forward. So one of our customers that we're working very closely with in a lot of different scenarios, as kind of a strategic partner, is Pier 1, a retailer. And as they've looked forward, they're trying to say, hey, how do I differentiate myself?
How do I grow my business? How do I better connect with my customers? And they've turned to our data platform as a way to do that. So everything from, let me do clickstream analysis on people visiting the website or how they interact with my email campaigns. You know, that's an interesting start. Let me take a look at that. But now I also want to start to combine that with historical sales and say, hey, you purchased seven products from Pier 1. What's that next logical purchase? And so here's where things like machine learning come in, where I'm applying machine learning models to predict what's the next logical product that you may want to purchase. Now this can drive and customize my marketing campaigns and change them from being generic to very targeted and specific. And then similarly, how do I bridge that to the in-store experience? So for years people have been able to do this in websites and e-commerce, but what about in-store? How can I bring that in-store and use similar analytical models? And we've got some very cool stuff going on in our booth here at Strata where we're starting to demonstrate the use of Kinect as a way of seeing where people are moving through the stores. What displays are they spending the most time on? And then making that information available to a store manager to say, gosh, these displays aren't getting any attention. These ones are. And I can use a rigorous analytic model versus sort of guessing. So these are very, very powerful ways of even traditional enterprises using these tools to ultimately grow their revenue and improve customer connection.

I think retail's a really interesting example because the brick and mortar guys have some real advantages. To the extent that they can digitize their businesses and leverage that historical data that they have, they can, I think, really compete. Things are changing. Maybe the footprint of the store is changing, but there's a lot of value in what they produce.
Announcements, before we gotta go, you guys. Everybody makes big announcements at shows like this. Start with you guys, Herb. What's Hortonworks announcing?

Couple of announcements. One, Hortonworks HDP 2.2 coming out, right? Next generation in terms of what we're doing around operational management, around security and governance, what you're seeing there. A lot is tied in with Microsoft: what I described earlier about being able to do the automatic backup to Azure and all the work you can do there, and making sure that we tie in in multiple ways, for example certifying HDP to run on Azure infrastructure as a service, to make sure it's certified and running well there for any consumer who wants to go use that. So just a number of things in terms of how we've deepened and strengthened our partnership, as well as what we're doing with the product in terms of the go-to-market.

And anything in particular you guys are announcing?

Yeah, so to add to the two that Herb mentioned, we're also announcing here that with HDInsight, which is our platform-as-a-service Hadoop venture with Hortonworks, we've added Storm support, so that's been pretty exciting. So now you have real-time analytics in a fully managed Hadoop service in the cloud. So that one's very interesting. And then similarly, as I mentioned before around machine learning, we've also started to provide some new samples based on our machine learning service where people, rather than having to build, say, a recommendation engine from scratch, can go grab that sample and move forward much faster. The trend we're seeing, as I look at the last couple of Stratas and Hadoop Worlds, is, you know, you had sort of MapReduce batch processing, and then you had interactive query with Hive, and then you had real-time with Storm. We really see machine learning as that next step. So all of the things I've described, part of that is looking at what's happened in the past.
Now I really want to predict what's happening in the future and drive better decisions. So there's a lot of excitement and interest around our machine learning investments.

I should add one more, Dave, and it's tied in with what we're doing together with Microsoft, and you had asked about go-to-market. So an important piece of go-to-market is, one, make the product consumable however a consumer wants to use it, make it consumable in the method they want to use it. But also, how do they get up and running with it? How do they go implement it, manage it, run it and work with it? And that's one where we work with Microsoft, but we also just recently made an announcement with Avanade. Right, where Avanade, with 2,300 professionals out there, can now go out and say, how do I go implement this on Windows and Azure? How do I go drive that into the market and make their clients successful?

Excellent. Well, gentlemen, thanks for coming on theCUBE. Microsoft, huge footprint, large industry player, zillions of customers, and Hortonworks, kind of the lever that can move large boulders. So thanks very much for coming on, I appreciate your insights.

Hey, thanks, Dave.

All right, keep it right there, everybody. This is theCUBE, we'll be right back. We're live from the Big Apple. This is theCUBE and we're at Big Data NYC. Right back.