Live from San Jose, California, it's theCUBE. Covering Big Data Silicon Valley 2017.
Okay, welcome back everyone. Live in Silicon Valley, this is theCUBE's coverage of Big Data Week, Big Data Silicon Valley, our event in conjunction with Strata Hadoop. This is theCUBE for two days of wall-to-wall coverage. I'm John Furrier with analysts from Wikibon, George Gilbert, our Big Data analyst, as well as Peter Burris, covering all the angles. And our next guests are Wei Wang, Senior Director, Product Marketing at Hortonworks, a CUBE alumni, and Oliver Chu, Senior Product Marketing Manager for Big Data and Microsoft Cloud at Azure. Guys, welcome to theCUBE, good to see you again. Welcome on theCUBE, appreciate you coming on.
Thank you very much.
So Microsoft and Hortonworks, you guys are no strangers. We've covered you guys many times on theCUBE, HDInsight. You have some stuff happening here. And I was just talking about you guys this morning in another segment, like saying, hey, you know, the distros need a cloud strategy. So you have something happening tomorrow, a blog post going out. What's the news with Microsoft?
So essentially I think that we are truly adopting cloud first. And you know that we have been really acquiring a lot of customers in the cloud. We announced in our earnings that more than a quarter of our customers actually already have a cloud strategy. I want to give out a few statistics that Gartner told us: actually last year, the inquiries from their end users went up 57%, just to talk about Hadoop and Microsoft Azure. So what we're here for is to talk about the next generation. We're putting in our latest and greatest innovation, which comes in the package of the release of HDP 2.6, that's our latest release. I think our last conversation was on 2.5. So 2.6, great latest and newest innovations to put on cloud first. Hence our partner here at Microsoft. We're going to put it on Microsoft HDInsight. It's super exciting.
And you know, Oliver, one of the things we've been really fascinated with and been covering for multiple years now is the transformation of Microsoft. Even prior to Satya, who's a CUBE alumni by the way, been on theCUBE when we were at Excel event at Stanford. So CEO of Microsoft, CUBE alumni, good to have that. But it's interesting, right? I mean, the Open Compute Project, they donated a boatload of IP into open source, heavily now open source. Brendan Burns works for Microsoft. You're seeing a huge transformation of Microsoft. You've been working with Hortonworks for a while. Now it's kind of coming together, and one of the things that's interesting is the trend that's teasing out on theCUBE all the time now is integration. You're seeing this flash point where, okay, I got some Hadoop, I got a bunch of other stuff in the enterprise equation that's kind of coming together, and things like IoT and AI are all around the corner as well. How are you guys getting this all packaged together? Because this kind of highlights some of the things that are now integrated in with the tools you have. Give us an update.
Yeah, absolutely. So for sure, just to kind of respond to the trend, Microsoft kind of made that transformation of being cloud first many years ago. And it's been great to partner with someone like Hortonworks, actually for the last four years, bringing HDInsight as a first party Microsoft cloud service. And because of that, as we're building other cloud services around it in Azure, we have over 60 services. I mean, think about that. PaaS and IaaS services in Microsoft, part of the Azure ecosystem, all of this is starting to get completely integrated with all of our other services. So HDInsight, as an example, is integrated with all of our relational investments, our BI investments, our machine learning investments, our data science investments. And so it's really just becoming part of the fabric of the Azure cloud.
And so that's just a testament to the great partnership that we're having with Hortonworks.
And so the inquiry comment from Gartner, and we're seeing some of the same things on the Wikibon side on our research team, is that now there's the legitimacy, I'd say, of seeing how Hadoop fits into the bigger picture, not just Hadoop being the pure play big data platform, which many people were doing, but now they're seeing a bigger picture where I can have Hadoop and I can have some other stuff all integrating. Is that kind of where this is going from you guys' perspective?
So yeah, again some statistics: we have done a TechValidate survey where our customers are telling us that 43% of the respondents are truly using that integrated approach, right? The hybrid: they're using the cloud, they're using our stuff on premise to actually provide that integrated end to end processing workload. I would think a couple of years ago, people probably thought a little bit about what kind of data they want to put in the cloud, what kind of workload they want to actually execute in the cloud versus on premise. I think what we see is that line has started to blur a little bit, given the partnership we have with Microsoft and the kind of enterprise ready functionalities, and we've talked about that for a long time. Last time I was here, we talked about security, talked about governance, talked about just an integrated layer to manage the entire thing, either on premise or in the cloud. I think those are some of the functionalities or some of the innovations that make people a lot more at ease with the idea of putting entire mission critical applications in the cloud. And I want to mention that, especially with our blog going out tomorrow, we will actually announce Spark 2.1, for which, in Microsoft Azure HDInsight, we're actually going to guarantee a 99.9% SLA, right?
So it's for enterprise customers, many of which we have together, that it's truly an assurance that people not only feel at ease about where their data is going to be located, either in the cloud or within their data center, but also about the kind of speed and response and reliability.
Yeah. Oliver, I want to key off something you said, which was interesting, that you have 60 services and that they're increasingly integrated with each other. The idea is that Hadoop itself is made up of many projects or services. And I think in some amount of time, we won't look at it as a discrete project or product, but as something that's integrated, that together makes a pipeline, a mix and match. And I'm curious if you can share with us a vision of how you see Hadoop fitting in with a richer set of Microsoft services, where it might be SQL Server, it might be streaming analytics, what that looks like. And so the issue of sort of a mix and match toolkit fades into a more seamless set of services.
Yeah, absolutely. And you're right, and Wei will definitely reiterate this, that Hadoop is a platform, right? And certainly there are multiple different workloads and projects on that platform that do a lot of different things, right? You have Spark that can do machine learning and streaming and SQL-like queries. And Hadoop itself can do batch, interactive, and streaming as well. And so you see kind of a lot of workloads being built on open source Hadoop. And as you bring it to the cloud, what we found, and kind of this new Microsoft that's often talked about, is that it's all about choice and flexibility for our customers. And so some customers want to be 100% open source Apache Hadoop. And if they want that, HDInsight is the right offering. And what we can do is surround it with other capabilities that are outside of maybe core Hadoop type capabilities.
If they want to do media services, all the way down to other technologies that are not related specifically to data and analytics, they can combine that with a Hadoop offering and blend it into a combined offering. And there are some customers that will blend open source Hadoop with some of our Azure data services as well, because it offers something unique or different. But it's really a choice for our customers. Whatever they're open to, whatever their strategy is for the organization.
Just to sort of then compare it with other philosophies: do you see that notion that Hadoop now becomes a set of services that might or might not be mixed and matched with native services? Is that different from how Amazon or Google, you know, as you perceive them, are integrating Hadoop into their sort of pipelines and services?
Yeah, it's different. Because I see Amazon and Google, like for instance, Google kind of is starting to change their philosophy a little bit with the introduction of Dataproc. But before, you could see them as an organization that was really focused on bringing some of the internal learnings of Google out into the marketplace with their own, you can say, proprietary type services, with some of the offerings that they have. But now they're kind of realizing the value that the Apache Hadoop ecosystem brings. And so with that comes the introduction of their own managed service. And for AWS, you know, their roots are IaaS, so to speak; that's kind of the roots of their cloud. And they're starting to bring in kind of other systems, very similar to, I would say, Microsoft's strategy. For us, we're all about making things enterprise ready. So that's the unique differentiator, and kind of what Wei alluded to. So for Microsoft, all of our data services are backed by a 99.9% service level agreement, including our relationship with Hortonworks. And so that's kind of one.
Just say that again one more time.
99.9% of the time.
And if that were ever to... SLA, and so that's a guarantee to our customers. And so if anything were.
One more time.
It's a guarantee to our customers.
This is important. SLAs, I mean, Google Next didn't talk much about them last week at their cloud event; they talked about speeds and feeds.
Exactly.
Not a lot of SLAs. This is a mandate for the enterprise. They care more about SLAs, so it's not that they don't care about price, but they'd much rather have solid bulletproof SLAs than the best price. Because it's total cost of ownership.
And that's really the heritage of where Microsoft comes from, because we have been serving our on-premises customers for so long. We understand what they want and need and require for a mission critical, enterprise ready deployment. And so with our relationship with Hortonworks, it's absolutely a 99.9% service level agreement that we will guarantee to our customers, across all of the Hadoop workloads, whether it be Hive, whether it be Spark, whether it be Kafka. Any of the workloads that we have on HDInsight is enterprise ready by virtue, mission critical, built in, all that stuff that you would expect.
Yeah, you guys certainly have a great track record in the enterprise, no debate about that, 100%. Back to you guys, I want to take a step back and look at some of the things we've been observing kicking off this week at Strata Hadoop, since it's our eighth year covering Hadoop World now. It's evolved into a whole huge thing with Big Data SV and Big Data NYC that we run as well. The bets that were made. And so I've been intrigued by HDInsight from day one, especially the relationship with Microsoft, which got our attention right away because of where we saw the dots connecting to, which is kind of where we are now. That's a good bet. So we were looking at what bets were made, and who's making which bets when, and how they're panning out. So I want to just connect the dots. The bets that you guys have made that are now paying off.
And certainly we've talked before on camera about Revolution Analytics, obviously now looking real good, middle of the fairway, as they say. Bets you guys have made where you say, hey, that was a good call.
Right. And I think that first and foremost, we at Hortonworks, to support machine learning, we don't call it AI, but we were probably the first one to put Spark in Hadoop, right? I know that Spark has gained a lot of traction, but I remember that in the early days, we were the ones, as a distro, going out there and not only just verbally talking about support of Spark, but truly putting it in our distribution as one of the components. Now in the last version, we actually also allow flexibility. You know Spark, how often they change. Every six weeks they have a new version. And that kind of runs into the paradox of what enterprise ready actually is. The enterprise, within six weeks, they can't even roll out an entire process, right? If they have a workload, they probably can't even get everyone to adopt that within six weeks. So what we did actually in the last version, which we'll continue to do, is to essentially support multiple versions of Spark, right? We essentially talk about that. And the other bet we have made is about Hive. We truly made that as a kind of initiative behind Project Stinger, the Stinger initiative, and also Tez, and now with LLAP. We made that effort of joining with all the other open source developers to go behind this project to make sure SQL is becoming truly available for our customers, right? Not only just affordable, but also having the most comprehensive coverage for SQL, ANSI 2011, and also now having that almost sub-second interactive query. So I think that's the kind of bet we made.
You've got the SQL compatibility, then you've got the performance advantages that are going on in these databases, whether it's in memory or SSD. That seems to be the action.
Oliver, you guys made some good bets. So let's go down the list.
Yeah, let's go down memory lane, and I always kind of want to go back to our partnership with Hortonworks. We partnered with Hortonworks really early on, in the early days of Hortonworks' existence. And the reason we made that bet was because of Hortonworks' strategy of being completely open, right? And so that was a key decision criterion for Microsoft, that we wanted to partner with someone whose entire philosophy was open source and committing everything back to the Apache ecosystem. And so that was a very strategic bet that we made.
It was bold at the time too.
It was very bold, yeah, at that time, yeah. Because Hortonworks at that time was a much smaller company than they are today, right? But we kind of understood where the ecosystem was going, and we wanted to partner with people who were committing code back into the ecosystem. And so that, I would argue, is definitely one really big bet that was a very successful one, and it continues to play out even today. Other bets that we made, like we talked about prior, include our acquisition of Revolution Analytics a couple of years ago. And that's playing out as well.
It just keeps on rolling, it keeps on rolling. Rolling, rolling, it's awesome. All right, final question: I just want to get an update on the data science experiences you guys have. Is there any update there, what's going on, what seem to be the data science tools that are accelerating fast? And in fact, some are saying that it looks like the software tools of years and years ago. So a lot more work to do. So what's happening with the data science experience?
Yeah, absolutely. And so just tying back to that original comment around Revolution Analytics, that has become Microsoft R Server. And we're offering that on-premises and in the cloud. So on-premises, it's completely integrated with SQL Server.
So all SQL Server customers will now be able to do in-database analytics with R built into the core database. And that we see as a major win for us and a differentiator in the marketplace. But in the cloud, and in conjunction with our partnership with Hortonworks, we're making Microsoft R Server available as part of our integration with Azure HDInsight. So kind of just tying back to all that integration that we talked about. And so that's built in, so any customer can take R and parallelize that across any number of Hadoop and Spark nodes in a managed service. Within minutes, clusters will spin up and they can just run all the data science models and train them across any number of Hadoop and Spark nodes. And so that is a-
That takes the heavy lifting away on the cluster management side so they can focus on their job.
Absolutely.
Awesome. Well guys, thanks for coming on. We really appreciate it. Wei with Hortonworks. And we have Oliver Chu from Microsoft. Great to get the update. And tomorrow at 10:30, the cloud first news hits. Cloud first, Hortonworks with Azure. Great news, congratulations. Good cloud play for Hortonworks. This is theCUBE, I'm John Furrier with George Gilbert. More coverage live in Silicon Valley after this short break.