Live from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners.

Hello everyone, welcome to theCUBE here in New York City. We're live for CUBE NYC. This is our big data show, now AI, now all things cloud. Nine years covering it, since the beginning of Hadoop. Now it's cloud and data, and data is the center of the value proposition. I'm John Furrier with Dave Vellante. Our special guest is Rob Bearden, CEO of Hortonworks. A CUBE alumni who has been on many times and a great supporter of theCUBE. Legend in open source, great to see you.

It's great to be here, thanks.

Yes, absolutely. So one of the things I wanted to talk to you about is that open source certainly has been a big part of the ethos. You're seeing it in all sectors, again growing. Even in blockchain, the open ethos is growing. The role of data is now certainly at the center. You guys have been on this vision of open data, if you will: data in motion, in flight, maybe at rest, all these things are going on. Certainly the Hadoop world has changed. It's not just Hadoop and data lakes anymore, it's data, all things data is happening. This is core to your business, and you guys have been banging this drum for a long time. The stock's at an all-time high, so congratulations on the business performance. So it's working, things are working for you guys.

I think the model and the strategy are really coming together nicely. And to your point, it's about all the data. It's about the entire life cycle of data, bringing all data under management through its entire life cycle, and being able to give the enterprise accessibility to that data across each tier: on-prem, private cloud, and across all the multi-clouds. And that has really changed, in many regards, the overall core architecture of Hadoop, how it needs to manage data, and how it needs to interact with other data sources.
And our model and strategy has been about not going above the Hadoop stack, but actually going out to the edge and bringing data under management from the point of origination, through its entire movement life cycle, until it comes to rest, and then having the ability to deploy and access that data across each tier and across a multi-cloud environment.

And it's a hybrid architecture world now. You guys have been on this trend for a while and now it's kind of getting lift. Obviously you're seeing the impact of cloud and the impact of AI, because the faster compute you have, the faster you can process data, and the faster data can be used for machine learning. It's a nice flywheel. So again, that flywheel is being recognized. So I have to ask you, what in your opinion has been the impact of cloud computing, specifically the Amazons and the Azures and now Google? Certainly AI is at the center of their power, but now hybrid cloud is validated, with Amazon announcing RDS on-premise on VMware. That's the first Amazon on-premise activity ever. So this is clearly a validation of hybrid cloud. How has the cloud impacted the data space, if you will? Because it used to be the data warehouse, and cloud has changed that. What's your opinion?

Well, what it's done is it's given an architectural extension to the enterprise of what their data architecture needs to be. And the real key is that it's not about hybrid or cloud or on-prem, it's about having a data strategy overall. How do I bring all of my different assets and bring a connected community together in real time? Because what the enterprise is trying to do is connect and have higher velocity and faster visibility between the enterprise, the product, their customer, and their supply chain.
And to do that, they need to be able to aggregate data into the best economic platform from the point of origination, maybe starting from a single component on their product, and to be able to bring all that data together through its life cycle, aggregate it, and then deploy it on the most economically feasible tier, whether that's on-prem, private cloud, or across multiple public clouds. Our platforms, HDF, HDP, and DataPlane, complete that hybrid data architecture. And by doing that, the real value is that the cloud, AI, and machine learning capabilities now have the ability to access all data across the enterprise, whether that be at a tier in the cloud or on-prem. And our strategy is around being that fabric to bring all the interconnectivity, irrespective of whether the data sits on the edge, in the cloud, or somewhere in between, because the more accessibility AI has to data, the faster the velocity of driving value back into that AI cycle.

Yeah, people don't want to move data if they don't have to. And we've been on this for a while, this idea that you want to bring the cloud model to your data, not always the data to the cloud. And so how do you do that? How do you make it this kind of same-same environment? What role does Hortonworks play in it?

The first thing we want to do is bring the data under management through its life cycle. That's where HDF goes to the edge, brings the data through its movement cycle, and aggregates the streams. HDP is the data-at-rest platform that can sit on-prem, in a public cloud, or in a private cloud. And then DataPlane is that fabric that ensures we have connectivity to all types of data across all tiers, and then serves as the common security and governance framework, irrespective of which tier that is. And that's very, very important.
And that then gives the AI platforms the ability to bring AI onto a broad array of data, where they can have a higher and better impact than just having an isolated AI impact on a single tier of data in the cloud.

Well, that message seems to be resonating. We talked earlier about the stock price, but also, I mean, I think Aneel Bhusri and Frank Slootman popularized the metric of number of seven-figure deals. You guys are closing some big deals. And remember, in the early days, Rob, people were like, how are these guys going to sell anything? It's all open source. And you're doing a lot of million-plus dollar deals. So it's resonating not only with the street, but also with enterprises. Your thoughts?

Last quarter, I think the key is that the industry, the investors, and the enterprises now really understand the importance of hybrid and hybrid cloud. And that it's not going to be all about managing data lakes on-prem. All the data is not going to cross this giant line of demarcation and now all reside in the cloud. It has to coexist across each tier. And our role is to be that aggregation point.

And you've seen the big cloud players now, at least the big three, all have on-prem strategies. Azure with Azure Stack. Google, we saw Kubernetes on-prem. And even AWS now, the last holdout, putting RDS on-prem, announced at VMworld. So they've all sort of recognized that not everything's going to go into the cloud. So that's got to be good confirmation for you guys.

It's great validation. What it also says to us, though, is that we must have a cloud-first architecture and a cloud-first approach with all of our tech. And the key to that, from our standpoint within our strategy, is to containerize everything. We had an announcement earlier this week. It was really a three-way announcement between us, Red Hat, and IBM.
And the essence of that announcement is that we've adopted the Kubernetes distro from Red Hat, so we are containerizing all of our platforms with the Red Hat Kubernetes distribution. What that does is give us the ability to optimize our platforms for OpenShift, the Red Hat PaaS, and then optimize the deployment of that in the IBM private cloud. And then naturally DataPlane will also give us the ability to extend those workloads, those very granular workloads, up into the public clouds. And we can even leverage their native object stores.

So that's an interesting love triangle, right? I mean, you and Red Hat are kind of birds of a feather with open source. IBM's always been a big proponent of open source, funded Linux in the early days. And then it brings just a massive channel and brand to that world.

Yes, and this is really going to accelerate our movement into a cloud-first architecture with pure containerization. And the reason that's so important is that it gives us the modularity to move those applications and those workloads across whichever tier is most appropriate architecturally for them to run and be deployed.

You know, we said this on theCUBE many, many years ago and it continues to be a theme. Enterprises really want hardened solutions, but they don't mind experimenting. And Stu Miniman and I were always talking about and comparing the OpenStack ecosystem to what's happened in the Hadoop ecosystem. There are some pockets of relevance, and it's a lot of work to build your own. OpenStack has a great solution for certain use cases, now mostly on the infrastructure side. But then cloud came in and changed the game, because you saw things like Kubernetes. I mean, we're here at the show that started with Hadoop, and now it's AI. The word Kubernetes is being talked about. You mentioned hybrid cloud. These aren't words that used to be spoken at an event like this. So the IT problem in multicloud has always been a storage issue.
So you do some storage work, you've got to store the data somewhere. But now when you talk about Kubernetes, you're talking about orchestration around workloads and the role of data in workloads. This is what enterprise IT actually cares about right now. This is not a small little thing. It's a big deal, because data is not only in the workloads, they're using instrumentation with containers, with service meshes around the corner. You're starting to see policy. These are hardcore B2B enterprise features.

What we're seeing is a massive transformational shift in how the IT architecture is going to look for the next 20 years, right? The IT world has been horribly constrained by these very highly configured, very procedural-based applications. And now they want to create high-velocity engagement between the enterprise, their product, their customer, and their supply chain. They were so constrained by these very procedural-based applications, and containerization now gives them the ability to create that velocity and to move those workloads and those interactions between those four pillars.

Now let's talk about the edge. Because the pendulum is clearly swinging back, with some decentralization going on. And the edge, to us, is a data play. We talk about it all the time. What are your thoughts on the edge? Where does Hortonworks fit? What's your vision of the data model and how that evolves?

You know, that goes back to our strategy and what we did. We had the great fortune, quite frankly, of having the ability to merge Onyara and Hortonworks back in 2015. And the whole goal of that, besides working with the great team Joe Witt had built, was being able to get to the edge. What we wanted to have the ability to do was to operate on every sensor, on every device at the edge for the customer, so that they can bring the data under management, whatever that may be, through its entire life cycle.
So from point of origination, through its movement, until it comes to rest. Because our belief is that if we can bring enough intelligence and faster insights as that data is being generated, as events or conditions are happening, moving, or changing, before it ever comes to rest, and we can process it and take prescriptive action leveraging AI and machine learning while it's in its life cycle, we can dramatically decrease the amount of data we have to bring to rest. We can just bring the provenance, the metadata, to rest and have that insight. And we try to get to these high-velocity, real-time insights starting with the data on the edge. That's why we think it's so important to manage the entire life cycle. But what's even more important is to then put that data onto whatever tier it may be. Maybe bring it back to rest in a data lake on-prem to aggregate with other like data structures, or maybe take it into cold storage on a native object store in a cloud that has the lowest cost of storage structure for a particular time.

Or take an action on the edge and leave it there.

Yeah.

You guys definitely think about the edge in a big way, that's pretty obvious. But what I want to get your thoughts on is an emerging area we're watching, and I'll call it, for lack of a better description, programmable data. You mentioned data architectures being set up for probably a 10-, 20-year run; enterprises set up their data architecture with the cloud architects. Making data programmable is kind of a DevOps concept, right? And this is something that you guys have thought about with DataPlane. What's your reaction to this notion of making data programmable? Because when you start talking about Kubernetes, you're going to have stateful applications and stateless applications. You have new dynamics. I call it API 2.0 happening. A whole new infrastructure is happening. Data has to be programmable. There's going to be policy around it.
The role of data is certainly changing, rather than just storing it somewhere. What's your view of programmable data, of making it programmable?

Well, to truly have programmable data, you can't have slices or windows of accessibility. You have to understand the lineage of that entire data and the context of that data through its entire life cycle. That's step and point number one. Point number two is that you have to have that containerized, so that you can take the module of data that you want to take prescriptive action against, or create an action against a condition, and be able to do that in granular bites or chunks. And then you've got to have accessibility to all of the other contextual data, whether that's in motion, at rest, or its contextual cousin, if you will, that sits up in an object store on another tier in a public cloud. And what's important is that you have to be able to control and understand the entire lineage of that. That's where our second step in this is DataPlane, and having the ability to have a full security model through that entire architectural chain, as well as the entire governance and lineage, leveraging Atlas through DataPlane. And that then gives you the ability to take these very prescriptive actions that are driven through AI and machine learning insights.

And it makes you very agile, love it. I mean, the ethos of open source and DevOps is literally being applied to everything. We're seeing it at the network layer, you're seeing it at the data layer. You're starting to see this concept of dev and ops being applied in a big way.

You know, in previous years we've talked about what we're trying to accomplish. When we started Hortonworks, it was about changing the data architecture for the next 20 years and how data was going to be managed.
And to your earlier point, when we opened up the show, that's had twists and turns. Hadoop's evolved, and the nature and velocity of data has evolved in the last five, six, seven, eight years. You know, it's about going to the edge, it's about leveraging the cloud. And we're very excited about where we're positioned as this massive transformation is happening. What we're seeing is that the iteration of change is happening at an incredibly fast pace, much more so than it was two or three years ago.

Yeah, the clock speed is definitely up there. Data is working, people are putting it to work. Hortonworks. They're able to get more value faster because of it.

Yeah, it's great. The data economy is here and now, and the enterprise understands it. And so they want to now move aggressively to change and transform their business model to take advantage of what their data is giving them the ability to do.

Yeah, that's great. They always want the value, they want it fast, and if anything gets in the way, they'll remove the blockers, as we say. All right, this is theCUBE here. Rob Bearden, CEO of Hortonworks, giving us a vision, but also an update on the company. Data at the center of the value proposition. This is about AI, it's about big data, it's about the cloud. This is theCUBE bringing you the CUBE data here in New York City. CUBE NYC, that's the hashtag. Check us out on Twitter. Stay with us for live coverage all day today and tomorrow here in New York City. We'll be right back after this short break.