Live from San Jose, California, in the heart of Silicon Valley, it's theCUBE. Covering Hadoop Summit 2016, brought to you by Hortonworks. Now, here are your hosts, John Furrier and George Gilbert.

Welcome back everyone, we are live here inside theCUBE at Hadoop Summit 2016, live in San Jose in Silicon Valley for SiliconANGLE Media's theCUBE, our flagship program, where we go out to the events and extract the signal from the noise. I'm John Furrier, here with my co-host, George Gilbert, Big Data Analyst at Wikibon. Our next guest is Sean Connolly, Vice President of Corporate Strategy at Hortonworks, always great to have on theCUBE and get the 3D chessboard laid out. Welcome back.

How's it going, John and George?

What does the 3D chessboard look like now? Because the ecosystem is flourishing, still trying to get the simplification, still trying to get the lower total cost of ownership, but with Docker containers going crazy, we just had DockerCon. Every single major vendor has pretty much stood up their path. It's a partnering world, it's a developer ecosystem, it's enterprise simplicity, with interoperability, with data everywhere, and IoT right around the corner. I mean, okay, bottom-line me. The dimensions. How are we doing?

Historical and real time around data, right? Then you have the containerization stuff, which makes it easy to deploy, and that's another vector. Then you have basically the prevalence of cloud and data center deployments, which adds another twist into it. So I think that this technology space is really moving in very interesting ways. I think the ultimate goal will be how do you get these modern data applications that can help drive new insurance models or what have you? How do you get them deployed quickly and easily? And I think the room, particularly in his keynote, covered a little bit of that.

Take a minute to summarize that because I want you to explain that to the audience because again, it is kind of complicated.
There is pressure for the enterprises to deploy faster while they're re-architecting and re-platforming the enterprise. There is a need for modern apps and some sequencing that can be done. Obviously, every use case is different, but take us through specifically what was talked about on stage around this modern app. Share an example if you can.

Yeah, so we refer to this sort of re-architecture and re-platforming as a connected data architecture that basically embraces the fact that you have data moving in real time, everywhere from the edge, from connected cars, connected cities or what have you, to traditional data that might be on-prem or collected even from manufacturing line data, right? And so being able to get connected car data in the here and now, the operational stuff, and being able to analyze that with manufacturing line data so you could root-cause issues that might be on the road today from that operational data, that is a hyper-connected world. That's data born in the cloud, data born in the data center, and how do you actually enable these new apps that need access to that data when they need it, where they need it, right? And so I think that was the theme, if you will, of the keynotes this morning and the conference. And you have these really cool technologies that are related, things like Docker and others. I refer to Docker as laying the new light rail, if you will, for these new applications to be able to run wherever they need to run in a very lightweight way. And so that's really redefining sort of the substrate, if you will, that is going on.

What's the impact of Docker and the container trend, which obviously is massive and exploding, on the Hadoop data ecosystem? Because that's traditionally been a great way for developers at the front end, application developers, to put stuff in a container and let the infrastructure-as-code DevOps do its piece. Has it simplified your world or has it made it more complex?
So back in 2009, I had conversations with the Fidelities of the world, et cetera, as platform-as-a-service technologies and cloud were just starting to emerge. And with the application-centric approach to things, one of the things they said back then was you also need to solve the data problem, right? And so you need to get the application logic to where it needs to be, and if the data's in the cloud or on-prem, you need it to be portable, right? So I think they go hand in glove in an architecture. Increasingly, applications are data-driven applications. And so that agility, that DevOps agility of getting that logic where it needs to be, I think is a perfect complement to what we're doing in the big data ecosystem.

So every 10 years or so, we see the pendulum swing and a different architecture emerges. But a platform seems to take shape when a bunch of disjointed capabilities are brought together into a coherent API and, I guess, an administrative model. How is that taking shape? We hear of all the different projects, but help us paint a picture of how all the pieces fit together.

Sure, so I think there's a couple of ways platform vendors can approach things. One is to sort of centralize things and come out around the one place that you need to do all your work, right? When we started Hortonworks, we actually had a "many" philosophy because in the open source realm, if you look at it, the age of data is being driven mostly from the Apache Software Foundation. And each week or each month, there's a new data-related technology that's being established in there. So what you need to do is set an architectural substrate that is built for onboarding new innovation when it arrives, right?
And so to your point, yes, you need centralized operations, security and governance, but you also need a platform that's built for receiving innovation in a consistent way, in an agile way, and in a way where that innovation can run on cloud or on-prem. And so that's not going to stop. It's an exponentially increasing curve of innovation.

So I want to just kind of shift real quick, just to interject on George's question. That's great advice. I know you guys have done that. That's one of the things that you have done in your career, other places, now Hortonworks, where you look at and monitor digitally because they're all connected: developers, IRC back channel to Slack, now all these community forums. It's the GitHub generation, as I refer to all the developers. So how are you guys doing? Share your experience on how Hortonworks has been building that substrate. How have you been successful and how would that translate to potential customers to take that best practice?

Yeah, so I always like to answer that question by basically going to the morning keynotes. Listen to Progressive Insurance, listen to Capital One on their cybersecurity story. It's making use of the emerging technologies around NiFi and Storm and Kafka, meets Hadoop and Spark, in an integrated architecture for cyber threat analytics, right? These stories at this event are really the true telling of being able to deploy a flexible architecture that enables them to onboard the innovation and unlock new business models.

Just to try to pin you down on this for a second. So the substrate is what?

Core technologies that the open source community has built on. Productizing it in a way that enterprises can actually depend on it, to George's earlier point, right? Where it doesn't have the operational, consistent operational experience, consistent security experience, et cetera. So you need to appeal to both sides.
I wanted to ask Rob this but didn't have time for it when I sat down for my one-on-one exclusive, which is on CrowdChat; go to YouTube and search Rob Bearden. You'll see I had a one-on-one sit-down, 20, almost 50 minutes of interview. I'll ask you the same question. What's the Hortonworks growth strategy? Obviously, and in context to that, there's your competitors like Cloudera that are private.

Sure.

Worth billions of dollars. Your market cap today is like half a, just over half a billion. So significantly undervalued vis-a-vis the privates, and you have a softening of the market right now with privates doing down rounds. So you guys have been public and have been like, hey, we're going to put it out there. You haven't really pivoted anywhere. So you're out in the open.

Our business is as transparent as our code, as I like to say.

Which is a good answer, by the way. You're out there. However, what's the dynamic in the ecosystem? Because there are a lot of people bleeding right now that are private companies. We've seen some down rounds. We've seen some pivots. What does that mean? Is it just symptomatic of the capital markets? Does it point to the ecosystem? And how does your growth strategy get through all this mess?

And Rob will answer it in a different way. I have a more direct, blunt way of phrasing it: I never get hung up on private valuations, unicorn nonsense, right? That's a false factor in valuing the impact that customers are getting from technology. What really matters is what are the stories and how are those companies using the technology to transform their business? John Deere transforming from a tractor company to an analytics company. Analytics down to the planted seed level type thinking. So I think that's the more interesting thing. So from a business perspective, I focus the business one. We're in the-

All right, so just to get to the next level, I get that.
How the customers use it, that's a reflection of the company. But there is a psychology of companies that have employees that are under pressure to create revenue. So if I have a high valuation, the pressure from either the management team and the board of directors is to go public, get some liquidity, or raise another round. So there are potentially pressure situations that could impact the customer. Are you seeing that or is that not an issue?

Not in our space. We have a renovate cost savings vector and we have an innovation unlock new innovations use cases. So the reality is the technology space that we're in is helping most of the companies bend the curve in troubling times.

So you're on the innovation curve.

And in a way where it helps them address both the cost side and the top line revenue side. So we have-

But the cost is a consequence of the innovation.

Exactly.

And you're doubling your-

And the model, right? The subscription model, the harnessing innovation to bring it in, integrate with existing assets, those types of things. So from a subscription perspective, I think the other evolution, which you're going to hear, like we had Microsoft demoing a Hadoop Spark experience in the cloud is, you know, we transition from every year's an election year in a subscription business to every hour is an election hour in the cloud business, right? So it's about enabling customer success.

You mean the customer's ability to buy more basically?

Yeah.

Vote with their checkbook.

Right, and that's the ultimate success metric, right? Is are they deploying more? Are they growing? Are they solving real problems? Because if they're not, then the next election hour is they turn it off, right? And so that excites me from a business perspective.

That's a new dynamic in open source.

Yep.

In a way.

Yep, yep.

Got it, George.

So we have this, we talked about the emerging platform, you know, combines together a bunch of components.
Usually then there's an abstraction, for lack of a better term, where developers can see this platform as a whole. But we're also seeing the emergence of the cloud native services, whether it's, you know, Amazon Kinesis Firehose, DynamoDB, Redshift.

Azure HDInsight, right?

Yep. HDInsight, but they don't have to be used in isolation. They can mix and match other native services in there. And we talked about that.

They can mix and match other native services, but they can also, in a connected data architecture, integrate with your on-premises solutions as well.

Right.

Because that's the reality of many large companies: their data is inherently hybrid.

All right. Okay.

And so they need to connect that architecture. Embrace the emerging cloud native services, but bring along their existing investments that may be on-prem as well.

So without naming competitors, let's say customers, maybe because of the high overhead cost of admins with specialized skills, as, you know, companies get smaller, the customers get smaller, they want to do more in the cloud. Where are some of the other distro vendors in harmonizing the experience for the developer and the admin on-prem and on the cloud?

So I tend to not follow the competitive landscape as much. What the cloud offers me as a software platform vendor is that I'm able to create very prescriptive environments for data science, with full tooling and an environment for doing that work in the cloud. If you're an ETL-focused person, you can do your curation and your data munging in a very prescriptive way. On-prem, you're able to run those mixed workloads in a shared data lake. In the cloud, you actually can spin up very focused, targeted use cases, right? And so from creating technology that solves very specific problems, the cloud really offers me a great way to connect with a very targeted set of users, right?
So there may be a billion users in total using this platform, from BI to data scientists or what have you, but you can actually create a much more prescriptive experience in the cloud and boot it up instantaneously. That's really exciting because that basically means the path to productivity is instantaneous, right? There's no hardware expenditure up front. You just swipe your credit card and you go. That's exciting.

There's still work to do on your part in terms of, if you have a scenario for the data scientist or the ETL developer or whatever, you have to make that experience so it hides everything behind it.

Seamless, prescriptive, minimal knobs and dials, right? It's great that you have a hundred ways to configure it, but for my use case, what's the prescriptive way so I can just go do my job, right?

Prescriptive means a combination of on-prem and in the cloud, or one or the other?

Yes, typically I think in the cloud you can actually be far more prescriptive because it's more instantaneous, right?

All right, Sean, thanks for coming on, sharing your insights on theCUBE. Appreciate it. A quick final word. The difference between this event and Dublin, obviously the two conferences. You know, by the way, we have our new CrowdPages, crowdpages.co slash Hadoop Summit, which is what we call Cube 365. All the videos will be there, we're engaging the site. The vibe between the two event windows, obviously Dublin and here, great innovation show, a lot of education, theme change, extension, what's the difference?

So I think it's an evolution of where the market is. I think the European audience focuses a little bit more on the pragmatic use cases and adoption, very similar use cases like British Gas with connected homes and smart meters and things like that, but you'll hear more technology in sort of the next wave at a San Jose event. Just a big tent event.

Okay, Sean Connolly, Vice President of Corporate Strategy at Hortonworks, here inside theCUBE at Hadoop Summit.
This is the premier show for the open source big data community around Hadoop and all the different communities. I'm John Furrier with George Gilbert, you're watching theCUBE. We'll be right back.