Live from New York, it's theCUBE. Covering Big Data New York City 2016. Brought to you by headline sponsors, Cisco, IBM, NVIDIA, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and Peter Burris. We're back in the Big Apple. This is theCUBE, the worldwide leader in live tech coverage. We're here at Big Data NYC. Big Data Week is part of Strata plus Hadoop World. Shaun Connolly is here as the Vice President of Strategy at Hortonworks, longtime friend and CUBE alum. Great to see you again. Thanks for having me. We're back at the same venue as last year, always a pleasure. Yeah, it's good. We're growing, I guess the event's growing. We haven't been over there yet, but some of our guys have, so what's it like over there? It feels the same. Some of the use cases are different; I think last year was streaming, and we're hearing more machine learning and things like that as far as use cases. So, similar vibe. Yeah, so things are evolving, right? How's Hortonworks evolving? So, we continue to report our quarterly earnings as the only publicly traded company in this space. Things from a business perspective are doing well. Our connected data platforms strategy, which we unveiled at the beginning of this year, is really around data in motion and data at rest and enabling these new-gen transformational applications, and it continues to play out. The data in motion piece is sort of decoupled and unrelated to a Hadoop platform. It's really about acquiring and handling the FedEx-for-data-delivery type notions: data logistics, secure transmission. That's based on the Apache NiFi tech that was originally built at the NSA over the past eight years. So, really a nice robust piece of technology that we've pushed out to the edge in our latest release. So you can really skinny these down, and it does secure site-to-site transmission, a lot of sophisticated capabilities there. 
So, we're seeing a lot of uptake in that sort of architectural vision, the products are maturing, both on-prem and in the cloud. Things are pretty exciting. Well, this cloud thing seems pretty real. You get a lot of traction, right? Everybody kind of knew it was coming, but what are you seeing? Yeah, so I guess I started the journey back in 2009 when I was at SpringSource and Paul Maritz was CEO of VMware. And that was sort of pre-cloud at that time. We were talking about this notion of platform as a service and things like that. And that resonated really well with folks back then, but their main ask was how do you solve the data problem? How do you actually get the data to the apps that need it? Fast forward to 2016, I think there's been a lot of open source innovation, a lot of commercial innovation, the rise of cloud for providing a fast path to value, booting up these use cases. It's a fascinating transition to watch. Many of our customers are what people call hybrid. What that means to me is they'll have data center workloads or multi-data-center workloads. They also have cloud workloads, sometimes even multi-cloud workloads. And that inherent nature of the beast is why I use the term connected data architecture: you need an architecture that is inherently built to span that fact. And that's just increasing. That's just the world we live in today. But the fact is there are speed-of-light issues, there are data fidelity issues, there are other types of things. How are you starting to see those practical and very physical realities start to impact the whole concept of design as it pertains to data, as it pertains to analytics, as it pertains to the infrastructure associated with the two? Yep, so at Hadoop Summit that we had last June, there were some really good sessions where folks like Comcast, Ford, and Schlumberger talked about this connected data architecture reality, right? 
I like to use the connected car ecosystem as a good example, because there were insurance providers and others that were sort of speaking on behalf of that. You have the cars and other data that's inherently born out there. And there's a slug of use cases that are around edge analytics, streaming analytics, time series analytics. We're seeing that, and I think the cloud lends itself really well to those types of use cases. But we also see manufacturing line data for the cars, where you want to get a 360-degree view of operational issues and dovetail that with manufacturing line elements. And that's inherently what we've seen your classic on-prem "data lake," in quotes, being used for. So you can get that 360-degree operational intelligence type analytics to come out of that, right? So that type of use case, whether you apply it to oil and gas and having the sensors on the oil rigs in the Schlumberger example, that pattern is repeating itself across different industries. British Gas in Europe talks about how they're fundamentally changing the nature of their relationship with their customer because of the smart meters and their connectivity in the homes, and they can deliver better value there. So that's inherently a connected data realm. There are cloud use cases and in-the-data-center use cases. And so I see these use cases, there will be use-case-specific applications that are sprinkled across that fabric, if you will, right? And that's really what we're seeing. At our panel last year here in this venue, we were talking about a lot of things. One was the market, the sort of ebbs and flows you just mentioned. You guys are the only public player; Talend's joining that crew. Excellent. You've seen some. We need more. We need more. We've seen some M&A, Platfora taken out. I don't know the specifics of that deal. It might have been an acqui-hire, it might not, I don't know. And Datameer did a raise. 
So you're seeing these rip currents in all directions. What are you seeing in the marketplace? A lot of funding early on, a lot of players, a lot of innovation. And now it's like, okay, the music at some point is going to stop. What's your take? Yeah, so in our last call, and I think we repeated it on prior earnings calls, our focus, which we put out there, and in our Q3 earnings we'll sort of reiterate where we stand, is that we basically said Q4 is when we look to go adjusted EBITDA breakeven, and then 2017 we'll go from there. We reiterated that guidance. We had a little over 62 million in billings for the quarter. So the business is pretty robust and growing. We're only five years into this. I mean, we're just five years old. So it's a very fast pace of billings growth. That's almost a 250 million annual run rate exiting that quarter. So we see a lot of the use cases really continuing to move on. I think what our customers tell us is they're on a digital transformation journey and they want the industry to start talking about those types of business value drivers. So I think we should expect to see a transition from the piece parts, the animals in the zoo, and what's the right open source piece of technology, to more of why should you care as a business? How is this transforming what you do? How does this open up new lines of business? We started seeing that at Hadoop Summit, when I think about two dozen customers were sharing very rich stories. So that's where things are. But I think in running a company you have to run it with a certain sense of rigor, and that was one of the reasons why we chose to go public, right? So, by the way, we totally agree that customers want to stop talking about digital business in platitudes and start actually identifying specifically what is it about it that's new and different and find ways of doing it. Sure. 
Coming back to the issue, however, of how you go about making some of those transformations relevant, there is clearly a knowledge gap about what digital business is and what it isn't, certainly. But there's also a fair amount of skills that have yet to be developed that are required for a lot of the use cases that companies are pursuing, not just in terms of implementing the technology appropriately but actually constructing and conceptualizing the use cases. Sure. So that suggests that there are two paths forward. There's a path forward where we can do a better job of diffusing knowledge through people, and there's a path forward where we can do a better job of building software that's easier to use. And there's both. How do you see this playing out over the course of the next few years? Yeah, and I think in any new area as technology is emerging, one of the examples I use is the Apache Software Foundation. Literally every other week there's a new data-related Apache project that lands. So it could be really confusing, but it's exhilarating in that I participate in that and I try to figure out which ones we can harness in a consumable platform, whether it's on-prem or cloud or what have you. What use cases can it light up? So I think you have both of those vectors, and it really depends on, I like to use the classic software adoption curve. You have a lot of the left-side-of-the-chasm folks, where a lot of this new stuff is going to have sharper edges, and they're always going to be trailblazers, right? But we are also seeing that some of these advanced analytics, some of these new solutions, are automating the pipeline. So you can actually let the infrastructure and these engines do more of the thinking for you so you get your models output. Even to the point where you run multi-model simulations in parallel and out pops the best fit. That's where things will head, right? 
I think it's just a matter of the technology maturing, making sure we address things like security, metadata management, governance and those -ilities that the enterprise expects, and then really forcing ourselves to simplify and automate as much as possible, right? And that was one of the reasons, on that last one, why in October 2011 we basically chose Teradata and Microsoft as key partners. Teradata because in 2011, clearly, right? They're, you know, they're Teradata, right? Microsoft because it simplifies technologies and brings them to billions of users, right? And so we need to do both. You need to harden it for the most rigorous, large enterprises, but you need to simplify it for the meat-of-the-market adopters, right? The early majority and late majority. You have to do both. You're sitting across from a CEO and you have to say, these are the three things you need to do to enact this digital transformation. What are the three things you're telling them? So, you know, I think they need as a business to identify how they want to leverage data as capital and what pockets of value they want to go chase. Number one. Number two, how is their business being impacted by the fact that you have the rise of IoT and an inherently, increasingly connected society and infrastructure? How is that impacting them? And number three is how do they evolve what they're used to doing, right? The new practices. Exactly. Because that's really, in many respects, I like to say there's a difference between invention and innovation. Invention is the engineering act. Innovation is a social act. It's adopting those new practices that actually allow you to enact the invention and generate value. Exactly. Now, in our space, I think we have a very compelling, you know, renovate value prop, which is a cost savings where you can drive costs out. But the innovate use cases are the ones that count; if all you're going to do is renovate, then you will fail. You will stall, right? 
Because it's not about cost savings. It's about how do you actually transform your business. And in the case of the British Gas example, I use that because how they engage that end consumer is fundamentally changing, right? And so that's the question I put back in those conversations: how do you want to evolve your business and how do you leverage data as capital? Because the beauty of data as capital is you can actually generate multiple lines of interest off of a single data set, because you can derive different insights off of that. So it's not like a dollar, right? It's not a single compound; it's a multiple compound annual interest rate on that. But they have to chase the right use cases. Although we've also learned from great design that if you do the right thing better, you get rid of a lot of waste. And so coming back to your point, doing the right thing better often leads to cost savings. Yes, exactly. One inherently can drive the other. But if you're just driving it, then you're not going to transform your business. You're just going to continue to do the same or the wrong things worse. Exactly. Or the wrong things cheaper. And that's, you know, that's difficult for enterprises because there's a certain way they do data management inside, inherently in a highly structured manner. But I do think the rise of IoT, I don't see it as a market. I see it as infinite slices of prosciutto, right? It's a very thinly sliced set of market opportunities, right? But it's forcing people to think about different use cases and how that might impact their business. We see this set of capabilities, which leads to the prosciutto. Exactly. And you come up with a really nice sandwich. That's my Italian. Let's keep going. Good, hey, I'm loving it. I'm getting a little hungry. You have always made a big deal out of your partnerships, not being Barney deals, but being deep integration relationships. So you mentioned two here, Teradata and Microsoft. 
As the cloud becomes more prevalent, as things evolve and machine learning becomes the hot buzzword, et cetera, how have you evolved those relationships, specifically in terms of the integration work that you've done? Have you kept up that engineering ethos? And that was the thing, like with Microsoft, we could spend a lot of sweat equity on the Azure HDInsight service, but if you look at that ecosystem, they have Azure Machine Learning, right? They have a whole raft of services, right, that you can apply to the data when it's in the cloud, right? So how that piece integrates with the broader ecosystem of services is a lot of engineering work as well. I've always said there's work to be done in our green box, but the other half of the work is how it plumbs into the rest. And so if you look at the AWS ecosystem, how do you optimize for S3 as the storage tier and ephemeral workloads, where HDFS is maybe a caching mechanism, but it's not your primary storage, right? It brings up really interesting integration modes and how you actually bring your value out into really interesting use cases, right? So I think it's opened up a lot of areas where we can drive a lot more integration, drive the open source tech in a way that's relevant for those use cases. All right, we gotta go, but Summit in Tokyo, is it next month? Yes, end of October. It's our first time, so primarily summits have been US and Europe. We had Melbourne, end of August, and we have Tokyo, end of October. They're bringing the right-hander out of retirement, so I'll be on stage in Tokyo. I've usually been behind the scenes on the last one. Throwing the slurve. Yeah, exactly, so I'm looking forward to it. It'll be exciting. All right, good, and then '17. You're going to start again in the spring. You're in Munich, right? We were in Dublin last year. You're moving to Munich this year. Hopefully theCUBE will be back in Munich, all right? We love you guys. Let's make it happen. You guys do a good job. 
Good stuff in Europe, so thanks again for coming with us. Thanks for having me. Always a pleasure. All right, keep it right there. We'll be back right after this short break. This is theCUBE, we're live from New York City.