Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017. Brought to you by Hortonworks. Welcome back to theCUBE. We are live on day two of the DataWorks Summit. I'm Lisa Martin, hashtag DWS17. Join the conversation. We've had a great day and a half. We have learned from a ton of great influencers and leaders about really what's going on with big data, data science, how things are changing. My co-host is George Gilbert, and we're joined by my old buddy, the COO of Hortonworks, Raj Verma. Raj, it's great to have you on theCUBE. It's great to be here, Lisa. Great to see you as well, it's been a while. It has. So yesterday on the customer panel, the Raj I know had a great conversation with customers; Duke Energy was one. You also had, what, Black Knight on the financial services side. And HCSC. Yes, on the insurance side. And a couple of things really caught my attention. One was when Duke said, you know, kind of where they were using data and moving to Hadoop. But they're now a digital company. They're now a technology company that sells electricity and products, which I thought was fantastic. Another thing that I found really interesting was that they all said leveraging big data, gleaning insights, and monetizing them really requires a cultural shift. So I know you love customer interactions. Talk to us about what you're seeing. Those are three great industry examples. What are you seeing? Where are customers on this sort of maturity model where big data and Hadoop are concerned? Sure, happy to. So, you are right. The one thing that I enjoy the most about my job is meeting customers and talking to them about the art of the possible and some of the stuff that they're doing, which was only science fiction really about two or three years ago.
And, you know, there are a couple of questions that you've just asked me as to where they are on their journey, what are they trying to accomplish, et cetera. I remember about, was it five, seven, 10 years ago when Marc Andreessen said that software is eating the world. And to be honest with you now, it's more like every company is a data company, you know? And I wouldn't say data is eating the world, but without effective monetization of your data assets, you can't be a force to reckon with as a company. So, that is a common theme that we are seeing, irrespective of industry, irrespective of customer, irrespective of really the size of the customer. The only thing that sort of varies is the amount and complexity of data from one company to the other. Now, I'm new to Hortonworks; as you know, it's really my fifth month here. But one of the things that I've seen, and Lisa, as you know, I came from TIBCO, so we've been dealing with data. I have been involved with data for over a decade and a half now, right? So, the difference was, 15 years ago, we were dealing with really structured data, and we actually connected the structured data and gleaned insights from structured data. Now today, a seminal challenge that every CIO or chief data officer is trying to solve is how do you get actionable insights into semi-structured and unstructured data? Now, so, getting insights into that data first requires an ability to aggregate data. Once you've aggregated data, you also need a platform to make sense of data in real time that is being streamed at you. Now, once you do those two things, then you put yourself in a position to analyze that data. So in that journey, as you asked, where our customers are, some are defining their data aggregation strategy; others, having defined data aggregation, are talking about streaming analytics as a platform; and then the others are talking about data science and machine learning and deep learning as a journey.
Now, you saw the customer panel yesterday, but the one point I'd like to make is that it's not only the Duke Energys and the Black Knights of the world or the HCSCs, which I believe are big, large firms, that are using data. Even an old agricultural company, I shouldn't say old, but steeped in heritage is probably the right word, a 96, 97-year-old agricultural company that's in the animal feed business, a multi-billion dollar animal feed business, uses data to monetize its business model. Now, what they say is they've been feeding animals for the last 70 years, so now they go to a farmer and they have enough data about how to feed animals that they can actually tell the farmer that this hog that you have right now, which is 17 pounds, I can guarantee you that I will have him or her on a nutrition plan such that by four months, it'll be 35 pounds. How much are you willing to pay? So even in the animal feed business, data is being used to drive not only insights, but monetization models. That's outstanding. Thank you. So in getting to that level of sophistication, it's not like every firm sort of has the skills and technology in place to do that. What are some of the steps that you find that they typically have to go through to get to that level of maturity? Like where do they make mistakes? Where do they find the skills to manage on-prem infrastructure if it is on-prem? What about if they're trying to do a hybrid cloud setup? I think that's where the power of the community comes through at multiple levels. So we are committed to the open source movement. We are committed to the community-based development of data. Now this community-based business model does a few things. Firstly, it keeps the innovation at the leading edge, number one. But as you heard the panel talk about yesterday, one of the biggest benefits that our customers see of using open source is, sure, economics, it's good, but that's not the leading reason.
Keeping up with innovation, very high up there. Avoiding vendor lock-in, again, very, very high up there. But one of the biggest reasons that CIOs give me for choosing open source as a business model is more to do with the fact that they can attract good talent. And without open source, you can't actually attract talent. And I can relate to that because I have a sophomore at home. And it just occurred to me that she's 15 now, but she's been using open source since she was 11. You know, on the iPhone she downloads an application for free and she uses it. And if she stretches the limit of that, then she orders something more in a paid model. So the community helps people do a few things. The first is to be able to fail fast if they need to. The second is it lowers the barriers to entry, right? Because it's really free. You know, you can have the same model. The third is you can rely on the community for support and methodologies and best practices and lessons learned from implementations. The fourth is it's a great hiring ground in terms of bringing people in and attracting millennial talent, young talent, and sought-after talent. So that's really probably the answer that I would have for that. When you talk about the business model, the open source business model and the attraction on the customer side, it sounded like there's an analogy with sort of the agribusiness customer in the sense that they're offering data along with their traditional product. If your traditional product is open source data management, what Arun started telling us this morning was the machine learning that goes along with operating not only your own sort of internal workloads, but customers', and being able to offer prescriptive advice on, you know, operations, essentially IT operations. Is that the core, will that become the core of sort of value add through data for an open source business model like yours? I don't want to be speculative, but I'll probably answer it another way, right?
I think our vision was set by our founder, Rob Bearden, and he took you guys through that yesterday. Way back when, we did say that our mission in life is to manage the world's data, right? So that mission hasn't changed. And the second was we would do it as an open source community or as a big contributing part of that community, and that has really not changed. Now, we feel that machine learning and data science and deep learning are areas that we are very, very excited about. Our customers are very, very excited about it. Now, the one thing, and we did cover that yesterday and I think earlier today as well: I'm a computer science engineer, and when I was in college, way back when, 25 years ago, I was interested in AI and ML, right? And it has existed for 50 years. There are two reasons why it hasn't been available to the common man, so to speak. One is that it did not have a source of data to sit on top of that would make machine learning and AI effective, or at least not a commercially viable one. Now there is one. The second is the compute power required to run some of the large algorithms that really give you insights into machine learning and AI. So, we become the platform on which customers can take advantage of excellent machine learning and AI tools to get insights. Now, those are two independent sort of categories. One is the open source community providing the platform, and then what tools the customers use to apply data science and machine learning, so yeah. So, all right, I'm thinking something that's slightly different and maybe the nuance is making it tough to articulate, but it's how can Hortonworks take the data platform and data science tools that you use to help understand how to operate Hortonworks, whether it's on a customer prem or in the cloud. In other words, how can you use machine learning to make it sort of a more effective and automated managed service? Yeah, and I think that the nuance is not lost on me.
I think what I'm trying to sort of categorize is, for that to happen, you require two things. One is a data aggregator across on-prem and cloud, because when you have data which is multi-tenant, you have a lot of issues of data security, data governance, all the rest of it. Now, that is what we plan to manage for the world, so to speak. Now, on top of that, for customers who need to use data science or deep learning, we provide that platform. Now, whether that is used as a service by the customer, which we'll be happy to provide, or it is used in-house, on-prem, on various cloud models, that's more a customer decision. We don't want to force that decision. However, from an art of the possible perspective, yes, it's possible. I love the mission to manage the world's data. That's a lofty goal, but yesterday's announcements with IBM were pretty transformative. In your opinion, as Chief Operating Officer, how do you see this extension of this technology and strategic partnership helping Hortonworks on the next level of managing the world's data? Absolutely, it's game-changing for us. We are very, very excited. Our colleagues are very, very, very excited about the opportunity to partner. It's also a big validation of the fact that we now have a pretty large open-source community that contributes to this cause. So we are very excited about that. The opportunities, naturally, are partnering with a leader in data science, machine learning, and AI, a company that is steeped in heritage and is known for game-changing next technology moves. And the fact that we are powering it from a data perspective is something that we are very, very excited and pleased about. The opportunities are limitless. I love that. And I know you are a game-changer. In your fifth month, we thank you so much for joining us. It was great to see you. Continued success at managing the world's data and being that game-changer yourself for Hortonworks.
Thank you, Lisa. Good to see you. You've been watching theCUBE. Again, we're live. Day two of the DataWorks Summit, hashtag DWS17. For my co-host, George Gilbert, I'm Lisa Martin. Stick around, guys. We'll be right back with more great content.