Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2017, brought to you by Hortonworks.

Welcome back to theCUBE. I'm Lisa Martin with my co-host, George Gilbert. We are live on day one of the DataWorks Summit in San Jose in the heart of Silicon Valley. Great buzz at the event, as I'm sure you can see here behind us. We're very excited to be joined by a couple of fellows from IBM, a very longstanding Hortonworks partner that announced a phenomenal suite of four new levels of the partnership today. Please welcome Asad Mahmood, analytics cloud solution specialist at IBM and a medical doctor, and Linton Ward, Distinguished Engineer, Power Systems OpenPOWER solutions, from IBM. Welcome, guys. Great to have you both on theCUBE for the first time. So, Linton, software has been changing; companies, enterprises all around are really looking for more open solutions, really moving away from proprietary. Talk to us about the OpenPOWER Foundation before we get into the announcements today. What was the genesis of that?

Okay, sure. We recognized the need for innovation beyond a single chip, to build out an ecosystem, an innovation collaboration with our system partners. So, ranging from Google, to Mellanox for networking, to Hortonworks for software, we believe that that system-level optimization and innovation is what's going to bring the price-performance advantage in the future. Traditional CMOS scaling doesn't really bring us there by itself, but that partnership does.

So, from today's announcements, among a number of announcements, Hortonworks is adopting IBM's data science platform. Really, the theme this morning of the keynote was data science, right? It's the next leg in really transforming an enterprise to be very much data-driven and digitized. We also saw the announcement about Atlas for data governance. What does that mean from your perspective on the engineering side?

Very exciting.
In terms of building out solutions of hardware and software, the ability to really harden the Hortonworks Data Platform with servers and storage and networking is, I think, going to bring simplification to on-premises like people are seeing with the cloud. I think the ability to create the analyst workbench, or the cognitive workbench, using the Data Science Experience to create a pipeline of data flow and analytic flow is going to be very strong for innovation. Around that, most notable for me is the fact that they're all built on open technologies, leveraging communities that universities can pick up and contribute to. I think we're going to see the pace of innovation really pick up.

And on that front, on pace of innovation, you talked about universities. One of the things I thought was really a great highlight in the customer panel this morning that Raj Verma hosted: you had healthcare, insurance companies, financial services; there was Duke Energy there. And they all talked about one of the great benefits of open source being that kids in universities have access to the software for free. So from a talent-attraction perspective, they're really kind of fostering that next generation who will be able to take this to the next level, which I think is a really important point as we look at data science being kind of the next big driver, transformer. And given there's not a lot of really skilled data scientists, how can that change over time? The open source community is one that Hortonworks has been very dedicated to since the beginning, and this is really a great outcome of that.

Definitely. I think the ability to take the risk out of a new analytic project is one benefit, and the other benefit is that there's a tremendous amount of interest, not just from young people but among programmers and developers of all types, to build data engineering and data science skills.
If we leave aside the skills for a moment and focus on the operationalization of the models once they're built, how should we think about a trained model? Or, I should break it into two pieces: how should we think about training the models, where the data comes from and who does it, and then the orchestration and deployment of them: cloud, edge gateway, edge device, that sort of thing?

I think it all comes down to exactly what your use case is. You have to identify what use case you're trying to tackle, whether that's applicable to clinical medicine, or whether that's applicable to finance, to banking, to retail, to transportation. First you have to have that use case in mind. Then you can go about training that model, developing that model, and for that you need to have a good, potent, robust data set to allow you to carry out that analysis. Whether you want to do exploratory analysis or predictive analysis, that needs to be very well defined in your training stage. Once you have that model developed, then we have certain services, such as Watson Machine Learning within Data Science Experience, that allow you to take that model you developed just moments ago and deploy it as a RESTful API that you can then embed into an application, into your solution, and that solution you can basically use across industry.

Are there some use cases where you have almost a tiering of models, where some are right at the edge, like a big device, like a car, and then there's sort of the fog level, which is, say, cell towers or other buildings nearby, and then there's something in the cloud that's sort of a master model or an ensemble of models? You know, I assume it's not, as Evel Knievel would say, "don't try this at home," but is the tooling being built to enable that?

So the tooling is already in existence right now.
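The train-then-deploy flow Asad describes, developing a model and then exposing it behind a RESTful scoring endpoint, can be sketched in miniature. This is an illustrative stand-in, not the actual Watson Machine Learning API: the hand-rolled threshold classifier and the `{"values": [...]}` JSON payload shape are assumptions made for the sketch.

```python
import json

# Toy "training" step: fit a decision threshold on a single feature
# from labeled examples. Stands in for the model-development work
# a data scientist would do in a notebook.
def train(examples):
    """examples: list of (feature, label) pairs with labels 0/1."""
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    threshold = (min(pos) + max(neg)) / 2.0  # midpoint between classes
    return {"threshold": threshold}

# Stands in for the deployed scoring endpoint: JSON request in,
# JSON response out, as a REST service would see it.
def score(model, request_body):
    payload = json.loads(request_body)  # e.g. {"values": [0.2, 0.8]}
    preds = [int(v >= model["threshold"]) for v in payload["values"]]
    return json.dumps({"predictions": preds})

model = train([(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)])
print(score(model, json.dumps({"values": [0.2, 0.8]})))
# prints {"predictions": [0, 1]}
```

In a real deployment, `score` would sit behind an HTTP endpoint and the model artifact would be stored and versioned by the platform; the contract shown here, opaque model in, JSON scoring in and out, is the part that carries over.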
You can actually go ahead right now and build up prototypes, even full-range applications, right on the cloud. You can do that thanks to Data Science Experience; you can do that thanks to IBM Bluemix. You can go ahead and do that type of analysis right there. And not only that, you can let that analysis guide you along the path from building a model to building a full-range application. And this is all happening at the cloud level. We can talk more about it happening at the on-premises level, but at the cloud level specifically, you can have those applications built on the fly, on the cloud, and have them deployed as web apps, mobile apps, et cetera.

One of the things that you talked about is use cases in certain verticals. IBM has been very strong and vertically focused for a very long time, and you kind of almost answered the question, but I'd like to explore it a little bit more: building these models, training the models in, say, healthcare or telco, and then being able to deploy them. Where are the horizontal benefits there that IBM would be able to deliver faster to other industries?

Definitely. I think the main thing is that IBM first of all gives you that opportunity, that platform, to say, hey, okay, you have a data set, you have a use case; let's give you the tooling, let's give you the methodology to take you from data to a model to, ultimately, that full-range application. Specifically, I've built some applications for federal healthcare, specifically to address clinical medicine and behavioral medicine, and that's allowed me to use IBM tools, and some open source technologies as well, to go out and build those applications on the fly, as prototypes, to show not only the realm, the art of the possible with these technologies, but also to solve problems, because ultimately that's what we're trying to accomplish here.
We're trying to find real-world solutions to real-world problems.

Linton, let me redirect something towards you. A lot of people are talking about how Moore's Law is slowing down or even ending, at least in terms of the speed of processors. But if you look at not just the CPU, but the FPGA or ASIC, or the Tensor Processing Unit, which I assume is an ASIC, and you have the high-speed interconnects: if we don't look just at what you can fit on one chip, but look at 3D, at the density of transistors in a rack or in a data center, is that still growing as fast or faster? And what does it mean for the types of models that we can build?

That's a great question. One of the key things that we did with the OpenPOWER Foundation is to open up the interfaces to the chip. So with NVIDIA, we have NVLink, which gives us a substantial increase in bandwidth. We have created something called OpenCAPI, which is a coherent protocol to get to other types of accelerators. So we believe in hybrid computing in that form. You saw NVIDIA on stage this morning, and we believe, especially for deep learning, the acceleration provided by GPUs is going to continue to drive substantial growth. It's a very exciting time.

Would it be fair to say that we're on the same curve, if we look at it not from the point of view of what we can fit on a little square, but of what we can fit in a data center, or the power available to model things? Jeff Dean at Google said, hey, if Android users talk into their phones for two or three minutes a day, we need two to three times the data centers we have. Can we grow that price performance faster and enable the sorts of things that we did not expect?

I think the innovation that you're describing will in fact put pressure on data centers. The ability to collect data from autonomous vehicles and other endpoints is really going up.
So we're okay for the near term, but at some point we will have to start looking at other technologies to continue that growth. Right now we're in the throes of what I call fast data versus slow data. Keeping the slow data cheaply, and getting the fast data closer to the compute, is a very big deal for us. So NAND flash and other non-volatile technologies for the fast data are where the innovation is happening right now. But you're right: over time we will continue to collect more and more data, and it will put pressure on the overall technologies.

Last question as we get ready to wrap up here. Your background is fascinating to me: having a medical degree and working in federal healthcare for IBM, you talked about some of the clinical work that you're doing and the models that you're helping to build. What are some of the mission-critical needs that you're seeing in healthcare today that are really driving healthcare organizations not just to do big data right, but to do data science right?

Exactly. I think one of the biggest questions that we get, and one of the biggest needs that we hear from the healthcare arena, is patient-centric solutions. There are a lot of solutions that are hoping to address problems being faced by physicians on a day-to-day level, but there are not enough applications addressing the concerns, the pain points, that patients are facing on a daily basis. So the applications that I've started building out at IBM are all patient-centric applications that basically put their data, their symptoms, their diagnoses in their hands alone, and allow them to actually find out, okay, more or less what's going wrong with my body at any particular time during the day, and then find the right healthcare professional, the right doctor, best suited to treating that condition, treating that diagnosis. So I think that's the big thing that we've seen from the healthcare market right now.
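Circling back to the fast-data-versus-slow-data point Linton made above, the split can be sketched as a two-tier store: a small, fast tier standing in for NAND flash near the compute, in front of a large, cheap slow tier. The class name, capacity, and the LRU promotion policy are illustrative assumptions for the sketch, not the design of any specific IBM product.

```python
from collections import OrderedDict

class TieredStore:
    """Keep hot ("fast") data in a small tier near compute;
    keep the full data set in a big, cheap slow tier."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()  # small fast tier (e.g. flash/NVM)
        self.slow = {}             # large cheap tier (e.g. disk/object store)
        self.capacity = fast_capacity

    def put(self, key, value):
        self.slow[key] = value     # slow tier always holds everything
        self._promote(key, value)  # recently written data is "fast"

    def get(self, key):
        if key in self.fast:       # fast-path hit: no slow-tier access
            self.fast.move_to_end(key)
            return self.fast[key]
        value = self.slow[key]     # miss: fetch from slow tier, promote
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        if len(self.fast) > self.capacity:
            self.fast.popitem(last=False)  # evict least-recently-used

store = TieredStore(fast_capacity=2)
for k in ("a", "b", "c"):
    store.put(k, k.upper())
print(sorted(store.fast))  # prints ['b', 'c']
```

Only the most recently used keys stay in the fast tier; everything remains retrievable from the slow tier, which mirrors the economics Linton describes: cheap capacity for the cold bulk, expensive proximity for the hot working set.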
The big need that we're currently addressing with our cloud analytics technology, which is becoming more and more advanced and sophisticated, is trending towards some of the other health and technology trends currently on the market, including blockchain, which tends towards a more decentralized focus for these applications. So it's actually putting more of the data in the hands of the consumer, in the hands of the patient, and even in the hands of the doctor.

Wow, fantastic. Well, you guys, thank you so much for joining us on theCUBE. Congratulations on your first time being on the show, Asad Mahmood and Linton Ward from IBM. We appreciate your time.

Thank you very much.

And for my co-host, George Gilbert, I'm Lisa Martin. You're watching theCUBE live on day one of the DataWorks Summit from Silicon Valley, but stick around. We've got great guests coming up, so we'll be right back.