Live from San Francisco, it's theCUBE. Covering IBM Think 2019, brought to you by IBM. Good afternoon, welcome back to theCUBE's continuing coverage of IBM Think 2019. I'm Lisa Martin in soggy San Francisco, with Dave Vellante. Hey Dave. Hey Lisa, we're staying dry though, for the most part. Exactly. It looks like Moscone North may have a few little leaks; they're just running water through the pipes, as we would say. Just a little trial run. That's true, that's true. So we're welcoming back to theCUBE a guest that hasn't been with us for a while, Paul Zikopoulos, Vice President of Big Data Cognitive Systems at IBM. Paul, welcome back. Oh, thank you, and thanks for getting my name right. That was good, very good. Oh, you're welcome. Phew! So, you are an accomplished author. I stalked you on Twitter. 19 books, over 350 articles. I know you do a lot of speaking. You've been at IBM a long time. This event's massive, right? 30,000 people or so; yesterday it was standing room only. In fact, they shut the doors to Ginni's keynote because there were so many people. I'm curious, of the announcements that came out with Cognitive yesterday, what are some of the things that piqued your interest? Well, Watson Anywhere, personally, I've said that's a long time coming. I mean, come on, we've got to have Watson on any cloud, right? Not just the IBM Cloud. So that was, I thought, a big deal. And then there were a bunch of announcements around enabling hybrid. I think there were 20-plus services. So, you know, it's kind of in vogue. We're in this multi-cloud world; I need a way to get to hybrid. So, those are two standouts. So your group's been busy, basically. That's right, that's right. I mean, you hit it, right? Watson anywhere, cloud everywhere. So it's about AI and that journey.
I have to tell you, when I hear all the announcements, there are tons of them, right? One of my favorite ones probably doesn't get noticed, and that was Watson Machine Learning Accelerator. And that is really about looking at the journey for AI and clients over the course of the next few years. See, most clients are just getting started. There are some clients in the middle phase, and there are some clients now that are hitting what I call the enterprise-worthiness stage of AI, right? And so when we look at our announcements, they're actually taking you from just getting started all the way to enterprise-hardened, explainable, interpretable algorithms, and how to manage that. Because we're going to go from this world where AI is sitting in the corner offices for the privileged few, and we have to democratize it for the many. But today, it's like, here's a little data science team, they have their own server. Here's an R programmer on their laptop, you know, hanging out working there. But we want to bring this all together for the enterprise. So things like workload management, which is what Watson Machine Learning Accelerator really does, is how do I get everything together and working in a concurrent environment as organizations go from having 10, 20 algorithms to trying to deploy thousands of them. That's how they'll define themselves. Well, you know, when you get a bunch of data scientists in the room and you talk about citizen data scientists, they kind of look at you like there's no such thing. But the fact is that if you can operationalize AI, you can open it up to a lot more people. You know, as a line of business person, you'd much rather not have to go to a data scientist every time you want to do something with AI, because otherwise you're just kind of repeating the old decision support world. So what are you guys doing to operationalize AI? So it's a great question. We're taking the friction out.
And so a lot of people come and say, oh, GPU acceleration. So yeah, it's about training stuff faster. It's an open architecture on POWER. And so you've seen the work with NVIDIA, and what's unique is what NVIDIA can do with our Cognitive Systems to accelerate the CPU-GPU communications. But there's a broader pipeline when you go on this AI journey, and we want to flatten that curve. So one is how to get up and running. And I don't know if you remember, but open source changes all the time. So we're enterprise-hardening, back-testing, getting you ready: here's the platform to deploy, built on open source. And where 80% of a data scientist's time is spent right now is in what I call data preparation. Wrangling data, labeling data, getting stuff together. Now none of that is data science. It's like none of that is data science at all. And that's where the time goes. And once I get the data ready, I train the model. Okay, so you've heard a lot about that. And then the next thing I have to do is optimize the model. So I think about where data scientists should be spending their time, and that's on stage four. And we call that exploring the hyperparameter space. Another thing Watson Machine Learning Accelerator is all about: how do we make the model perform? Now for data science geeks, perform means how well it's classifying, or how accurate it is. Hardware people often think performance means how fast you go, right? And then finally you go to inference. So we're looking at all five of those stages. And one of them, the biggest one, is that 80% of sunk time. We're trying to drop that to 20% and open it up for the rest of the enterprise. So how do you democratize AI? You mentioned that a lot of enterprises are really at the beginning of that journey. But when you're out talking with customers, is there some sort of paralysis there where they're like, Paul, where do we start? Right, right. I think there are two areas where I see inertia or friction. So one is, where do we start?
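The "stage four" Paul describes, exploring the hyperparameter space, can be sketched as a plain grid search over candidate settings. This is an illustrative example, not Watson Machine Learning Accelerator's actual API; the toy scoring function and parameter names are invented for the sketch.

```python
from itertools import product

def grid_search(score_fn, param_grid):
    """Exhaustively score every parameter combination and return the best.

    param_grid: dict mapping parameter name -> list of candidate values.
    score_fn:   callable taking keyword params, returning a score (higher is better).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy "model": pretend validation accuracy peaks at lr=0.01, depth=3.
def toy_accuracy(lr, depth):
    return 1.0 - abs(lr - 0.01) * 10 - abs(depth - 3) * 0.05

best, score = grid_search(toy_accuracy, {"lr": [0.001, 0.01, 0.1], "depth": [2, 3, 4]})
print(best)  # {'depth': 3, 'lr': 0.01}
```

In practice the search space is far too large for an exhaustive sweep, which is where the workload management Paul mentions comes in: running many candidate trainings concurrently and pruning the losers.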
So let me say this: start with the data you have. You don't have to step up to the plate and hit a home run. You just get started, and it's the little things you do every day, not the big things you do once in a while. And we always hear about disruption. Disruption, you hear about Uber and Airbnb as the disruptors. I actually believe they were the disruptors of yesterday. I think right now, we're in this lift, shift, rift or cliff moment. The disruptors of tomorrow will be those at the head of the analytics renaissance that work with the data they have. We know the outcomes, we call that supervised learning, and that's where you get started. And then the other piece is, how do I get more people to participate? Talk about the lift, shift, rift or cliff intersection. I saw that you've talked about that on social media. Can you break that down a little bit more, and also talk to us about how you're helping customers break through that, or maybe avoid it altogether? Yeah, well, I mean, you want to take two of those four and not the other two, right? And I think that we face this lift, shift, rift or cliff moment in two ways. One is as individuals. So the people in the audience, the people watching here, all of us as practitioners, we have got to get our skills moving forward. I always say skill years are like dog years, right? Like they age instantly. And so you should be waking up every day like a newbie in this world and learning every single day. And if you do that, you'll have nothing to worry about as an individual. And as organizations, you had better put analytics at the forefront. That means from the boardroom, that means we encourage the culture of analytics everywhere. And so that's what I mean by the lift, shift, rift or cliff moment. So it comes back to sort of opening it up for average everyday line of business people. You've got a demo. I'm dying to see what you've got here. Can I show it to you? Yeah, please.
All right, so you were talking about data scientists and citizen data scientists. So I'm going to propose to you this thing I call the wisdom of the crowd, right? Today, data scientists have to build things where they're not domain experts. Imagine if I could invite the many to participate in this storyline. And in the storyline, everyday line of business people could create an application based on an idea or a model, and maybe we'd have thousands of them. And out of those thousands, we might vet, I don't know, 50 or 100. And out of that, we would team up with data science, deploy 10 or 20 into production, and then do the whole thing over again. So let me show you how I could create this application here without writing a single line of code. And I actually use you, Dave, as an example, because I wanted to see how much face time you get on theCUBE when John is up here with you. Oh, I don't need AI to tell you that. Oh, you don't? I can give you the answer. So I had this... I got the short end of the stick, and I'm not one to pull it. Well, we'll let the data tell the truth. Okay, let's see what happens. Yeah, that's data-driven, right? So I had this intuition as a line of business user, and I went to explore it. So you can see here that we have two videos. And on the first video, you see where I put this here? We'll say host screen time. That's actually going to measure the amount of time that you're on screen. And I built that in this modern way that democratizes for the many. I'll just start it up here. And on the bottom, I built it the old-fashioned way. So you can see we've got John in there, and they both start out pretty well, right? They're both recognizing both of you. So let me go and pause these. Now, the first thing you should notice is I've got a timer on the bottom. I've got a timer on the bottom because I actually had time to build that. My DevOps team kind of put that in there for me.
So we'll continue this. Move it over here and let these things run. Now, look at the accuracy of these models. Do you notice, on the top you're both identified, increasing the screen counter, and on the bottom, it can't see you. So computer vision is very interesting. If I wanted to teach a computer to tell me what the number eight was, I could show it a picture of an eight. But the moment I moved it sideways, it would have no idea what it was. I need to train it with lots and lots of data. And the bottom is the way the data scientists work. So what did I have to do to do that? I had to go collect some video, had to reformat it, had to take it down to 480p, and I had to write a fair amount of code. And you see the code there? Now, that was just to get to MVP, and this model clearly doesn't score well. Dave turns his head and it doesn't know who it is anymore. All I said is, you're Dave Vellante, and if you're not, then you're John. So what do you do if you've got a third person in there? All right, and this is where we democratize it. So this is our PowerAI Vision. We've been talking a lot about this, and I want to invite everybody to take part in this kind of data science renaissance. All you do is go and upload some video here, and you go capture some frames. We can auto-capture those frames every five seconds. And let's say I wanted to add a new person, like Arvind, into this list here. And so I want to go develop and figure out how the algorithm can find out who Arvind is. Now in my last demo I showed you, that was a linear classifier; that wasn't easy. Here, we'll go type in Arvind, add Arvind. And then I'm just going to highlight it and box Arvind. And now I've started to train the model. There's no code at all. You just trained the model. You just said: this is Arvind, when you see this. So I'm labeling the data, and then I'd have to go set it off to training. And now look, I'll do one other thing for you here. I'll go and say, well, here's the Think logo.
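The auto-capture step Paul describes (pulling a frame every five seconds and attaching labeled bounding boxes to it) can be sketched in a few lines. This is an illustration of the idea, not PowerAI Vision's implementation; the class name and box coordinates are made up for the example.

```python
def capture_frame_indices(duration_s, fps, interval_s=5.0):
    """Return the frame indices to sample when auto-capturing one frame every
    `interval_s` seconds from a clip of `duration_s` seconds at `fps` frames/sec."""
    total_frames = int(duration_s * fps)
    step = int(interval_s * fps)
    return list(range(0, total_frames, step))

# A 30-second clip at 24 fps, sampled every 5 seconds:
frames = capture_frame_indices(30, 24)
print(frames)  # [0, 120, 240, 360, 480, 600]

# Labeling then attaches class names and bounding boxes to sampled frames,
# which is all "training" means from the user's point of view here.
labels = {}  # frame index -> list of (class name, (x, y, w, h)) annotations
labels[frames[1]] = [("host_1", (40, 30, 200, 260))]  # hypothetical box
```

The point of the demo is that drawing those boxes in a UI replaces the hand-written classifier code on the bottom video.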
And maybe I want to track some logo detection. That's it, that's how I built the model. Now it's all about how much supervised, labeled data you have. So that's why I said, who are the disruptors of the future? And that's all about the compute power and the workload management power to train this stuff. So Cognitive Systems is really all about both. So we obviously know about the power and the workload management. How do I go and actually generate the data? So once I train this model, I can click auto-label. It'll actually go through the rest of the video and label it from what it saw. But here's where things get beautiful. Everything I've shown you that used to be someone writing lines of code is now replaced with a click. So I click augment data. We call these morphological operations. I want you to notice something. We have 119 images labeled of Dave and John. So as I click here, I'm going to apply these morphological operations: Gaussian blur, sharpening. That all means stuff to data scientists. Now I have 4,249 data points. And I will generate that automatically. That's all driven by the line of business. And finally, we can come over here and actually look at the model. Here's my model. This model is actually scoring really well. But even if it wasn't scoring well, say I'm at 70%, this is when I pass it to the data scientist team to do what they're exceptional at: the tuning, the hyperparameter tuning, for the performance score of the algorithm. And so here, I'll just finish this off by, I think I had a picture of you, I'll just drag it in here. And now it's actually going out and scoring it. We're scoring at 96% accuracy. And I can expose this as a REST API with the click of a button. So I just have one thing, though, I found out with the AI for you, Dave. At the end of it, from what I can see, John is getting about 50% more screen time than you. That's all? That's pretty good actually. Yeah, oh, you thought it was worse.
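The augmentation step above, growing 119 labeled images into 4,249 data points, works by applying label-preserving transforms to every image. A minimal pure-Python sketch of the principle follows; the real morphological operations (Gaussian blur, sharpening, and so on) are more sophisticated than the flip and box blur shown here, which stand in only to show how the dataset multiplies.

```python
def hflip(img):
    """Horizontal flip: one of the simplest label-preserving augmentations."""
    return [row[::-1] for row in img]

def box_blur(img):
    """3x3 box blur, a crude stand-in for the Gaussian blur mentioned above."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out

def augment(images, ops):
    """Apply every op to every image; originals plus variants come back."""
    out = list(images)
    for op in ops:
        out.extend(op(img) for img in images)
    return out

tiny = [[0, 255], [255, 0]]  # a 2x2 grayscale "image"
augmented = augment([tiny], [hflip, box_blur])
print(len(augmented))  # 3: original + flipped + blurred
```

With more operations, and operations composed together, one click really can turn roughly a hundred labeled frames into thousands of training samples.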
And if you notice, your name here is Dave Dapper-Vellante. Because we can't help but notice how well-dressed you are. You need a data scientist to put that in. You're well-dressed; it is pretty accurate. But you're not getting the ROI on those outfits that you need for screen time. That's what we found with the AI. It's tough with my business partner, John. But that's pretty cool. Now, you're saying you wrote the code, right? To identify either John or Dave. And at what point did you bring the data scientist in? Yeah, so I didn't write any code on the top. On the bottom, where the model did not perform well (when you turned to the side, we couldn't see you), that's the code we wrote. And that would take iterations and iterations. On the top there was no code written. We built the model. And then we brought a dev person in to build us a timer, which was a couple of lines of code. Took him about half an hour. And in this case, I didn't really bring the data scientist in yet, because I'm scoring at 96%. But I can easily pass it on in the workflow. And that's the story. It's a pipeline workflow across. So I'll pull the data scientist in when I need to. But 96% accuracy without a data scientist present is pretty good. So in a more complex use case, you might not get 96% accuracy. You might be at 50%, 40%, 70%. Now you bring the data scientist in for the last mile, is that right? Absolutely. Say this is only scoring 50%, and you don't think that's impressive? I think it's pretty impressive that I did that in half an hour. And now this is engineered from the wisdom of the crowd. I'm a line of business user, and I'd like to know what kind of screen time you're getting. Maybe that's at a sporting event. And I'd actually like a new business model where I charge Toyota by the second that they show up on screen. That's my idea. A data scientist is never going to think of that. I get it started, and then they join the renaissance. That's how you democratize AI for the many.
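The "expose this as a REST API with the click of a button" step means downstream applications score images over plain HTTP. A hedged sketch of what such a client call could look like follows; the endpoint URL and JSON field names are assumptions for illustration, not a documented PowerAI Vision API, and the request is built but not sent.

```python
import base64
import json
import urllib.request

def build_score_request(endpoint, image_bytes):
    """Build (but don't send) an HTTP request posting an image to a
    hypothetical model-scoring endpoint. Field names are illustrative."""
    payload = json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical endpoint; a real deployment would supply its own URL and auth.
req = build_score_request("https://example.com/v1/models/hosts/score", b"fake-image-bytes")
print(req.get_method())  # POST
```

A response would typically carry the predicted class and a confidence, which is how a screen-time counter, or the per-second Toyota billing idea, could be driven programmatically.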
Yeah, so maybe you could talk a little bit about the compute power behind this, the infrastructure behind this? And then maybe we could talk about POWER and how you're applying that for AI infrastructure. Yeah, that's a great, great question. So the bottom video actually trained on my laptop. It ran for about a day and a half, and just so you know, it really was my laptop. For the top video, we actually leveraged our PowerAI architecture and ran that through with Watson Machine Learning Accelerator. And I've got to tell you, the models train in about 30 minutes. And in fact, we trained a model on your last show, with your last guest, in the time from when you finished to when I came on stage. 20 minutes, yeah. So I mean, that's accelerated compute. And I hope what you're seeing here is that this isn't just a hardware componentry story. This is a kind of coexistence, an almost synergy, of software and hardware together. And that's what's needed in the AI era. Well, it's interesting. I know when you guys changed the name of the Power Systems group to Cognitive Systems, I inferred, of course (you've got a guy running it who used to run the software business), that there's a big software component. So it's clearly more than hardware. What are some of the more interesting use cases that you guys are seeing with clients, specifically in terms of operationalizing this? Yeah, for sure. So on use cases of AI, I think we're in this world of precision. So we're in precision agriculture, precision risk or underwriting, precision finance, precision retail. So the use cases are everywhere, and it's really taking in all this kind of data. On the operationalizing, I think that we're helping people at all the levels. You think about it, I almost see three segments. The first segment is, we're not really sure what to do.
This AI, and everyone says they're doing AI, reminds me of the Hadoop days and the big data lake, and you know how all that stuff turned out. So how do we get you started so you can get down the path and build kind of MVPs? And that's what I just showed you, the MVP. The next group of people are folks that have maybe one or two models deployed, and now they're trying to say, how do we scale out to hundreds and thousands of models? What is the path now to make this bigger? Because we've got it moving here. And then the final phase, which few people are at, are those who are getting to the challenge of, I've got a thousand algorithms deployed, and now how do I get all this stuff running? And so that entire path goes like this, and our storyline goes across that entire path. How unique is this in the marketplace? I'm interested in your commentary on IBM's competitive advantage. Are you the only guys who can do this, and why are you winning in the market? How differentiable is this? Yeah, so I think I'll answer that in two ways. One is from the brand perspective, in which I participate as part of a larger company called IBM. In terms of the acceleration, there's nobody doing what we're doing, and the reason is we took this POWER processor and created the OpenPOWER project. And just like software evolved through open innovation, that's what hardware has done. So you look at Mellanox and NVIDIA. I'll give you an example, Dave. NVLink exists on Intel and exists on POWER, but they operate in two very different ways, and nobody realizes that. So NVLink accelerates GPU-to-GPU communications. It does that on Intel, it does that on POWER. But because of OpenPOWER, NVLink also allows the GPU to talk to the CPU. So GPUs accelerate AI training because there are thousands of cores there, right? But they still have to talk to the CPU, and on top of that, they don't have much memory. So there's an example that's completely unique in the industry to make you train faster.
I think our workflow model is completely unique. The tools that I showed you, the workload management around them, and then you look at the bigger part of IBM and how I can mix this with API calls to clouds, cloud-based Watson services or local. But on top of that, now it's about how you build data that you can trust, and how you look at things like the explainability of the model with Watson OpenScale and that kind of stuff. So it's a bigger story, and nobody else has that end-to-end story. Well, and it's showing up in the results. We saw last quarter that your line of business was a bright star, you know, we're seeing some momentum. Obviously there's a lot of activity going on in Linux. Clearly, you know, Cognitive is a big play there. So congratulations on that, it's exciting to see. And of course, maybe a lot of people might not realize that when you guys did the work to bring in little-endian compatibility, you could run an entire software suite; it's not just this sort of niche proprietary platform anymore, it's mainstream. And so it's starting to show up in the business results. So that's great to see. Yeah, when I say democratized for the many, I mean for the people, for the enterprise, and across the entire spectrum. So. Well, Paul, thank you for confirming my suspicion here that John, my partner John Furrier, is sucking up all the camera time. John, I'm going to have to elbow my way in a little more. So appreciate that. Now I have the data; John's very data-driven. So appreciate that. And it was good to have you on. Yeah, it was nice to see you again. All right, keep it right there, everybody. We'll be back with our next guest. We're live from IBM Think 2019. You're watching theCUBE.