Live from San Francisco, California, it's theCUBE, covering the IBM Chief Data Officer Summit, brought to you by IBM. We're back in San Francisco, here at Fisherman's Wharf, covering the IBM Chief Data Officer event, hashtag IBMCDO. This is the 10th year of this event. They tend to bookend them, one in San Francisco and one in Boston, and you're watching theCUBE, the leader in live tech coverage. My name is Dave Vellante. John Thomas is here, CUBE alum and distinguished engineer, director of analytics at IBM, and somebody who provides technical direction to the Data Science Elite Team. John, good to see you again. Steve Eliuk is back as well. He is the vice president of deep learning in the Global Chief Data Office. Thanks for coming on again. No problem. All right, let's get into it. So John, you and I have talked over the years at this event. What's new these days, what are you working on? So Dave, I'm still working with clients, mostly enterprise clients, on implementing data science and AI use cases, and seeing a variety of different things developing in that space. Things have moved into broader discussions around AI and how to actually get value out of it. Okay, so I know one of the things you've talked about is operationalizing machine intelligence, AI, and cognitive. And that's always a challenge, right? Sounds good, we see the potential, but unless you change the operating model, you're not going to get the business value. So how do you operationalize AI? Yeah, this is a good question, Dave. So many enterprises are beginning to realize that it is not enough to focus on just the coding and development of the models. They can hire super talented Python and TensorFlow programmers and get the model building done, but there is no value in it until these models are actually operationalized in the context of the business, right?
So one aspect of this is that we are now thinking about this in a very systematic way and talking about it in a prescriptive way, right? You've got to scope your use cases out, you've got to understand what is involved in implementing each use case, and then come the steps of build, run, and manage, and each of these has technical aspects and business aspects around it. Okay. So most people jump right into the build aspect, which is writing the code. Yeah, which is great, but once you build the models by writing code, how do you actually deploy them? Whether that is for online invocation or batch scoring or whatever, how do you manage the performance of these models over time? How do you retrain them? And most importantly, when these models are in production, how do I actually understand the business metrics around them? Because this goes back to that first step of scoping. What are the business KPIs that the line of business cares about? The data scientist talks about data science metrics: precision and recall, area under the ROC curve, accuracy, and so on. But how do these relate to business KPIs? All right, so we're going to get into each of those steps in a moment, but Steve, I want to ask you: part of your charter in the Global Chief Data Office is that you have to do this for IBM, right? Drink your own champagne, dogfooding, whatever you call it, but there are real business reasons for you to do that. So how is IBM operationalizing AI? What kind of learnings can you share? Well, the beauty is I have a wide portfolio of products that I can pull from, so that's nice. Things like Watson OpenScale, Watson, some of the hardware components, all that stuff is being baked in. But part of the reason that John and I wanted to do this interview together is that what he's producing, what his thoughts are, resonates very well with our own practices internally. We've got so many enterprise use cases.
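The translation John describes, from data science metrics to business KPIs, can be sketched in a few lines of Python. Everything below is illustrative: the churn framing, the dollar values, and the function names are assumptions for the example, not anything from the interview or from IBM's tooling.

```python
# Illustrative sketch: the model metrics John mentions (precision, recall,
# area under the ROC curve), computed from scratch, plus a hypothetical
# mapping from a confusion matrix to a dollar-valued business KPI.

def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def roc_auc(y_true, scores):
    # AUC as the probability that a random positive outranks a random
    # negative (ties count half) -- the rank-based formulation.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def expected_value(y_true, y_pred, save_per_tp=200.0, cost_per_fp=15.0):
    # Hypothetical business translation: if each caught churner (true
    # positive) saves $200 of revenue and each false alarm costs $15 of
    # outreach, the same confusion matrix becomes a dollar figure the
    # line of business can act on.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp * save_per_tp - fp * cost_per_fp

print(precision_recall([1, 1, 0, 0], [1, 0, 1, 0]))        # → (0.5, 0.5)
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))         # → 0.75
print(expected_value([1, 1, 0, 0], [1, 0, 1, 0]))          # → 185.0
```

The point of the last function is the one John keeps returning to: precision and recall mean nothing to the line of business until someone attaches costs and benefits to the cells of the confusion matrix.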
How are we deciding which ones to work on: which ones have the data, which ones potentially have the biggest business impact, all those KPIs, et cetera? Also, for the practitioners, once we decide on a specific enterprise use case to work on, when have they reached the level where the enterprise is getting a return on investment? They don't need to keep refining and refining and refining, or maybe they do, but these practitioners don't know. So we have to clearly justify it and scope it accordingly, or these practitioners are left in a kind of limbo, producing things but not able to iterate effectively for the business. So that process is a big problem I'm facing internally. We've got hundreds of internal use cases, and we're trying to iterate through them. There's an immense amount of scoping, understanding, et cetera. But at the same time, we're building more and more technical debt as the process evolves, moving from project to project. My team is ballooning. We can't do this; we can't keep growing. They're not going to give me another hundred headcount, and another hundred after that. So we definitely need to manage it more appropriately, and that's where this mentality comes in. All right, so I've got a lot of questions. I want to start unpacking this stuff. So the scope piece: we're setting goals, identifying the success metrics, KPIs, and the like. Okay, reasonable starting point, but then you go into this, I think you call it the explore or understand phase. What's that all about? Is that where governance comes in? That's exactly where governance comes in, right? Because, well, you all know the expression: garbage in, garbage out. If you don't know what data you're working with for your machine learning and deep learning enterprise projects, you will not get the results that you want, right?
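The triage Steve describes, deciding which of hundreds of use cases to work on, can be sketched as a simple weighted scoring pass. This is a hypothetical illustration, not IBM's actual process; the weight values and field names are assumptions for the example.

```python
# Hypothetical sketch: rank candidate enterprise use cases on business
# impact, data readiness, and estimated effort (all normalized to 0..1).
WEIGHTS = {"impact": 0.5, "data_readiness": 0.3, "effort": 0.2}

def score(use_case, weights=WEIGHTS):
    # Effort is inverted: a lower-effort project scores higher.
    return (weights["impact"] * use_case["impact"]
            + weights["data_readiness"] * use_case["data_readiness"]
            + weights["effort"] * (1 - use_case["effort"]))

candidates = [
    {"name": "churn", "impact": 0.9, "data_readiness": 0.8, "effort": 0.4},
    {"name": "forecast", "impact": 0.7, "data_readiness": 0.3, "effort": 0.6},
]
ranked = sorted(candidates, key=score, reverse=True)
print([c["name"] for c in ranked])  # → ['churn', 'forecast']
```

A pass like this also gives the scoping step an explicit exit criterion: a use case whose score falls below some agreed threshold never enters the build queue, which is one way to keep a team from ballooning.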
And you might think this is obvious, but in an enterprise setting, understanding where the data comes from, who owns the data, who worked on the data, the lineage of that data, who is allowed access to the data, and the policies and rules around all of that is important, because without these things in place, the models will be questioned later on, and the value of the models will not be realized, right? So that part of exploration or understanding, whatever you want to call it, is about understanding the data that has to be used by the ML process. But then, at a point in time, the models themselves need to be cataloged and published, because the business as a whole needs to understand what models have been produced from this data, right? Who built these models? Just as you have lineage of data, you need lineage of models. You need to understand what APIs are associated with the models being produced. What are the business KPIs linked to the model metrics? All of that is part of this understand and explore phase. Okay, and then you go to build. I think people understand that; everybody wants to start there, just start with dessert. All right, and then you get into the run and manage phases. For run, you want time to value, and when you get to the manage phase, you really want it to be efficient, cost effective, and iterative. Okay, so here's the hard question. What you just described, some of the folks, particularly the builders, are going to say is such a waterfall approach. Just start coding. Remember 15 years ago, it was, okay, how do we write better software? Just start building. Forget about the requirements, just start writing code. Okay, but then what happens is you have to bolt on governance and security and everything else. So talk about how you're able to maintain agility in this model. Yeah, I was going to use the word agile, right? Even within each of these phases, it is an agile approach, right?
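The model lineage John describes, cataloging who built a model, from which data, behind which API, and against which business KPIs, amounts to structured metadata per model. Here is a minimal sketch of such a catalog entry; the class, field names, and sample values are assumptions for illustration, not any particular IBM catalog schema.

```python
# Minimal sketch of a model catalog entry capturing the lineage fields
# mentioned in the discussion: owner, training data lineage, serving API,
# sign-off metrics, and the linked line-of-business KPIs.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class ModelCatalogEntry:
    name: str
    version: str
    owner: str
    training_data: list   # lineage: the governed datasets the model was built from
    api_endpoint: str     # where the deployed model is invoked
    model_metrics: dict   # data science metrics at sign-off
    business_kpis: list   # the LOB KPIs this model is accountable to
    registered_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

# Hypothetical example entry.
entry = ModelCatalogEntry(
    name="churn-predictor",
    version="1.3.0",
    owner="data-science-elite",
    training_data=["warehouse.customers.v7", "crm.interactions.2019q1"],
    api_endpoint="/v1/models/churn-predictor:predict",
    model_metrics={"precision": 0.81, "recall": 0.67},
    business_kpis=["quarterly churn rate", "retention campaign ROI"],
)
print(asdict(entry)["name"])  # → churn-predictor
```

The `business_kpis` field is the piece that answers "why does this model exist?" when it is questioned later on, which is exactly the challenge raised above.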
So the mindset is about agile sprints: two-week-long sprints with very specific metrics at the end of each sprint that are validated against the line of business requirements, right? So although it might sound waterfall, you're actually taking an agile approach to each of these steps. And as you go through this, you also have the option to course correct along the way. Because think of this: the first step was scoping. The line of business gave you a bunch of business metrics or business KPIs they care about, right? But somewhere in the build phase, past sprint one or sprint two, you realize, oh, you know what? That business KPI is not directly achievable, or it needs to be refined or tweaked. And there is that circle back with the line of business and the course correction as you go. It's a very agile approach that you have to take. That's, I think, right on. Because again, if you go and bolt on compliance and governance and security after the fact, we know from years of experience that it really doesn't work well. You build up technical debt faster. But are these quasi-parallel? I mean, there are some things that you can do in build while the scoping is going on. Is there collaboration? Can you describe that a little bit? So for example, if I know the domain of the problem, I can actually get started with templates that help me accelerate the build process, right? In your group, for example, IBM internally, there are many, many templates these guys are using. Want to talk a little bit about that? Well, we can't just start building from scratch every single time. Again, I'm going to use this word because it really resonates: it's not extensible. For each project, we have to get to the point of using templates. So we had to look at those initiatives and invest in them, because initially it's harder.
But at least once we have some of those cookie-cutter templates, and some of them might have to have abstractions around certain parts, that's the only way we're ever able to tackle so many problems. So, without a doubt, it's an important consideration. But at the same time, you have to appreciate that there are a lot of projects that are just fundamentally different. And that's when you have to have very senior people looking at how to abstract those templates to make them reusable and consumable by others. But the team structure, it's not a single amoeba going through all these steps, right? These are smaller teams, and then there's some threading between each step? This is important because, yeah, we were just talking about that concept, talking about skills. The barriers between those groups are something we're trying to figure out how to break down, because that's something he recognizes and I recognize internally. Those people are never going to be able to iterate through different enterprise problems unless they break down those borders and really invest in the communication and in building those tools. Exactly, exactly. You talk about full stack teams, right? It is not enough to have coding skills, obviously. What is the skill needed to get this into a run environment? What is the skill needed to take, not metrics, but explainability, fairness in the model, et cetera, and map that to business metrics? That's a very different skill from Python coding skills, right? So full stack teams are important. And at the beginning of this process, where someone in the line of business throws a hundred different ideas at you and you have to go through the scoping exercise, that is a very specific skill that is needed, working together with your coders and runtime administrators, right?
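The cookie-cutter templates the two describe can be sketched as a base pipeline class that fixes the shared skeleton (understand, build, run) while each project overrides only the parts that genuinely differ. This is a toy illustration of the design idea, not IBM's internal tooling; every name and the trivial "model" are assumptions.

```python
# Hypothetical sketch of a reusable use-case template: the governance and
# run steps are shared, and a new project only supplies its build step.
from abc import ABC, abstractmethod

class UseCaseTemplate(ABC):
    def __init__(self, name, business_kpis):
        self.name = name
        self.business_kpis = business_kpis  # agreed with the LOB at scoping

    def understand(self, dataset):
        # Shared governance checks: lineage, ownership, access policy.
        return {"dataset": dataset, "lineage_checked": True}

    @abstractmethod
    def build(self, training_data):
        """Train and return a model; the step most projects rewrite."""

    def run(self, model, batch):
        # Shared batch-scoring step.
        return [model(x) for x in batch]

class ChurnTemplate(UseCaseTemplate):
    def build(self, training_data):
        # Toy "model": flag anyone whose usage falls below the mean.
        threshold = sum(training_data) / len(training_data)
        return lambda x: x < threshold

pipeline = ChurnTemplate("churn", ["retention rate"])
model = pipeline.build([10, 8, 2, 1])
print(pipeline.run(model, [0, 9]))  # → [True, False]
```

The abstraction boundary is the senior-engineer judgment call mentioned above: the narrower the `build` override, the more reusable the template, but fundamentally different projects may need more of the skeleton opened up.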
Because how do you define the business KPIs, and how do you refine them later in the lifecycle? How do you translate between line of business lingo and what the coders are going to code? So it's a full stack team concept. It may not necessarily all be in one group; it may be, but they have to work together across these different silos to make it successful. All right guys, we've got to leave it there. The trains are backing up here at the IBM CDO conference. Thanks so much for sharing your perspectives on this. All right, keep it right there, everybody. You're watching theCUBE from San Francisco, here at Fisherman's Wharf at the IBM Chief Data Officer event. We'll be right back.