Live from Boston, it's theCUBE, covering the IBM Chief Data Officer Summit. Brought to you by IBM.

Welcome back, everyone, to theCUBE's live coverage of the IBM CDO Summit here in Boston, Massachusetts. I'm your host, Rebecca Knight, and I'm joined by co-host Paul Gillin. Our guest today is John Thomas. He is a Distinguished Engineer and Director at IBM. Thank you so much for returning to theCUBE. You're a CUBE veteran, a CUBE alum.

Thank you for having me on, yep.

So tell our viewers a little bit about being a Distinguished Engineer. There are only 672 in all of IBM. What do you do? What is your role?

Well, it's a good question. Distinguished Engineer is kind of a technical executive role, a combination of applying deep technology skills and helping shape IBM strategy in a technical way, working with clients, et cetera. So it is a bit of a jack-of-all-trades role, but with deep skills in some specific areas. And I love what I do. We get to work with some very talented, brilliant people in shaping IBM technology and strategy; product strategy is part of it. We also work very closely with clients on how to apply that technology in the context of the client's use cases.

We've heard a lot today about soft skills, the importance of organizational and people skills to being a successful Chief Data Officer, but there's still a technical component. How important is the technical side? What are the technical skills that CDOs need?

Oh, this is a very good question, Paul. Absolutely, navigating the organizational structure is important; it's a soft skill, you're absolutely right. And being able to understand the business strategy for the company, and then aligning your data strategy to the business strategy, is important. But the underlying technical pieces need to be solid. For example, how do you deal with large volumes of different types of data spread across the company? How do you manage that data? How do you understand the data? How do you govern that data? How do you then start leveraging the value of the data in the context of your business? A deep understanding of the technology of collecting, organizing and analyzing that data is needed for you to be a successful CDO.
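[Editor's note: To make the collect-organize-govern point concrete, here is a minimal sketch of the kind of metadata a governed data catalog entry might capture. It is purely illustrative; the DatasetRecord class and its field names are assumptions for this example, not any specific IBM product's schema.]

```python
from dataclasses import dataclass, field
from datetime import date

# Illustrative sketch only: a minimal catalog record for one governed dataset.
# The class and field names are hypothetical, not a real product schema.
@dataclass
class DatasetRecord:
    name: str                    # logical name of the dataset
    owner: str                   # who is accountable for the data
    source_system: str           # where the data is collected from
    classification: str          # e.g. "public", "internal", "restricted"
    last_modified_by: str        # who last touched or modified the data
    last_modified_on: date
    lineage: list = field(default_factory=list)  # upstream datasets it derives from

# Example: registering a customer transactions table spread across the company.
record = DatasetRecord(
    name="customer_transactions",
    owner="finance-data-team",
    source_system="core-banking",
    classification="restricted",
    last_modified_by="etl_nightly_job",
    last_modified_on=date(2018, 11, 15),
    lineage=["raw_card_swipes", "customer_master"],
)
print(record)
```

[Even a record this small answers the governance questions raised above: who owns the data, where it came from, and who touched it last.]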
So in terms of those skill sets you're looking for: one of the things that Inderpal said earlier in his keynote is that it's a rare individual who truly understands how to collect, store, analyze, curate and monetize the data, and who also has the soft skills to navigate the organization and be a change agent who inspires the rank and file. How do you recruit and retain talent? It seems to be a major challenge.

It is getting the right expertise in place, and Inderpal talked about it in his keynote: the very first thing he did was bring in talent. Sometimes it's from outside of your company. Maybe you have talent that has grown up in your company; maybe you have to go outside. But you've got to bring the right skills together, form the team that understands the technology and the business side of things, and build that team. That is essential for you to be a successful CDO. And to some extent, that's what Inderpal has done, and that's what the analytics CDO's office has done. Seth Dobrin, my boss, is the analytics CDO, and he and the analytics CDO team hired people with different skills, data engineering skills, data science skills, visualization skills, and put this team together, which understands how to collect, govern, curate and analyze the data and then apply it in specific situations.

A lot of talk about AI at this conference, which seems to be finally happening. What do you see in the field, or perhaps in projects that you've worked on, as examples of AI that are really having a meaningful business impact?

Yeah, Paul, it's a very good question, because the term AI is overused a lot, as you can imagine; there's a lot of hype around it. But I think we are past that hype cycle, and people are looking at how to implement successful use cases, and I stress the word use case. In my experience, the "how am I going to transform my business" in one big boil-the-ocean exercise does not work. But if you have a very specific, bounded use case that you can identify, where the business tells you this is relevant and what the metrics for success are, and then you focus your efforts on that specific use case with the skills needed for it, then it's successful. So, examples of use cases come from across the industries, everything that you can think of. There are customer-facing examples, like: how do I read the customer's mind? If I'm a business and I interact with my customers, can I anticipate what the customer is looking for, maybe for a cross-sell or upsell opportunity, or maybe to reduce the call handling time when a customer calls into my call center, or to segment my customers so I can run a proper promotion or campaign for each customer? There are also examples of applying this internally to improve processes, such as capacity planning for your infrastructure. Can I predict when a system is likely to have an outage? Or can I predict the traffic coming into my systems, into my infrastructure, and provision capacity for that on demand? All of these are interesting applications of AI in the enterprise.

One of the things we keep hearing is that we need data to tell a story, that the data needs to be compelling enough so that the data scientists get it, but also so that the other kinds of business decision makers get it too. So what are the best practices that have emerged from your experience for getting your data to tell the story you want it to tell?

Well, if the pattern doesn't exist in the data, then no amount of fancy algorithms can help, you know? Sometimes it's like searching for a needle in a haystack. So I guess the first step is, like I said, what is your use case? Once you have a clear understanding of your use case and the success metrics for that use case, do you have the data to support that use case? For example, if it's fraud detection, do you actually have the historical data to support the fraud use case? Sometimes you may have transaction data from your core enterprise systems, but that may not be enough. You may need to augment it with external data, third-party data, maybe unstructured data that goes along with your transaction data. So the question is: can you identify the data that is needed to support the use case? And if so, is that data clean? Do you understand the lineage of the data, who has touched and modified the data, who owns the data, so that you can then start building predictive models, machine learning and deep learning models, with that data? So: use case; the data to support the use case; an understanding of how the data reached you. Then comes the process of applying machine learning algorithms and deep learning algorithms against the data.
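[Editor's note: As a rough sketch of that data-readiness step for a fraud use case, here is what augmenting core transaction data with a third-party feed and running basic cleanliness checks might look like in pandas. All file and column names below are invented for the illustration.]

```python
import pandas as pd

# Hypothetical inputs: core transaction data plus an external third-party feed.
# File and column names are invented for illustration.
transactions = pd.read_csv("core_transactions.csv")    # from core enterprise systems
external = pd.read_csv("third_party_risk_scores.csv")  # augmenting external data

# Augment the transaction data with the third-party data.
data = transactions.merge(external, on="customer_id", how="left")

# Is the data clean? Basic checks before any model building.
missing = data.isna().mean().sort_values(ascending=False)
print("Fraction missing per column:\n", missing.head())

duplicates = data.duplicated(subset="transaction_id").sum()
print("Duplicate transactions:", duplicates)

# Does the historical data actually support the fraud use case?
# For instance: are there enough labeled fraud cases to learn from?
print("Labeled fraud rate:", data["is_fraud"].mean())
```

[Only once checks like these pass does it make sense to move on to the modeling step described next.]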
One of the risks of machine learning, and particularly deep learning, I think, is that it becomes kind of a black box, and people can fall into the trap of just believing what comes back, regardless of whether the algorithms are really sound or the data is sound. What is the responsibility of data scientists to, sort of, show their work?

Yeah, Paul, this is a fascinating and not completely solved area. Bias detection: can I explain how my model behaved? Can I ensure that the models are fair in their predictions? There is a lot of research, a lot of innovation happening in this space, and IBM is investing a lot in this space; we call it trust and transparency. Being able to explain a model has multiple levels to it. You need some level of AI governance itself. So just as we talked about data governance, there is the notion of AI governance, which is: what version of the model was used to make a prediction? What were the inputs that went into that model? What were the features that were used to make a certain prediction? What was the prediction, and how did that match up with ground truth? You need to be able to capture all of that information. But beyond that, we have actual mechanisms in place, which IBM Research is developing, to look at bias detection. In pre-processing, during execution and in post-processing, can I look for bias in how my models behave, and do I have mechanisms to mitigate it? One example is the open source Python library called AI Fairness 360, or AIF360, which comes from IBM Research and has been contributed to the open source community. It gives you mechanisms to look at bias and to provide some level of bias mitigation as part of your model building exercises.
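[Editor's note: For reference, here is a minimal sketch of AIF360's pre-processing bias check and mitigation (pip install aif360). The dataset and the choice of "sex" as the protected attribute follow the library's own examples; a real project would substitute its own data and attributes.]

```python
# Minimal AIF360 sketch. The dataset and protected attribute follow the
# library's own examples; your own data and attributes will differ.
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

data = AdultDataset()  # requires the UCI Adult data files to be downloaded locally

# Pre-processing check: measure bias before training anything.
metric = BinaryLabelDatasetMetric(
    data, unprivileged_groups=unprivileged, privileged_groups=privileged
)
print("Disparate impact before mitigation:", metric.disparate_impact())

# Mitigation: reweigh training examples to decouple outcomes from the
# protected attribute before model building.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
data_transf = rw.fit_transform(data)

metric_transf = BinaryLabelDatasetMetric(
    data_transf, unprivileged_groups=unprivileged, privileged_groups=privileged
)
print("Disparate impact after reweighing:", metric_transf.disparate_impact())
```

[A disparate impact close to 1.0 after reweighing suggests the mitigation helped; the same groups would then be monitored during execution and in post-processing, as described above.]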
And does the bias mitigation have to do with, and I'm going to use an IBM term of art here, the human in the loop? How much are you actually looking at the humans that are part of this process?

Humans are, at least at this point in time, very much in the loop. This notion of pure AI, where humans are completely outside the loop: we are not there yet. So it is very much a question of whether the system can provide a set of recommendations and a set of explanations, and whether someone who understands the business can look at them and take corrective action as needed.

There have been, however, to Rebecca's point, some prominent people, including Bill Gates, who have speculated that AI could ultimately be a negative for humans. What is the responsibility of companies like IBM to ensure that humans are kept in the loop?

I think, at least at this point, IBM's view is that humans are an essential part of AI. In fact, we don't even use the term artificial intelligence that much; we call it augmented intelligence, where the system presents a set of recommendations, expert advice, to the human, who can then make a decision. For example, my team worked with a prominent healthcare provider on models for predicting patient death in the case of sepsis onset. We are talking literally life-and-death decisions being made. And this is not something that you can just automate, throw into a magic black box, and have a decision made for you. This is absolutely a place where people with deep domain knowledge are supported, are augmented, with AI to make better decisions. That's where I think we are today. As to what will happen five years from now, I can't predict that yet.

Well, I actually want to bring this up with both of you. So you are helping doctors make these decisions, not just saying, this is what the computer program says about this patient's symptoms, but really helping the doctor make better decisions. What about the doctor's gut and his or her intuition, too? What is the role of that in the future?

I think it goes away. I mean, I think intuition really will be trumped by data in the long term, because you can't argue with the facts, much as some people do these days. But I don't know. I'm interested in your perspective on that. Should there always be a human on the front lines who is being supported by the back end? Or could you see a scenario where an AI is making customer-facing decisions that really are life-and-death decisions?

I think in the consumer industry, I can definitely see AI making decisions on its own. Take, let's say, a recommender system, which says: John Thomas bought these last five things online, so he's likely to buy this other thing; let's make an offer to him. I don't need a human in the loop for that. No harm, right? It's pretty straightforward, and it's already happening in a big way. But when it comes to some of these...

Approving a mortgage, how about that one?

That's where bias creeps in a lot, yeah.

Even that, I think, can be automated, if the thresholds are set to what the business is comfortable with. Where it says, okay, above this probability level I don't really need a human to look at this, but if it is below this level, I do want someone to look at it, that is relatively straightforward. But if it is a decision about life-or-death situations, or something that affects the very fabric of the business that you are in, then you probably want a domain expert to look at it. And most enterprise use cases lean towards that category.

Right. These are big questions. These are hard questions.

These are hard questions, yes.

Well, John, thank you so much for coming on theCUBE. We really had a great time with you.

No, thank you for having me.

I'm Rebecca Knight, for Paul Gillin. We will have more from theCUBE's live coverage of the IBM CDO Summit here in Boston, just after this.