Live from Miami, Florida, it's theCUBE, covering IBM's Data and AI Forum, brought to you by IBM.

Welcome back to the Port of Miami, everybody. We're here at the InterContinental Hotel. You're watching theCUBE, the leader in live tech coverage. Seth Dobrin is here. He's the vice president of data and AI and the chief data officer of cloud and cognitive software at IBM. Seth, good to see you again.

Yeah, good to see you, Dave. Thanks for having me.

The Data and AI Forum, hashtag Data AI. It's amazing here: 1,700 people, everybody hands-on, a real appetite for learning. What do you see out in the marketplace? What's new since we last talked?

Well, if you look at what's really needed in the marketplace, it's filling the skill shortage and figuring out how to operationalize and industrialize your AI. So there's been a real need for ways to get more productivity out of your data scientists. Not necessarily to replace them, but to get more productivity. A few months ago we released something called AutoAI, which is probably the only tool out there that automates the end-to-end pipeline, about 80% of the work, but isn't a black box. It actually kicks out code, so your data scientists can take it, optimize it further, understand it, and really feel more comfortable about it.

So it's kind of AI for AI, is that right?

That's exactly what it is: AI for AI.

How does that work? You're applying machine intelligence to data to make AI more productive, picking the best-fit algorithms?

Yeah. Basically, you feed it your data and it identifies the features that are important. It does feature engineering for you, it does model selection for you, it does hyperparameter tuning and optimization, it handles deployment, and it also monitors for bias.

So what does the data scientist do?
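AutoAI's internals aren't public, but the stages Dobrin lists (model selection, hyperparameter tuning, scoring) can be sketched with scikit-learn. Everything here, the candidate models, the parameter grids, and the synthetic data, is illustrative, not IBM's implementation:

```python
# Minimal sketch of what an AutoML tool automates: try several candidate
# models, tune each one's hyperparameters, and keep the best pipeline.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer's data set.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Model selection": a small, illustrative pool of candidate estimators.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1.0, 10.0]}),
    "forest": (RandomForestClassifier(random_state=0),
               {"clf__n_estimators": [50, 100]}),
}

best_name, best_score, best_model = None, -1.0, None
for name, (estimator, grid) in candidates.items():
    pipe = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=3)  # hyperparameter tuning
    search.fit(X_train, y_train)
    score = search.score(X_test, y_test)     # held-out accuracy
    if score > best_score:
        best_name, best_score, best_model = name, score, search.best_estimator_

print(best_name, round(best_score, 3))
```

The point Dobrin makes about "kicking out code" is the difference between this sketch and a black box: the winning pipeline is an ordinary object a data scientist can inspect and keep tuning.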
The data scientist takes the code out of the back end, makes the tweaks where AutoAI may not have gotten the model perfect, and customizes it for the needs of the business, which AutoAI may not understand. So the data scientist can apply it in a way that is unique to their business, and that essentially becomes their IP.

So it's not generic AI for everybody. It's customized, and that's exactly where data scientists complain they don't have the time, because they're wrangling data.

Exactly. And it was built by a combination of IBM Research, some of the great assets at IBM Research, plus some Kaggle masters who work here at IBM, who really designed and optimized the algorithm selection and things like that. At the keynote today, Wunderman Thompson was up there talking, and this is probably one of the most impactful use cases of AutoAI to date. My former team, the Data Science Elite team, was also engaged. Wunderman Thompson had this problem where they had about 17,000 features in their data sets, and they wanted to be able to offer a custom solution for each of their customers. So every time they got a customer, they had to have a data scientist sit down and figure out which features were the right ones and how to engineer them for that customer. It was an intractable problem for them. The person from Wunderman Thompson who presented today said he'd been trying to solve this problem for eight years. AutoAI plus the Data Science Elite team solved it in two months, and after that two months, it went right into production. So in this case, AutoAI isn't doing the whole pipeline. It's helping them identify and engineer the features that are important and giving them a head start on the model.

What's the acquisition model for AutoAI? Is it a licensed software product? Is it SaaS? How do I get it?
It's part of Cloud Pak for Data, and it's available on IBM Cloud. On IBM Cloud you can use it pay-per-use, with a license as part of Watson Studio. If you invest in Cloud Pak for Data, it can be a perpetual license or a committed-term license, which is essentially a SaaS model, and AutoAI is essentially a feature add-on of Cloud Pak for Data.

And you're saying it can be usage-based. So that's key, right? It's consumption-based.

Cloud Pak for Data is all consumption-based.

So people want to use AI for competitive advantage. I said in my open that we're not marching to the cadence of Moore's Law in this industry anymore; it's a combination of data, AI, and then cloud for scale. You've talked about some things folks are doing to gain that competitive advantage. But at the same time, we heard from Rob Thomas that AI penetration is only about four to 10%. What are the key blockers that you see, and how are you knocking them down?

Well, there are a number of key blockers. One is access to data. Companies have tons of data, but being able to even know what data is there, pull it all together, and do it in a way that is compliant with regulation is hard. You can't do AI in a vacuum; you have to do it in the context of ever-increasing regulation like GDPR, CCPA, and all the other data-privacy regulations popping up. So that's really two blockers: access to data, and regulation. The third is access to appropriate skills, which we talked a little bit about. How do you retrain or upskill the talent you have, and how do you bring in new talent that can execute what you want? And then in some companies, it's a lack of strategy with appropriate measurement: what is your AI strategy, and how are you going to measure success?
And you and I have talked about this on theCUBE before: you've got to measure your success in dollars and cents, cost savings and net new revenue. That's really all your CFO cares about, and that's how you have to measure and monitor your success.

Yeah, so that last one is probably where most organizations start: let's prioritize the use cases that will give us the best bang for the buck. And then the business guys probably get really excited and say, okay, let's go. But to truly operationalize it, you've got to worry about these other things, the compliance issues, and you've got to have the skill sets to scale.

And sometimes that first thing you said is actually a mistake. Focusing on the use case with the most bang for the buck is not necessarily the best place to start, for a couple of reasons. One, you may not have the right data; it may not be available, or it may not be governed properly. Two, the business you're building it for may not be ready to consume it. They may not be bought in, or the processes may need to change so much that it's not going to get used. And you can build the best AI in the world; if it doesn't get used, it creates zero value. So for the first couple of projects, you really want to focus on the ones where you can deliver the best value, not necessarily the most value, in the shortest amount of time, and ensure they get into production. Because especially when you're starting off, if you don't show adoption, people are going to lose interest.

Seth, what are you seeing in terms of experimentation in the customer base? When you talk to buyers and look at the IT spending surveys, people are concerned about tariffs, the trade wars, the 2020 election; they're being a little bit cautious.
But in the last two or three years, there's been a lot of experimentation going on, and a big part of that is AI and machine learning. What are you seeing in terms of that experimentation turning into actual production projects that we can learn from, and maybe use to run some new experiments?

Yeah, and I think it depends on how you're doing the experiments. There's academic experimentation, where you have data science teams that go work on cool stuff that may or may not have business value and may or may not be implemented. The business isn't really involved; the teams just latch on and do projects. That's actually bad experimentation if you let it run your program. Good experimentation is when you have a strategy, you identify the use cases you want to go after, and you experiment using agile methodologies to deliver. You deliver value in two-week sprints, and you start delivering value quickly. In the case of Wunderman Thompson again: eight weeks, four sprints, they got value. That was an experiment, but because it was done with agile methodologies, good coding practices, and good design-up-front practices, they were able to take it and put it right into production. If, at the end of your experimentation, you have to rewrite your code, it's a waste of time.

To your earlier point, the moonshots can often be too risky, and if you blow it on a moonshot, it can set you back years. So you've got to be careful: pick your spots, pick projects that are representative but lower risk, apply agile methodologies, get a quick return, learn, develop those skills, and then build up to the moonshots.

Or you break that moonshot down into consumable pieces, right?
Because the moonshot may take you two years to get to, but maybe there are sub-components of it that you could deliver in three or four months. You start delivering those, and you work up to the moonshot.

I always like to ask about dogfooding, or as I like to call it, sipping your own champagne. What have you guys done internally? When we first met, it was a snowy day in Boston at the Spark Summit, years ago. You'd made a big career switch, and it's obviously working out for you. You were brought in, in part alongside Inderpal, to help IBM really become data-driven internally. How has that gone? What have you learned, and how are you taking that to customers?

Yeah, so I was hired three years ago now, I believe, to lead our internal transformation. Over the last couple of years, I don't want to say I was distracted, but there were really important business things I needed to focus on, like GDPR, helping our customers get up and running with data science and AI, and building the Data Science Elite team. As of a couple of months ago, I'm back to being almost entirely focused on our internal transformation. And it's really about making sure we use data and AI to make appropriate decisions. So now we have an app on our phones that leverages Cognos Analytics, where at any point Ginni Rometty or Rob Thomas or Arvind Krishna can pull up what we call EPM, enterprise performance management, and understand where the business is. What did we do in the third quarter, which just wrapped up? What's the pipeline for the fourth quarter? It's at your fingertips. We're also working on revamping our planning cycle. Today, planning has been done in Excel; we're moving to Planning Analytics, which is a great planning and scenario-planning tool.
With the click of a button, it lets you understand how your business can perform in the future and what you need to do to get it there. We're also looking across all of cloud and cognitive software, which data and AI sits in. Within each business unit of cloud and cognitive software, the sales teams do a great job of cross-sell and upsell. But there's a huge opportunity in cross-selling and upselling across the five different businesses that live inside cloud and cognitive software: data and AI, hybrid cloud integration, IBM Cloud, cognitive applications, and IBM security. There's a lot of potential interplay that our customers have across those, and we're providing AI that helps the salespeople understand where they can create more value for our customers.

You know, it's interesting. This is the 10th year of doing theCUBE, and when we first started, it was the beginning of the big data craze. A lot of people said, okay, here's the disruption: crossing the chasm, the innovator's dilemma, all the old stuff's going away, all the new stuff's coming in. But you mentioned Cognos on mobile, and this is the thing we learned: the key ingredients of data strategies comprise the existing systems. You don't just throw those out. Those are the systems of record that were the single version of the truth, if you will, that people trusted. It goes back to trust. And then all this other stuff built up around them, which created dissonance. So it sounds like one of the initiatives you and IBM have been working on is bringing in the new pieces while modernizing the existing ones, so that you've got consistent data sets that people can work on.
Yeah, and one capability that has really enabled this transformation in the last six months, for us internally and for our clients, is something inside Cloud Pak for Data called IBM Data Virtualization. We have all these independent sources of truth, to some extent, and then we have all these other data sources that may or may not be as trusted. Data Virtualization lets you bring them together literally with the click of a button. You drop your data sources in, and the AI within Data Virtualization identifies keys across the different sources so you can link your data. You look at it, you check it, and it enables you to do this at scale. All you need to do is point it at the data: here's the IP address of where the data lives, and it will bring it in and help you connect it.

So you mentioned variances in data quality, and the consumer of the data has to trust that data. Can you use machine intelligence and AI to give you a data-confidence meter, if you will?

Yeah, there are two things we use for data confidence. I call it the dodginess factor: understanding how dodgy the data is. We definitely leverage AI. If you have a data dictionary and metadata, the AI can understand data quality, and it can also look at what your data stewards do and handle some of the remediation of data-quality issues. But in Watson Knowledge Catalog, which again is in Cloud Pak for Data, we also have the ability to vote data up and down. As teams use data internally, if there's a data set that had a high data-quality score but wasn't really valuable, it'll get voted down. Then, when you search for data in the system, it sorts the results, kind of like a search on the internet, and down-ranks that data set depending on how many down-votes it got.

So it's a wisdom-of-the-crowd type of thing?

It's crowdsourcing combined with the AI.
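The ranking idea described here, an automated quality score adjusted by crowd votes, can be sketched in a few lines. The field names and the vote weight are hypothetical stand-ins, not Watson Knowledge Catalog's actual scoring:

```python
# Sketch: combine an AI-derived data-quality score with crowd votes so a
# high-quality but down-voted data set sinks in catalog search results.
from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    quality: float   # 0..1 score from automated profiling (illustrative)
    upvotes: int
    downvotes: int

def rank_score(d: DataSet, vote_weight: float = 0.05) -> float:
    # Each net vote nudges the automated quality score up or down.
    return d.quality + vote_weight * (d.upvotes - d.downvotes)

catalog = [
    # High quality score, but users found it unhelpful and voted it down.
    DataSet("sales_q3", quality=0.92, upvotes=1, downvotes=8),
    # Lower quality score, but consistently voted up by its users.
    DataSet("pipeline_fcst", quality=0.80, upvotes=6, downvotes=0),
]

ranked = sorted(catalog, key=rank_score, reverse=True)
print([d.name for d in ranked])
```

Under these illustrative weights, the down-voted set falls behind the up-voted one despite its higher automated score, which is exactly the "wisdom of the crowd combined with AI" behavior described above.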
In your experience, has that changed the dynamics of politics within organizations? In other words, we've all been in meetings where somebody puts forth some data, and if the most senior person in the room doesn't like the data, or doesn't like the implication, he or she attacks the data source and the meeting's over. And that might not be the best decision for the organization.

Yeah, so I think it's maybe not the upvoting and downvoting that does that, but things like the EPM tool I mentioned. There is a single source of truth for our finance data, and it's on the phone of everyone who needs access to it. When you have a conversation about how the company or the division or the business unit is performing financially, it comes from EPM, whether that's in the Cognos app, a separate dashboard in Cognos, or being fed into an AI that we're building. This is the source of truth. Similarly for product data, how our individual products are performing: it comes from there. So the conversation at these senior meetings is no longer "your data's different from my data, I don't believe it."

You've eliminated that conversation.

This is the data; this is the only data. Now you can have a conversation about what's really important.

An adult conversation, okay.

Right: now, what are we going to do about it? It's not bickering about my data versus your data.

So what's next for you? You've been pulled into a lot of different places. You started at IBM as an internal transformation change agent, and you got pulled into a lot of customer situations because you know what you're doing, so the sales guys want to drag you along to help facilitate activity with clients. What's new, what's next for you?

So really, I've only been refocused on the internal transformation for a couple of months now.
It's about extending our cloud and cognitive software data and AI strategy across IBM and quickly starting to implement some of these projects. As I said, we're starting projects without even knowing what the prioritized list is; intuitively, this one's important, so the team's going to start working on it. One of them is the AI project I mentioned around cross-sell and upsell across the portfolio. The other one we just talked about: in the senior leadership meeting for cloud and cognitive software, how do we all work from a Cognos dashboard instead of data that's been exported and put into Excel? The challenge with that isn't that people don't trust the data; it's that if there's a question, you can't drill down. If there's a question about an Excel document or a PowerPoint that's up there, we get back to it at the next meeting in a month, or we have an email conversation about it in two weeks. If it's presented in a real live dashboard, you can drill down and answer questions in real time. The value of that is immense, because now, as a leadership team, you can make a decision at that point and decide what direction you're going to go based on data.

I said last question, but I have one more. You're a CDO, but you're a polymath. What should people look for in a chief data officer? What are the characteristics and attributes, given your experience?

Yeah, that's kind of a loaded question, because there is no single good job description for a chief data officer. I don't even think there's a well-defined set of skills for a chief data officer. Actually, as part of the Chief Data Officer Summits that you guys attend, we're holding sessions with chief data officers, kind of defining a curriculum for chief data officers with our clients, so we can help build the chief data officers of the future.
But if you look at the qualities: a chief data officer is also a chief disruption officer. It needs to be someone who is really good at driving change, really good at disrupting processes, and good at getting people excited about it. Change is hard; people don't like change. You need someone who can get people excited about change. That's one thing. And depending on what industry you're in, if you're in financial services or another heavily regulated industry, you want someone who understands governance. That's what Gartner and other analysts call a defensive CDO, very governance-focused. Then you have some CDOs, and I fit into this bucket, who are more offensive CDOs: how do you create value from data? How do you save money? How do you create net new revenue? How do you create new business models leveraging data and AI? And now there's a third type of CDO emerging, which is the CDO not as a cost center but as a P&L: how do you generate revenue for the business directly from the CDO office?

I like that framework.

I can't take credit for it; that's Gartner.

Well, it's good: governance, which they call defensive, and then offensive. And the first time I met Inderpal, he said, look, you start with how data affects the monetization of your organization, and that means making money or saving money. Seth, thanks so much for coming on theCUBE. It was great to see you again.

Yeah, thanks for having me. Good to see you again.

All right, keep it right there, everybody. We'll be back at the IBM Data and AI Forum from Miami. You're watching theCUBE.