From Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium 2019, brought to you by SiliconANGLE Media.

Welcome back to Cambridge, Massachusetts, everybody. You're watching theCUBE, the leader in live tech coverage. We're here covering the MIT CDOIQ conference, day two, and we're wrapping up with Bob Parr. He's a partner and principal at KPMG, and he's joined by Sreekar Krishna, who's the Managing Director of Data Science, AI, and Innovation at KPMG. Gents, welcome to theCUBE.

Thank you.

All right, so let's start with your roles. So Bob, where do you focus?

Within KPMG, we've got three main business lines, audit, tax, and advisory, and I'm the advisory Chief Data Officer. So I'm more focused on how we use data competitively in the market, more the offense side of our focus. How do we make sure that our teams have the data they need to deliver as much value as possible, working in concert with the enterprise CDO, who's more focused on our infrastructure, our standards, security, privacy, and those.

So you're focused on making KPMG better, as opposed to a KPMG client, okay.

I also have a second hat: I serve financial services CDOs as well.

Okay, so you get dragged out by the sales guys.

Exactly.

And Sreekar, what's your role?

Yeah, I focus a lot on data science, artificial intelligence, and overall innovation. I represent a center of excellence within KPMG that focuses on AI, machine learning, and natural language processing. And I work with Bob's division to advance the data side of the story, because all AI needs data, and without data there are no algorithms. So we are focusing a lot on how we use AI to make data better. Think about data quality, think about data lineage, think about all of the problems that data has. How can we make it better using algorithms? I focus a lot on that, working with Bob.
So Bob, focus on internal data, KPMG?

No, it's customers and internal. I mean, we are a horizontal within the firm, so we help customers, we help internal, we focus a lot on the market.

So Bob, you mentioned using data offensively. Ten, twelve years ago, data was a liability. You had to get rid of it, you kept it no longer than you had to, because you were going to get sued, so email archiving came in. Obviously things flipped after big data took off, so what are you seeing in terms of that shift from the defensive use of data to the offensive use? What does that all mean?

Yeah, and let me define offense versus defense. On the defense side, historically, that's where most of the CDOs have played. That's risk, regulatory, reporting, privacy, even litigation support, those types of activities. And until about a year and a half ago, we saw most CDOs still really anchored in that. I run a forum with a number of CDOs in financial services; every year we get them together and we ask them the same set of questions. This was the first year where they said, you know what, my primary focus now is growth, it's bringing efficiency, it's trying to generate value on the offensive side. It's not like the regulatory work is going away, certainly in the face of some of the pending privacy regulation. But it's a sign that the volume of use cases, as the investments in their digital transformations start to kick up, the volumes of data that are available, the raw material in terms of third-party data and the general volumes streaming into their organizations, and the overall literacy in the business units are creating this massive demand. And so they're having to respond. They're having to rethink that.
Is this because they're getting a handle on the data, they're actually finding where it is, they're categorizing it, they're organizing it?

That is still a challenge. I think it's better when you have a very narrow scope of critical data elements, going back to the structured data that we were talking about with the regulatory reporting. When you start to get into the offense, generating value, getting into customer experience, really exploring that side of it, there's a ton of new muscle that has to be built. New muscle in terms of data quality, new muscle in terms of a really more scalable operating model. I think that's a big issue right now with CDOs: they're used to that limited swath of CDEs, and they've got a stewardship network that's very labor intensive, with a lot of manual processes still. They have some good basic technology, but a lot of it is rules-based. And when you think about how that's going to scale when you have all of this demand, when you look at the customer experience analytics that they want to do, when you look at just AI applied to things like operations, the demand and the focus there are going to start to create a fundamental shift.

So Sreekar, one of the things that I've seen, and maybe it's just my small observation space, but I wonder if you could comment, is that it seems like many CDOs are not directly involved in the AI initiatives of the organization. Clearly the chief digital officer is involved, but the CDOs are kind of in the background still. Are you seeing that, and is that changing?

That's a fantastic question, and I think this is where we are seeing some of the cutting-edge change that is happening in the industry. When Bob presented the idea that we can look at data offensively, this is what it is: CDOs for a long time have been more reactive in their roles, and that is starting to come to the forefront now.
So a lot of institutions we are working with are asking, what's the next-generation role of a CDO, and why are they in the background, and why are they not in the foreground? And this is when you become more offensive or proactive with data. The digital officers are obviously focused on the transformation that has to happen, but the CDOs are their backbone in order to make the transformation real. And if the CDOs start to think about their data as an asset, data as a product, data as a service, the digital officers are right there, because that is the day-to-day that they are living. So the CDO can really go from being a back office to becoming a business line. We've-

Who do you see taking the reins in machine learning projects at the companies you work with? Who is driving that?

Yeah, great question. I would put them in buckets, right? There is no one model that fits all. We are seeing different generations within the companies. Some of the ones who are just testing out the market are still keeping it in their technology space, in their back office tech, IT, and forward IT, let me call that, where they are starting to experiment with this. But the mature organizations on the other end of the spectrum are integrating machine learning and AI right into the business line, because they want the CXOs to have the technology right by their side so they can leverage AI and machine learning for the business, right there. And that is where we are seeing some of the new models come out.

I think the big shift from a CDO perspective is using AI to prep data for AI. Where the data science was fundamentally distributed, some of that data science has to come back.

So for data integration, for data quality.

Yeah, for data prepping, because you've got all this data, third party and other, and from customers, streaming into the organization.
And the work that you're doing around anomaly detection transcends developing the rules, doing the profiling, doing the rules, the very manual, very labor-intensive process. You've got to get away from that.

So you're talking about using it for this to be scalable. Algos and AI to figure out which algos to apply?

To clean, to prepare the data, to see what algorithms we can use. So it's basically what we are calling AI for data, rather than just data leading into AI. We've developed a technology for one of our clients, a pretty large financial services organization. They were getting close to a billion data points every day, and there was no way you could manually go through the same quality controls and all of those processes. So we automated it through algorithms. These algorithms are learning the behavior of the data as it flows into the organization, and they are able to proactively tell where problems are starting to emerge. This is the new phase that we see in the industry. You cannot scale traditional data governance using manual processes. We have to go to that next generation with AI and natural language processing. Think about unstructured data, right? I mean, like 90% of the organization's data is unstructured. And we have not talked about data quality, we have not talked about data governance, for a lot of these sources of information. Now is the time. AI can do it.

And I think that raises a great question as you look at unstructured data, and a lot of the data sources, as you start to take more of an offensive stance, will be unstructured. What it means to apply data quality isn't the profiling and the rules generation, the way you would with standard data. So the teams, the skills that CDOs have in their organizations have to change. You have to start to, and it's a great example where you guys were ingesting documents and there was handwriting all over the documents.

Yeah, great example, Bob.
Like, we would ask the client, is this document going to scan into the system so my algorithm can run? And they're like, yeah, yeah, everything is good, the data is there. But when you then start scanning it, you realize there's handwriting, and the information is in the handwriting. So all the algorithms break down. Now how do you-

Yeah, this is tribal knowledge.

Tribal knowledge, exactly, exactly. So that's what we are seeing. If we talk about the digital transformation in data, in the CDO organization, it is this idea that nothing is left unseen. Some algorithm or some technology has seen everything that is coming into the organization, has a pair of eyes on it, so it can tell you where the problems are. And this is what algorithms do. They scale beautifully.

So the data quality approaches are evolving, sort of changing. Rather than a heavy, heavy emphasis on masking or deduplication and things like that, which you would traditionally think of as part of data quality. Not that that goes away, but it's got to evolve to use machine intelligence.

So what kind of skill sets do people need to achieve that? Is it the same people, or do we need to retrain them or bring in new skills?

Yeah, great question. And I can talk from the perspective of where AI is disrupting every industry that we know of, right? But when you look at what skills are required, all of AI, including natural language processing and machine learning, still requires a human in the loop. That is the training that goes in there. And who are the people who have that knowledge? It is the business analysts. It's the data analysts who are the knowledge bearers. The C-suite and the CDOs are able to make decisions, but the day-to-day is still with the data analysts.

Those SMEs.

Those SMEs. So we have to upskill them to really start interacting with these new technologies, where they are the leaders rather than just waiting for answers to come through.
And when that happens, for me as a data scientist, my job is easy because the SMEs are there. I deploy the technology. The SMEs train the algorithms on a regular basis. Then it is a fully fungible model which is evolving with the business. No longer am I spending time re-architecting my rules and working out what masking capabilities I need to have. It is evolving as the business goes.

Does that change the number one problem that you hear from data scientists, which is that 80% of their time is spent on wrangling and cleaning data, and 10, 15, 20% on the fun stuff?

Do you run into SMEs being concerned that they're going to be replaced by the machine they're training?

I actually see them being really enabled now. Where they were spending 80% of their time doing the boring job of looking at data, now they're spending 90% of their time looking at the elements which are creative and which require human intelligence to say, hey, this is different because of X, Y, and Z.

It sounds like a lot of what machine learning is being used for now in your domain is to clean things up. It's plumbing, it's basic foundation work. So go out three years, after all that work has been done and the data is clean. Where are your clients talking about going next with machine learning?

Bob, do you want to take a look at that?

I mean, it varies by industry, obviously, but it covers the gamut, and it's generally tied to what's driving their strategy. If you look at a financial services organization as an example, today you're going to have AI driving a lot of the behind-the-scenes work on the customer experience. You know, today with your credit card company it's behind the scenes doing fraud detection. That's going to continue. So when you take the critical functions where more data makes better models, that's just going to explode. And I think you can look across all the functions, from finance to marketing to operations.
I mean, it's going to be pervasive across all of that.

So if I may add on top of what Bob was saying, I think what our clients are asking is, how can I accelerate the decision making? Because at the end of the day, all our leaders are focused on making decisions, and all of this data science is leading up to the decision. And today you see what you brought up: 80% of the time is wasted in cleaning the data, so only 20% of the time is spent on real experimentation and analytics. Your decision-making time was reduced to 20% of the effort put into the pipeline. What if I can now make it 80% of the time that I put into the pipeline? Better decisions are going to come out the other end. So when I go into a meeting and I say, hey, can you show me what happened in this particular region or in this particular part of the country, previously it would have been, oh, can you come back in two weeks? I will have the data ready and I will tell you the answer. But in two weeks, the business has run away, and the CDO or the C-suite doesn't need the same answer. Where we are headed is, as the data quality improves, you can get to real-time questions and decisions.

So decision support, business intelligence.

That's right, yes.

Well, we're getting better as an industry. It used to be six months to build a cube, right? Now it's two weeks, which is still not good enough when we're moving this fast. As I was saying, data is plentiful, insights aren't.

Yes.

So in your view, will machine intelligence finally close that gap and get us closer to real-time decision making?

It will eventually, but there's so much that our industry needs to understand first and really ingrain. Today there are still fundamental trust issues with AI. We've done a lot of work.

Why, because it's a black box, or?

Yeah, part of it, part of it.
I think of the research we've done, and some of this covers nine countries and 2,400 senior executives. We asked them a lot of questions around their data and trust and analytics, and 92% of them came back saying they have some fundamental trust issues with their data and their analytics. And they feel like there's reputational risk, material reputational risk. This isn't getting one little number wrong on one of the reports. That's more of a systemic issue. We also do a CEO study, and we've done this many years in a row. Going back to 2017, we started to ask them, okay, a lot of companies are data driven. When it comes to-

Or they say they're data driven.

Well, they say they're data driven, and that's the point. At the end of the day, when you're making a strategic decision and you have an insight that's not intuitive, do you trust your gut, or do you go with the analytics? Back then, 67% said they go with their gut. So, okay, this is 2017. This industry's moving quickly. There's tons and tons of investment. Look at it in 2018.

Go down?

No, it went up. 78%.

They haven't read Moneyball.

So it's not an awareness issue. There is something more fundamentally wrong, and you hit on it: part of it is the black box, part of it is the data quality, and part of it is bias. There are all of these things flowing around it. And so when we dug into that, we said, well, okay, if that exists, how are we going to help organizations get their arms around this and start digging into that trust issue? And really, the front part is exactly what we were talking about in terms of data quality, both structured, with more traditional approaches, and unstructured, using the handwriting example and those types of techniques. But then you get into the models themselves, and one of the critical things you've got to worry about is the lineage. So from an integrity perspective, where's the data coming from? Where are the sources? Where are the change controls on some of that?
We need to look at explainability, aimed at the black box part: can you tell me the inferences, the decisions? Are those documented? And this is important for the SME, the human in the loop, to get confidence in the algorithm, as well as for that executive group, so they understand there's a structured set of processes around that.

And the Moneyball problem is actually pretty confined. It's pretty straightforward. I don't know, 32 teams, throw in the minor leagues, but the data model is pretty consistent.

True.

The problem with organizations is that no data model is consistent in any organization.

You mentioned risk, Bob. The other problem is organizational inertia, if they don't trust it. What does a P&L manager do when he or she wants to preserve their existing position? They attack the data. I don't believe that. Let me hear.

Well, which is a fundamental point, which is culture. I mean, you can have all the data science and all the governance that you want, but if you don't work on culture in parallel with all this, it's not going to stick. And I think a lot of the leading organizations are starting to really dig into this. We hear a lot about literacy. We hear a lot about top-down support. What does that really mean? It means senior executives are placing bets and demonstratively linking the data and the role of data, data as an asset, into their strategies, and then messaging it out and being specific about the types of investments that are going to reinforce that business strategy. So that's absolutely critical. And then literacy is absolutely fundamental as well, because it's not just the executives and the data scientists that have to get this. It's the guy in ops that you're trying to get to. They need to understand not only tools, and it's less about the tools, but the techniques, so that the approaches being used are more transparent, and so that they're starting to also understand the issues of privacy and data usage rights.
That's also something that we can't leave at the curb with all this innovation.

It's also believing that there's an imperative. I mean, for all the talk about digital transformation, and you hear it everywhere, everybody's trying to get digital right, there's still a lot of complacency in the organization, in the lines of business, in operations. They say, hey, we're actually doing really well. We're in financial services or healthcare. It really hasn't been disrupted. Everybody says, oh, it's coming, it's coming. But there's still a lot of, well, I'll be retired by then, or hanging on to the past.

Well, actually, it's also the fact that in the previous generation, if I had to go do some shopping, I would go into a shop, and if I wanted to buy an insurance product, I would call my insurance agent. But today, in the new world, it's just a tap on my screen. I just have to go from Amazon to some other app. And this is real. This is what is happening to all of our clients. Previously, they thought that customers put them in different experience buckets. Not anymore. It's real, right in front of them. So if you don't get into the digital transformation, a customer is not going to give you a pass by saying, oh, you are not Amazon, so I'm not going to expect that. You're still on my phone, and you're only two taps away. So you have to become real with digital.

I was a little surprised that you said you see the next stage as being decision support rather than customer experience, because we hear that for CEOs, customer experience is top of mind right now.

There are two differences, right? External facing, it is absolutely the customer. Internal facing, it's absolutely the decision making. Because that's how they are separating the internal versus the external.
And in most of the meetings that we go to, customer insights is the first place where analytics is starting, where data is being cleaned up, where questions are being asked, like, can I master my customer records? Can I do a good master of my vendor list? That is where they start, but all of that leads to good decision making to support the customers. So it's that external view leading toward the internal view.

Well, and back to the offense versus defense and the shift, it absolutely is on the offense side. So it is with the customer. And that's a more direct link to the business strategy. So that's the area that's gained the money and the support, and people feel like they're making an impact with it there. When it's down here in some admin area, it's below the water line. And even though it's important and it flows up here, it doesn't get the visibility. So that's the big show.

Guys, great conversation. Thanks for coming on. We've got to leave it there. Thank you for watching. We'll be right back with our next guest. Dave Vellante with Paul Gillin, from MIT CDOIQ. We'll be right back. You're watching theCUBE.