>> Announcer: From the campus of MIT in Cambridge, Massachusetts, it's theCUBE, covering the MIT Chief Data Officer and Information Quality Symposium. Now, here are your hosts, Stu Miniman and Paul Gillan.

>> Welcome back. This is theCUBE, SiliconANGLE Wikibon's traveling live streaming video platform. We travel to conferences all over the country, and all over the world in fact, and bring you live interviews with the people who make those conferences work. Thought leaders like Bill Winkler, who joins us now, Chief Technology Officer of Global IDs. Bill is giving a session later today on data quality and customer experience. Bill, thanks for joining us.

>> Thanks for having me.

>> I was interested by that topic: data quality and customer experience. What does data quality have to do with customer experience? Can you make that real for us?

>> So if you think about what's happening today, many organizations are trying to reshape their customer experience. They're digitizing it; they're moving it to mobile devices. In the past, if you called a customer service rep, the rep could buffer whatever data quality issues they were dealing with, but now those issues are front and center on the screen. As an example, you went to a website and ordered a t-shirt, and what was delivered was something different from what you ordered. That's indicative of some sort of data quality or system problem. So we believe there's a relationship between the customer experience and the defects in the data that are driving those interactions.

>> And you talk about metrics, data quality metrics, and really knowing where those misfires are. What kind of metrics are you talking about?

>> It could be a variety of things, but to use an example from telecom, my background: service orders drive everything. The customer orders equipment, they order network services, and small defects in the order, missing pieces of information, wrong addresses, will cause automated processes to fail, which invariably causes rework. Some of that results in an inferior customer experience.

>> So Bill, most people we talk to about data are either using data to create new revenue streams or leveraging data to help save money. Talk to us about how Global IDs is helping customers make or save money with data.

>> Making or saving money. A lot of firms have a surprising amount of duplicate data, and they don't actually realize the extent to which they have it. So what our software does is scan: we use automation to discover what the data means, but also where you have duplication. We can tell that it's the same content, or roughly the same content. We can also tell you where it originated. Through profiling, we can say: this data set appears to originate from this system, and then it flows an hour later to here, two hours later to here. There's a tremendous amount of money tied up in, say, SAN storage. If you can cut your SAN storage, if you move data to a lower-cost tier like NFS, as an example, you can realize savings. We can show you where you have dormant databases so you can retire them, or at least build strategies to reduce costs.
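[Editor's note: to make the profiling idea above concrete, here is a minimal sketch of value-overlap profiling, which flags columns in two extracts that likely hold the same content even when the labels differ. The function names, the Jaccard measure, and the 0.8 threshold are illustrative assumptions, not Global IDs' actual algorithms.]

```python
# Illustrative sketch only: flag columns in two tables that likely hold
# the same data, even when their labels differ. The Jaccard measure and
# threshold are assumptions, not Global IDs' implementation.
import pandas as pd

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets of distinct values."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def likely_duplicates(left: pd.DataFrame, right: pd.DataFrame,
                      threshold: float = 0.8) -> list:
    """Pair up columns whose distinct values overlap heavily."""
    matches = []
    for lcol in left.columns:
        lvals = set(left[lcol].dropna().astype(str))
        for rcol in right.columns:
            rvals = set(right[rcol].dropna().astype(str))
            score = jaccard(lvals, rvals)
            if score >= threshold:
                matches.append((lcol, rcol, score))
    return sorted(matches, key=lambda m: -m[2])

# A CRM extract and a billing extract that label the same customer
# identifier differently; the profiler matches them on values alone.
crm = pd.DataFrame({"cust_id": ["C001", "C002", "C003"],
                    "name": ["Ann", "Bo", "Cy"]})
billing = pd.DataFrame({"customer_ref": ["C001", "C002", "C003"],
                        "amount": [10.0, 20.0, 30.0]})
print(likely_duplicates(crm, billing))  # [('cust_id', 'customer_ref', 1.0)]
```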
>> I've heard estimates that in some companies as much as 85% of corporate data is in Excel spreadsheets, sitting on desktops, not even on a corporate SAN. What do you do about that problem? How do you get data out of the hands of the people who cling to it and into some sort of shared space where it can actually be managed for quality?

>> It's a good question. I think that varies depending on the industry; it varies based on what they're doing. We do see a lot of Excel spreadsheets, and we can profile those as well. But I don't have a good answer for that question.

>> But you can profile them? How do you do that?

>> Anything in Excel, and different forms of documents that can be converted into tabular form, we can bring in and make sense of. So if you extract data from a relational database into Excel, the same data elements are going to flow through. You may have different labels on the columns, but our software doesn't care. It's actually looking at data values, patterns, and relationships among different columns to say: this is the same stuff, and this is where this Excel spreadsheet likely came from.

>> What about the value of the data itself? Not all data is created equal. How do you separate the high-quality data from the low-quality, and help customers work through that?

>> That's a great question. Some of it is taking advantage of that duplication. The most valuable data, in our experience, is data that is duplicated, because that's what people want. Seeing the same stuff over and over again in multiple databases, the software can correlate and say: ah, this must be important. I don't necessarily know what it means, but I see the same patterns over and over again. The second part of your question was around quality. Consistency is an issue. As things get duplicated from place to place, there may be processes to keep them synchronized, but occasionally operational errors or other things cause them to get out of sync. Knowing where the data originated, we can measure how close a replica is to the original, and we can find the places where you should perhaps invest some money to clean things up. And it's through this domain profiling that an organization that might have 70 million columns of data distills down to something on the order of about 15,000 data elements that are common across those columns, of which maybe 500 are core to the business. That seems to be pretty common.

>> Is there a technology solution to this data getting out of hand? Blockchain, or something that can stamp this as the one true, legitimate version of the truth?

>> I wish. I don't have an answer to that either.

>> Well then, talk about where your customers are looking for data quality. IBM's C-suite study last year found that customer experience was the number one priority of business executives, not just in North America but in Europe and elsewhere in the world. Where are they investing now in data quality as a driver of better customer experience?

>> I will tell you that what we're proposing is more of an experiment to try to answer those questions. Thomas Redman has said many times, and he's written a book on it, Data Driven, that the same techniques we use in manufacturing to control quality, Six Sigma control charts, apply to data as well. So what we're proposing is to work with customers who can identify a customer service problem that relates to data, and use those same techniques not only to demonstrate the cost of the errors, but also, once we correct them, to demonstrate the savings you get. We think that if we can get a few key projects like that, that story, that example, can build on itself.
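[Editor's note: Redman's manufacturing analogy can be sketched directly. Treat each day's batch of service orders as a sample, chart the fraction with a defect, and flag days that fall outside the control limits. The limits below follow the standard Shewhart p-chart formula; the daily counts and sample size are invented for illustration.]

```python
# Hedged sketch of a p-chart applied to data defects, per Redman's
# manufacturing analogy. Limits use the standard p-chart formula;
# the sample data is invented for illustration.
import math

def p_chart(defectives: list, sample_size: int):
    """Center line and 3-sigma control limits for daily defect rates."""
    rates = [d / sample_size for d in defectives]
    p_bar = sum(rates) / len(rates)                    # center line
    sigma = math.sqrt(p_bar * (1 - p_bar) / sample_size)
    ucl = min(1.0, p_bar + 3 * sigma)                  # upper control limit
    lcl = max(0.0, p_bar - 3 * sigma)                  # lower control limit
    return p_bar, lcl, ucl, rates

# Daily counts of service orders with a missing field, out of 500 orders.
defective_orders = [12, 9, 15, 11, 40, 10, 13]         # day 5 looks suspect
p_bar, lcl, ucl, rates = p_chart(defective_orders, 500)
for day, r in enumerate(rates, 1):
    flag = "ok" if lcl <= r <= ucl else "OUT OF CONTROL"
    print(f"day {day}: defect rate {r:.3f} [{flag}]")
print(f"center {p_bar:.3f}, limits [{lcl:.3f}, {ucl:.3f}]")
```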
>> Customer satisfaction is notoriously difficult to measure accurately. Before we began recording, we were talking briefly about the Net Promoter Score, which is considered by many people a kind of gold standard of customer satisfaction. You said that even that score can be manipulated.

>> You see it when you go to your automobile dealership to have your oil changed and they say, "By the way, rate me a nine or a ten." That's evidence that the system is in place, that there's an emphasis in that company on improving customer service, but it's backfiring. The other thing you see with the Net Promoter Score is that it's a lagging indicator: you can't ask people over and over again, "How was your experience? How was your experience?" So the thought process is that there's some correlation, we believe, between the data defects, the data that is feeding your experience, and the experience itself. If we can measure that, then maybe we have a surrogate measurement that we can use to gauge customer experience.

>> So Bill, your company has been here for a couple of years, and last year we interviewed your CEO here on theCUBE. I'm curious, as you look from year to year, how are we progressing? Are companies getting better at sorting out metadata and managing their data? Are we making progress on CDO development?

>> There are places that are very encouraging. There are other aspects where it's unclear how systems become simpler. You would think that systems have gotten as complex as they possibly can, but then you merge two companies that have enormously complex systems, and there's no way to unravel it all in a reasonable length of time. If you go out 50 years, it's unclear to me how it gets sorted out. But there is a lot of interest in data. With machine learning, I see a lot of emphasis on being able to pull data in to do very innovative things. With that come privacy concerns, so we see a lot of concern around where sensitive data is stored. It's a very exciting area, but there's a lot of work left to be done.

>> And you've just got to do it. So, Bill Winkler, we're out of time. Thank you very much for joining us here.

>> Thanks for having me.

>> Best of luck with your continuing efforts to simplify the data complexity of your clients' companies.

>> Thank you.

>> This is theCUBE. We will be back shortly with more guests from the MIT Chief Data Officer and Information Quality Symposium in Cambridge, Massachusetts.
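[Editor's note: the "surrogate measurement" idea Bill describes reduces to a correlation check. If periods with higher data-defect rates reliably show lower experience scores, the defect rate can stand in for the slower, survey-based score. A hedged sketch, with all numbers invented for illustration:]

```python
# Hedged sketch: correlate weekly data-defect rates with a weekly
# customer-experience score (e.g., NPS). All numbers are invented.
import math

def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Weekly defect rate in service orders vs. that week's NPS.
defect_rate = [0.02, 0.03, 0.08, 0.04, 0.07, 0.02]
nps         = [42,   40,   25,   38,   28,   44]
r = pearson(defect_rate, nps)
print(f"correlation between defect rate and NPS: r = {r:.2f}")
# A strongly negative r would support using the defect rate as a cheap,
# continuous surrogate for the lagging survey-based score.
```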