Live from the MIT campus in Cambridge, Massachusetts, it's theCUBE, covering the 12th annual MIT Chief Data Officer and Information Quality Symposium, brought to you by SiliconANGLE Media.

Welcome back to theCUBE's coverage of MIT CDOIQ here in Cambridge, Massachusetts. I'm your host, Rebecca Knight, along with my co-host, Peter Burris. We have two guests on this segment: Courtney Abercrombie, founder of the non-profit AI Truth, and Carl Gerber, managing partner at Global Data Analytics Leaders. Thanks so much for coming on theCUBE, Courtney.

Thank you.

So I want to start by having you introduce yourselves to our viewers and tell us what you do. Tell us a little bit about AI Truth, Courtney.

This was born out of a passion from the last gig I had at IBM. Everybody knows me for the Chief Data Officer work I did, but the more recent role I had was developing custom offerings for the Fortune 500 in the AI solutions area. As I would go meet different clients, talk with them, and look at the processes for how you implement AI solutions, it became very clear that not everybody is attuned to how these things work just because they're the ones funding the project or even initiating its purpose. The business leaders don't necessarily know how these things work or run, or what can go wrong with them. And on the flip side of that, we have very ambitious, up-and-comer data scientists who are just trying to fulfill the mission, the challenge at hand. They get so swept up in it that you can see data getting bartered back and forth without any real governance over it, or policies in place to say: hey, was that right? Should we have gotten that kind of information? Which leads us into things like the creepy factor, like Target and some of these cases that are well known.
And as I saw some of these mistakes happening, mistakes that were costing brand reputation and return on investment, or even creating risk for the companies and their business leaders, I felt like someone had to take one for the team here: go out and start educating people on how this stuff actually works, what the issues can be, how to prevent those issues, and what you do to fix things when they do go wrong. So that's the mission of AI Truth.

But really, my main concern is concerned individuals, because I think we've all been affected when we've sent an email and all of a sudden we get a weird ad and we're like, hey, they should not...

It's someone reading my email.

It's someone reading my email, you know, and we feel this is unjust. And the answer is yes. Yes, they are. But we need to know, because the only way we can empower ourselves to do something is to actually know how it works. That's what my mission is. So for the concerned individuals out there, I am writing a book to encapsulate all the experiences that I had, so people know where to look and what they can actually do. Because you'll be less fearful if you know: hey, I can download DuckDuckGo for my search engine and Epic for my browser, and use some private offerings instead of the typical free offerings. There's not an answer for Facebook yet, though. We'll get there.

Carl, tell us a little bit about Global Data Analytics Leaders.

I launched Analytics Leaders and CDO Coach after a long career in corporate America. I started out building an executive information system for a four-star commander when I was in the military, and I've done a lot in data analytics throughout my career, most recently starting the CDO function at two large multinational companies and leading global transformation programs.
And what I've experienced is that even though the industries may vary a little bit, the challenges are the same and the patterns of behavior are the same, both the good and the bad habits around the data. Through the course of my career I've developed frameworks and playbooks, ways to get people to outcomes and bring new technologies like machine learning to bear to overcome the challenges I've seen. And a lot of the current thinking is that we solve these data management problems manually. We all hear the complaints about analysts and data scientists spending 78% of their time being data gatherers rather than generating insight from the data and making it actionable. Well, that's why we have computer systems, right? But that large-scale technology and automation hasn't really served us well, because we think in silos. We fund these projects based on departments and divisions; we acquire companies through mergers and acquisitions. The CDO role has emerged because we need to think about all the data an enterprise uses horizontally. And with that, I bring a high degree of automation, things like machine learning, to solve those problems. So I'm now modeling that and advising my clients.

But at the same time, the CDO role is where the CIO role was 20 years ago. It's really in its infancy. So you see companies define it differently and have different expectations, and the people filling the roles may not have done this before. So I provide coaching services there. It's like a professional golfer who has a swing coach: I come in and help data executives up their game.

Well, it's interesting. I would actually say the CIO role 40 years ago. And here's why. If we look back at the 1970s, hardcore financial systems were made possible by the technology, which allowed us to run businesses like a portfolio: Jack Welch, the GE model.
That was not possible if you didn't have a common asset management system, a common cash management system, et cetera. And so when we started creating those common systems, we needed someone who could describe how that shared asset was going to be used within a new organization. We went from the DP manager in HR and the DP manager in finance to the CIO. And in many respects, we're doing the same thing, right? We're talking about data in a lot of different places, and now the business is saying we can bring this data together in new and interesting ways into more of a shared asset, and we need someone who can help administer that process and navigate between different groups and different needs. Is that kind of what you guys are seeing?

Oh yeah. Well, you know, once I get to talking... For me, I keep going right back to the newer technologies, like AI and IoT, that are coming from outside into your organization, and the fact that we're seeing bartering of data at an unprecedented level. What the chief data officer role originally did was look at data internally, and mostly structured data. But now we're asking them to step out of their comfort zone and start looking at all these unknown, niche data broker firms that may or may not be ethical in how they operate. I mean, look, I tell people: if you hear the word "scrape," you run. No scraping. We don't want scraped data.

Well, what do you mean by scraped data? Because that's important.

It's a well-known data science practice, and it's not that anybody's being malicious here; nobody has mal-intent. I think data scientists are just scruffy. They roll up their sleeves and they get data however they can, and so the practice emerged. Look, they're brought up on open source software and everything's free, right? For them, for the most part.
So they just start reading in screens and things that are available, things you can see. They can optical-character-read it in, or get it however, without having a subscription to any of that data, without having permission to any of that data. "I can see it, so it's mine." But that doesn't work in candy stores, or jewelry stores, in my case. You can't just say, I like that diamond earring, I'm going to take it because I can see it.

No, and the implications of that are: suddenly you've got a great new business initiative, somebody finds out you used their private data in that initiative, and now they have a claim on that asset.

Right, and this is where things start to get super hairy, and you just want to make sure you're on the up and up with your data practices and your data ethics. Because, in my opinion, 90% of what's going wrong in AI, or the fear factor of AI, is that your privacy is getting violated and you're getting labeled with data that you may not even know exists half the time.

So what's the answer? As you were saying, these data scientists are scrappy, scruffy, roll-up-your-sleeves kind of people, and they are coming up with new ideas, new innovations that are sometimes good. So what is the answer? Is it a code of ethics? Something similar to a Hippocratic oath? What do you think?

It's a multi-dimensional problem. Courtney and I were talking earlier: you have to have more transparency into the models that you're creating, and that means a significant validation process. That's where the chief data officer partners with folks in risk and other areas, and with the data science team, around getting more transparency and visibility into what data is feeding into it.
Is it really the authoritative data of the company? And, as Courtney points out, do we even have the rights to the data that's feeding our models? So by bringing that transparency, and a little more validation before you actually start making key business decisions on the outcomes of these models, you need to look at how you're vetting them. And the vetting process is part technology, part culture, part process. It goes back to that people-process-technology triangle. Know where your data came from. Why are you doing this model? What are you going to do with the outcomes? Are you actually going to do something with them, or are you going to ignore them? Under what conditions will you empower a decision maker to use the information that is the output of the model? There are a lot of these things you have to think through when you want to operationalize it. It's not just: I'm going to go get a bunch of data wherever I can, I put a model together, don't you like the results?

But this is the Silicon Valley way, right? An MVP for everything, and you just let it run until you can't.

That's a great point, Courtney. And I've always believed this, and I want to test it with you: we talk about people, process, technology, and information. We never talk about the people, process, technology, and information of information. Because in many respects what we're talking about is making explicit the information about information, the metadata, and how we manage it, how we treat it, how we diffuse it, and how we turn the metadata itself into models to govern and guide utilization of all this. That's especially important in the AI world, isn't it?

For me, it's simple. Everything he said was true, but I try to keep it to this: it's about free will. If I said you can do that with my data, fine, but to me it's always my data. I don't care if it's on Facebook.
I don't care where it is, and I don't care if it's free or not. It's still my data. Even if it's 23andMe and they've taken a swab, or it's Facebook, or I did a Google search, I don't care. It's still my data. So if you ask me whether it's okay to do a certain type of thing, then maybe I will consent to it. But I should at least be given an option, and be given the transparency to know. It's all about free will. So in my mind, as long as you're always providing some sort of free will, the ability for me to say yes, I want to participate in that, or yes, you can label me as pro-Trump or pro-Hillary or whatever the issue of the day is, then I'm okay with that, as long as I get a choice.

I want to build on that, because then I want to ask you a question about it, Carl. The issue of free will presupposes that both sides know exactly what's going into the data. For example, if I have a medical procedure, I can sign that form and say whatever happens is my responsibility. But if bad things happen because of malfeasance, guess what: that piece of paper is worthless and I can sue, because the doctor and the medical provider are supposed to know more about what's going on than I do. The same thing exists here. You talked earlier about governance, cultural imperatives, and transparency. Doesn't that same thing exist? And, Courtney, is that part of your non-profit: to try to raise the bar for everybody? Because at the end of the day you do have information asymmetries. Both sides don't know how the data's being used, because of the nature of data.

Right, and that's why you're seeing the emergence of all these data privacy laws. And what I'm advising executives, boards, and my clients is that we need to step back and think bigger about this.
We need to think about it as not just GDPR, the European scope; it's global data privacy. And if we look at the motivation: why are we doing this? Are we doing it just because we have to be regulatory compliant, because there's a law on the books? Or should we reframe it and say this is really about the user experience, the customer experience; this is a touch point my customers have with my company. How transparent should I be about what data I have about you and how I'm using it, and is there a way I can turn this into a positive, instead of just doing it because I have to for regulatory compliance? So I believe if you really examine the motivation and look at it as more of a carrot and less of a stick, you're going to find that you're more motivated to do it, you're going to be more transparent with your customers, and you're ultimately going to protect that data more closely, because you want to build that trust with your customers. And lastly, let's face it, this is the data we want to analyze, right? This is the authenticated data we want to give to the data scientists. So I just flip that whole thing on its head: we do it for these reasons, and we increase the transparency and trust.

So Courtney, let me bring it back to you. That presupposes, again, an up-leveling of knowledge about data privacy, not just for the executive but also for the consumer. How are you going to do that?

Personally, I'm going to come back to free will again, and I'm also going to add harm impacts. We need to start thinking impact assessments instead of governance, quite frankly. We need to start looking at what happens if I start using a FICO score as a proxy for another piece of information, like a crime record in a certain district, as a way to understand how responsible you are and whether or not your car is going to get broken into and now you have to pay more.
If you always use a FICO score, for example, as a proxy for responsibility, well, let's face it, once a data scientist lands on something, they share it with everybody, because that's how they are, right? They love that, and I love that about them, quite frankly. But what I don't like is that it propagates, and before you know it, every AI pricing model is going to use FICO score, and the people of lesser financial means...

And they're priced out of the market.

And they're priced out of the market, and how is that fair? There's a whole group, the fairness, accountability, and transparency group, that kind of watchdogs this stuff, but I think business leaders as a whole don't really think through to that level: if I do this, then this, this, and this could occur.

So what would be the one thing you would say if corporate America is listening?

Let's do impact assessments. If you're going to cost someone their livelihood, or cost them thousands of dollars, then let's apply more scrutiny, more governance, more validation, to your point. Because not everything needs the nth level. If I present you with a blue sweater instead of a red sweater on Google or wherever, that's not going to harm you. But it will harm you if I give you a teacher assessment that's based on something you have no control over, and now you're fired, or laid off, because your rating was bad.

This is a great conversation. Let me add something, because we're saying it a different way, and tell me if you agree. In many respects, the question is: does this practice increase inclusion, or does it decrease inclusion? This is not some goofy social thing. This is: are you making your market bigger or are you making your market smaller?
Because the last thing you want is that participation by people ends with "you can't play" because of some algorithmic response we had. So maybe the question of inclusion becomes a key issue. Would you agree with that?

I do agree with it, and I still think there are levels even to inclusion. Like, being part of the blue-sweater club versus, say, I don't want to suddenly be labeled a convict because of some record you found or an association with someone else. And let's face it, a lot of these algorithmic models do these kinds of things, where they use, like, N-plus-one association, you know what I'm saying? So you're associated with the next person closest to you, and that's not always the right thing to do, right? So in some ways, and I'm positing a little bit of a new idea here, you're creating policies, whether you're being implicit about them or explicit. We were just talking about this. More likely you're being implicit, because you're just summarily deciding. Okay, in the credit score example, I have just decided that if you don't have a good credit score you don't meet the threshold, but nowhere in your corporate policies did it ever say that people of lesser financial means should be excluded from being able to get good car insurance. The same goes for Facebook. Some people feel like they're going to have to opt out of life. I mean, seriously, think about grandparents who are excluded, out in whatever Timbuktu place they live, with all their families somewhere else, and the only way they get to see them is on Facebook.

Going back to the issue you raised earlier about somebody reading my email: I can tell you, as a person with a couple of elderly grandparents, they inadvertently shared some information with me on Facebook about a health condition they had.
You know how grotesque Facebook's response to that was? And it affected me too, because they had my name in it.

Right, and sometimes things become a stigma as well. There's an emotional response. When I put out the article about why I left IBM to start this new AI Truth non-profit, the responses I got back, immediately, were emotional responses about how this stuff affects people, that they're scared of what it means. Can people come after my kids or my grandkids? And if you think about how genetic information can get used, you're not just hosing yourself. I mean, breast cancer genes, I believe, run through families. So if someone takes my swab and combines it with other data, suddenly not just me but my whole entire lineage is affected. It's hard to think about, but it's true. This is real life-and-death stuff.

Not just today, but for the future. And in many respects it's that notion of inclusion again. Going back to some of the stuff you were talking about, Carl: with the decisions we make about data today, we want to ensure there's value in the options for how we use that data in the future. So the issue of inclusion is not just about people; it's also about the other activities, the other things we might be able to do with data, because of the nature of data. I think we always have to take an options approach as we make data decisions. Would you agree with that?

Yes, because data's not absolute. So you can measure something, you can look at the data quality, you can look at the inputs to a model, whatever, but you still have to have that human element of: are we doing the right thing? The data should guide us in our decisions, but I don't think it's ever an absolute.
It's a range of options, and we chose this option for this reason. So: are we doing the right thing, and doing no harm, too?

Carl, Courtney, we could talk all day. This has been a really fun conversation.

Oh yeah, and we have!

But we're out of time. I'm Rebecca Knight, for Peter Burris. We will have more from MIT CDOIQ in just a little bit.