Good afternoon and a very warm welcome to our webinar today on the age of algorithms: ensuring explainable fairness. Today we are joined by our distinguished speaker Dr Cathy O'Neil, mathematician, data scientist and author. My name is Joyce O'Connor; I chair the IIEA Digital Policy Group and I'll be the moderator of today's event. It's my great pleasure to welcome you, Cathy, to our meeting today. I'm delighted that you're joining us from Boston. Harvard Square, is that true? Thank you very much for being with us and for taking the time out of your very busy schedule. Cathy will speak to us for about 20 to 25 minutes and I'll then go to you, our audience, for Q&A. Please join our discussion using the Q&A function at the bottom of your screen. I look forward to receiving your questions. Today's presentation and the Q&A, as ever, are on the record, and join us in our discussion on Twitter using the handle @IIEA. Before we get to Cathy, I think it might be useful to set the context in which we're looking at AI regulation in Europe, in Ireland and in other member states. The EU and European Commission approach to digital innovation and technological progress has very much centred on regulation. The digital agenda, together with the green agenda, has been a key priority since Ursula von der Leyen became European Commission President almost three years ago. As we have seen over the last while, the EU has adopted a raft of regulation, and it intends to lead in the creation of global regulatory norms for the digital economy. While the Digital Markets Act, the Digital Services Act, the Data Governance Act and the Chips Act have worked through the legislative process, the Artificial Intelligence Act, together with the Data Act, is still in the legislative process. And I think it's fair to say there has been an extensive consultative process with all stakeholders. The European Commission believes that the proposed Artificial Intelligence Act should become the global standard if it is to be fully effective. The questions arise: will the AI Act boost the uptake of AI and guarantee a human-centric approach? The proposed AI Act uses a risk-based approach to classify AI systems, with different requirements and obligations according to the intended purpose and level of risk. Cathy O'Neil today offers another perspective. Cathy will discuss how algorithms and big data pose risks to equality and social fairness and can promote discrimination. Cathy will explain how the concept of explainable fairness should be used to prevent and mitigate social harms caused by algorithms. She will propose that regulations should be translated into the coding of algorithms, instead of expecting lawyers and regulators to decipher mathematical formulas. Cathy, over to you. We look forward to your presentation. I should say, just if I may, Cathy, a little bit about your background, which I'd like our audience to know in more detail. Cathy graduated from Harvard with a PhD in maths, lectured at MIT, was a maths professor at Barnard College, and left academia to work at the hedge fund D.E. Shaw and as a data scientist in the New York start-up ecosystem. She's CEO of ORCAA, an algorithmic auditing company, and a member of the Public Interest Tech Lab at the Harvard Kennedy School. Her book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, is a New York Times bestseller and was longlisted for the National Book Award for nonfiction.
She writes a blog, mathbabe.org, and is a contributor to Bloomberg View. Cathy, over to you. Thank you so much, Joyce, I really appreciate it. I have a confession: I didn't realize I only have 20 minutes, so I'm going to be skipping around. No, you can go on a little longer, 25 minutes if you need it. Okay. Yeah, that's okay. I can still skip some things, but I want to get to the main issue, which is the explainable fairness framework. So, I wrote this book, Weapons of Math Destruction, and I'm going to present a little bit about it just because I think it's really good background. I performed triage on the world of algorithms, which, when I was writing in 2014 or so, were already incredibly pervasive in most, I would say, previously bureaucratic systems in the United States: when you try to get a job, when you try to get health care, when you try to get a mortgage or insurance. Those were all being determined by algorithm, even things like how long you go to prison if you're convicted of a crime. These were all really important and widespread algorithms that were secret scoring systems that people didn't understand and couldn't complain about. They were also destructive, which isn't just to say that they were sometimes wrong, because all systems are sometimes wrong, but they were systematically wrong for specific types of people. So I just want to back up and mention what I mean by an algorithm, and I'd like to give this example of cooking dinner for my family. First the definition: it's predicting the future based on patterns in the past, predicting future success based on patterns in the past and on whether something was a success or a failure. So for cooking dinner, I decide whether a dinner was successful and I update my algorithm for cooking dinners, and of course I try to optimize success over time. The reason I'm showing a picture of my son with Nutella on his face is because his definition of success is very different from mine. For me, it's a success if my kids eat vegetables; for him, it would be a success if he got to have as much Nutella as he wants. So one of the critical points there is that the definition of success determines the algorithm, essentially, as does the data; those are the two ingredients, the definition of success and the data. I think people underestimate the power of that definition of success, though. And in particular I just want to point out that the people who control and deploy the algorithm are the ones in power. They define success. The people who are subject to the algorithm, and to the system the algorithm is being used in, typically don't have the same definition of success, but don't have the power, or sometimes even the expertise, to complain about that. So, a couple of examples from my book, and from the time since my book came out in 2016. This is from my book: this is Sarah Wysocki, a teacher fired based on an algorithm that no one could explain to her. She had reason to believe it was gamed and flawed, so when she tried to appeal she was told, it's an algorithm so we know it's fair, which of course is not true.
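To illustrate the point Cathy makes above, that the definition of success, as much as the data, determines the algorithm, here is a minimal sketch in Python. The dinner records and both scoring functions are invented for illustration; the same history, optimized under the parent's and the child's definitions of success, picks different dinners.

```python
# Minimal sketch: the same historical data, scored under two different
# definitions of success, recommends two different dinners.
# All data and scoring functions here are made up for illustration.

dinners = [
    {"menu": "pasta with vegetables", "vegetables_eaten": 3, "nutella_servings": 0},
    {"menu": "crepes with Nutella",   "vegetables_eaten": 0, "nutella_servings": 2},
    {"menu": "stir fry",              "vegetables_eaten": 4, "nutella_servings": 0},
]

def parent_success(d):   # success = kids eat vegetables
    return d["vegetables_eaten"]

def child_success(d):    # success = as much Nutella as possible
    return d["nutella_servings"]

# "Training" the dinner algorithm is just picking what scored best in the past.
print(max(dinners, key=parent_success)["menu"])  # stir fry
print(max(dinners, key=child_success)["menu"])   # crepes with Nutella
```

Same data, different objective, different "optimal" decision: whoever gets to pick the success function effectively picks the outcome.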
A more recent example comes from the world of medical systems. This is a health insurance company that wanted to reduce costs for people with lots of different medical problems; imagine that you have diabetes and heart disease and you broke your leg. The problem with such patients is that they are often given treatment that is in conflict across their different problems. That's bad for the patient, but it's also bad for the insurance company because it's expensive, expensive to fix in particular. So they wanted to improve that inefficiency by offering help to people who had complexity, but instead of optimizing their algorithm for complexity they chose a different definition of success, going back to my emphasis on that. And their definition of success, or of risk in this case, was cost. So basically they're looking for patterns in the past of somebody who's expensive. They're saying: people like you were expensive in the past, so we predict you'll be expensive, and therefore we're going to offer you help to navigate the system. The problem with that, of course, is that not everyone who had complex medical problems was expensive. It's a correlation, of course, but not a complete one. In particular, people who were systematically under-treated, which includes Black patients, weren't expensive even though they did have complex medical needs. So Optum's algorithm ended up systematically missing people who needed help, who were Black patients. And the people who figured this problem out, who were also data scientists and doctors, suggested a different proxy, a different definition of success, which increased the number of Black patients being offered this help by about a factor of two. And so that's a great example of how, if we mismeasure or mistarget the outcome, predicting the wrong thing essentially, or asking the wrong question if you will, then we tend to have algorithms with real problems. Another example I wanted to get to is the facial recognition problem. Joy Buolamwini and her colleagues looked into how accurate facial recognition really was, for Amazon's and IBM's and Microsoft's facial recognition, and found that they were much more accurate for paler and maler faces; they called it the pale male data set. That was essentially the underlying problem: the algorithms were trained on picture databases that were predominantly white and male. So that's a problem in the sense of its inconsistency. Yeah, so I guess I'm going to have to skip the part of this talk where I talk about how to audit for these kinds of problems. But I will just mention that just because IBM, and probably Microsoft, and probably Amazon even, have improved their accuracy on Black women, that doesn't actually mean the algorithm is fair. And the basic reason is, once you have an algorithm and you're licensing it to third parties, including police departments, you are no longer in control of how it's being used. So in particular, if it's being used as a profiling tool only on, you know, young Black men, then it doesn't matter that it's accurate for white women, because it's being used unfairly. So I'm just making that point about the algorithm. If you continue on, you know, don't cut your talk short, I think we'd really love to hear what you have. Okay, well, I'll come back to that then. Okay.
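As a concrete, entirely synthetic sketch of the proxy problem in the example above: ranking patients by historical cost quietly drops under-treated patients whose medical needs are just as complex, while ranking by complexity, the kind of outcome the researchers proposed, brings them back in. All records and field names are invented for illustration.

```python
# Synthetic sketch of the cost-as-proxy-for-need failure mode.
# Patients with equal medical complexity generate different historical costs
# when one group has been systematically under-treated.

patients = [
    {"id": 1, "complexity": 5, "past_cost": 9000, "under_treated": False},
    {"id": 2, "complexity": 5, "past_cost": 4000, "under_treated": True},
    {"id": 3, "complexity": 2, "past_cost": 7000, "under_treated": False},
    {"id": 4, "complexity": 4, "past_cost": 3500, "under_treated": True},
]

def flag_by_cost(pop, k=2):
    """Definition of success = predicted (here, past) cost."""
    return sorted(pop, key=lambda p: p["past_cost"], reverse=True)[:k]

def flag_by_complexity(pop, k=2):
    """Definition of success = medical complexity, i.e. actual need."""
    return sorted(pop, key=lambda p: p["complexity"], reverse=True)[:k]

print([p["id"] for p in flag_by_cost(patients)])        # [1, 3] -- misses the under-treated patients 2 and 4
print([p["id"] for p in flag_by_complexity(patients)])  # [1, 2] -- picks up patient 2
```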
The open questions in my field of algorithmic auditing, and indeed, I think, in the world of algorithms at large, are that often we don't exactly know what the context will be for a given algorithm, so we don't know what the outcomes mean per se. We have this business model where some people build the algorithms and other people use them however they want; that's what I just mentioned with facial recognition. Even if we had the context fixed, we don't know what it means to be fair. I've been using the words fair or unjust: what does it mean to be fair, and how do we determine that we've even got the right metric to measure racial fairness? And then, even if we had the correct context nailed down and the definition of racial equity nailed down, what is the threshold of fairness, or unfairness if you will, that we're willing to accept? Those are all very important, completely open questions that have not been decided in pretty much any context; there could be one exception. But at the same time, given that our bureaucracies are very quickly giving way to algorithms that automate these decisions of whether somebody is worthy of something, we actually have to answer these questions, for basically all of those high-stakes situations. The way we answer the questions might be different in the States and in Europe, and that's one of the things I want to talk about at the end, but I'll continue by mentioning that in my auditing company we really have three different types of auditing. We have what I call adversarial auditing, which is essentially when, say, a state attorney general asks us to help them investigate a specific company for harm against consumers, typically. So you should think of payday loans, exorbitant interest rates or exorbitant fees on loans, or it could be subprime auto loans or student loans, that kind of thing. There are specific companies, and typically the agency that hires us has subpoena power to make them give us their data, so we can infer how they treat customers based on that data. We essentially reverse engineer their business model based on the data that they provide, which they don't want to provide, but it's required. And then there's a second type of audit, called an invitational audit, and that's when a company will say: hey, ORCAA, please come in and take a look into our algorithm; we're a little bit concerned that it's either unethical or possibly illegal and we want you to determine whether you agree, or how to fix it. We don't have as many invitational audits as we would like, in large part because, going back to my previous slide, we actually don't know the answer to these questions of what we mean by fairness and what the thresholds are, and in particular regulators don't have clear knowledge of that, which means that companies, even regulated companies, don't really feel very much pressure to check that their algorithms are reasonable, because they don't know the definition of reasonable. But even so, we've learned a lot from the invitational audits that we have been invited to do.
And then finally, there are what we call third-party audits, or regulatory audits, where the idea is that we're the middleman between regulators and companies. Right now we're working in the world of insurance, so we work for some insurance commissioners. The idea is that all the insurance companies have to submit data to the regulators, to the insurance commissioner's office, and we are the ones who analyze it and make sure that their business practices are lawful, in particular that they comply with anti-discrimination law. So I'm going to talk about invitational audits a little bit and then I will probably spend more time on these third-party audits, for which at ORCAA we have a framework called explainable fairness. I will do this quickly, but the invitational audit is pretty comprehensive. We basically keep asking the same questions over and over again: for whom does this work, for whom does this fail? That gives way immediately to the notion of who the stakeholders are, who actually cares one way or the other about this algorithm, who does this impact. And we just keep asking this question and we build a matrix, which I'll mention in the next slide, but the critical thing is that this is non-technical. This goes back to the very definition of an algorithm with my child: he doesn't get much of a say in how the dinner-cooking algorithm gets made, because I'm his mom and I'm in charge. That's a pretty clear power dynamic and it's reasonable, but in other situations it's not always clear why it's reasonable that the people in power get to decide how everything works. So I feel strongly that there's a sort of embedded power play going on, where people are excluded from the conversation about how an algorithm should work. They're basically told: you're not an expert on algorithms, so you don't get to have an opinion. The great thing about the ethical matrix approach is that it is non-technical; literally, people see that this is not about machine learning techniques, this is about what you think is fair, and nobody considers themselves an expert on fairness. So the ethical matrix framework allows that conversation to happen outside of the world of data science. You can think of it as an ethical review board type of conversation. The idea is that the embedded ethical conundrums, the trolley problems involved, are addressed by stakeholder group representatives in a round-table discussion, and then the values are surfaced. And that's when the data scientists come in and try to translate those values into code. The meta point here is that we want to decouple the values conversation from the coding conversation. So we actually build the matrix as a two-dimensional grid, where the rows are the stakeholders and the columns are their concerns, and we represent those stakeholders as much as we can in that conversation. The matrix could end up being very large.
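As a rough illustration of the ethical matrix just described, here is a minimal sketch with hypothetical stakeholders, concerns and severity ratings, plus a trivial pass that surfaces the most serious cells; the colour-coding itself is discussed next.

```python
# Minimal sketch of an ethical matrix: rows = stakeholders, columns = concerns,
# cells = how serious it would be if the concern turned out to be real.
# Stakeholders, concerns, and ratings below are hypothetical placeholders.

matrix = {
    ("Black patients",  "false negatives"): "red",     # need help, not offered it
    ("Black patients",  "accuracy"):        "red",
    ("older patients",  "accuracy"):        "yellow",
    ("insurer",         "cost overruns"):   "yellow",
    ("data scientists", "model drift"):     "green",
}

def red_cells(m):
    """The deal-breaker cells: the ones that become the dials to watch."""
    return [cell for cell, colour in m.items() if colour == "red"]

for stakeholder, concern in red_cells(matrix):
    print(f"Deal-breaker to investigate: {stakeholder} / {concern}")
```

The point of the exercise is the stakeholder conversation that fills in the rows, columns and colours; the code only records its result.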
We consider the cells of the matrix and we rank them, basically colour-code them, based on how bad it would be if this concern is actually happening, which we typically don't know, and whether it could really break the system, whether it's a deal breaker for the system if you will. I have a couple of examples. I mentioned Optum, the system that was awarding help to some kinds of patients with medical problems. Black patients should be worried about accuracy; they should be worried about false negatives, i.e., they do need the help but they're not being offered the help. So those are the red boxes; they pop up, they're very easy to find. And, to be honest, this kind of very basic analysis of the risks of this algorithm should have been known to the data scientists at Optum, if not the business people at Optum. So this tells me how few data scientists are asked to look into how this works and for whom it fails: could this fail for Black patients, could this fail for older patients, and so on. That's the idea, and this is not a complete matrix, obviously; for example, I didn't write in older patients, and older patients are definitely a stakeholder group here. The lower example is for facial recognition: obviously Black women would be concerned about accuracy. But the truth is, until we know how it's being used, we don't actually know what else to worry about; we don't know whether false positives or false negatives are good or bad for someone, because we don't know the context it's being used in. And one of the points I want to make about this matrix is that you actually can't fill it out if you don't have enough information, and that's already an answer. The answer is: unless you can build this difficult matrix in a satisfactory way, where the red cells are addressed, you cannot run the algorithm, or you should not run the algorithm; it's kind of a litmus test in that sense. Okay, so I did do that. It took a few minutes, but I'm going to steal a few minutes of the Q&A because I want to talk about explainable fairness, which is the approach that we're developing for the regulatory audit. The idea is, we're going to take our lessons learned from both of the other types of audits. I didn't mention very much about the adversarial audit, but let me give an example from the adversarial audit experience we've had: student lending. I'll do this example in more detail in a few seconds, but the idea is, we are just the data people, we're data nerds, right. We want to display what happened to students as they took loans. And we show that to the attorney general who hired us. And then the attorney general might say: hey, this doesn't look good, because this outcome is much worse for Black borrowers than for white borrowers, and forgive me for focusing on race, but that is mostly what I work on. And then what happens is the student loan company will come back and say: oh, the outcomes are different for a good reason, and here's the reason; it's because we have to account for FICO score, which is a kind of credit score.
And then our job as the data scientists is to do the math, basically to say: okay, what does it mean to account for FICO score? We'll figure that out and then we'll show another graph that says: okay, here are the outcomes once we've accounted for that. And then the attorney general might say: oh, this still doesn't look good, you have more explaining to do. My larger point is that, as a data person in this context, my job is, as much as possible, just to take the conversation between the attorney general and the company and translate it into statistics, in a very simple, straightforward way, so that the conversation can go to the next step. And similarly in explainable fairness we're going to try to do that as well, but instead of the attorney general and one company, we're going to be doing it between a regulator and an entire industry of companies. So we're going to be thinking about the regulator's viewpoint, where we don't have to do it as comprehensively as we just explained with the ethical matrix, because typically when we're talking about anti-discrimination law, the stakeholders are legal categories. We know who the stakeholders are, so it's much less open-ended; we actually know what the rows of that matrix should be. We also know what the columns are supposed to be: they're supposed to be discrimination, illegal discrimination. However, we are going to dig down pretty far in trying to determine exactly what we mean when we say that. What does it mean to be illegally discriminatory? That's the conversation we're trying to track with explainable fairness. The current situation, and I sort of alluded to this, is that we have lots of local governments, in New York City and DC and Colorado at the state level, and at the city level, that are asking for more accountability for AI in the regulated spaces, typically lending and insurance, and sometimes policing with facial recognition. The companies who are vendors for this, or who are actually insurance companies or lenders, don't exactly know how to respond to these requests, and they're making all sorts of efforts to respond, and essentially most of them are kind of lobbying efforts to make it seem like there's no problem here. And in response, the states and cities are saying: actually, we want more information. So that's the current situation, where we're trying to respond in DC and Colorado with new rules and regulations around insurance, and in New York City around hiring algorithms. But the critical thing is that the rulemaking hasn't been determined; we don't know exactly what this will translate to. But it'll definitely be more than the industry wants; the industry really doesn't want to actually have to give up its data. And from all accounts, the data actually will have to be analyzed. We have this idea of a balancing framework, and I alluded to it with the student lending example, but now I'm going to talk about it from a regulatory point of view. The regulator might say: hey, student lenders, as a group, we're looking at your data, and we're seeing there are differences in outcome by race, for example. The industry as a group can be expected to respond: there's a good reason for that, like there's a good reason we charge Black people more for car insurance, say, in general.
And then the regulator will say: well, what is that good reason? And the industry will say: because we have to account for driving record, or we have to account for the type of car that people use, or we have to account for where they live and the propensity for crashes in those areas. Basically, the idea is that they have to make the case that this factor is legitimate, and that's sort of the technical way to describe it. And it's either legitimate because it's just a legitimate thing to consider, or it has a detriment but it also has a benefit, and the benefit outweighs the detriment. And so there has to be some kind of actual balancing formula for that. So you'll see a lot of insurers talk about how they're just following the risk, but they also have rules against using race or even race proxies. So the idea there is: well, how much more risk are you inferring from this new factor that you're using, versus how much unintentional racism is seeping into the system? And the idea is that that's a conversation, and it's not a math question; it's actually a legal, regulatory question. As the mathematicians in the room, as the data people in the room, our job is, again, to track that conversation and say: okay, well, if they've argued successfully for FICO score being acceptable for insurance, then we're going to have to work with that; we're going to have to redo our formulas taking into account FICO score. FICO score is a really contentious one, and I bring it up for that reason, because it's not nominally related to driving, but there is a huge correlation between low FICO scores and high risk for car insurance costs. And so there's a huge fight in the industry about whether insurance companies should be able to use FICO score, in particular because FICO score is also highly correlated with race. Anyway, the point is that overall we think this is the way to do it, where the industry has to come forward with a new legitimate factor that it has to argue for each and every time, and the regulator gets to say: okay, we either accept it or we deny it as a legitimate factor. Once they accept it, we do another loop through: what do the outcomes look like now by race? And one of the great things about this is that it creates a positive feedback loop for the regulators, and for regulation in general, because it gives the same test to every company who's charging for car insurance. It allows the regulator to see, for the current test, accounting for the legitimate factors that have gotten through so far, how this industry is doing overall: each company will be doing so well, some companies will be doing better, some companies will be doing worse. So who are the stragglers, who's doing really well, what is the standard you can set? And so this is addressing two of the questions I mentioned at the beginning that were open. The first question is what it means to be fair. And I hope it's clear that what it means to be fair is very contextual: FICO scores are highly controversial in car insurance, but they're not very controversial in lending. So you can't imagine doing this once and for all for all systems.
No, it has to be done extremely carefully for a given context. That's number one. And number two, because of this sort of Consumer Reports view on the industry as a whole, where we see who's doing well and who's doing poorly with respect to this test, and where the average company is with respect to this test, it also answers the question of what the threshold is. We'd all love to say the threshold is perfection, the threshold is complete equity, but that's not where we expect to start. Instead, we expect to start where there is inequity, but we know that the stragglers could do better. And so the threshold can be set to be a relatively reasonable and yet high standard, and it can be moved over time to become a better standard. So that's the thrust of explainable fairness. I want to give a couple of examples just to show it in action, and the first one is student lending; I already mentioned this as an example of something that regulators could do. The first step in explainable fairness is to choose an outcome of interest. There could be a lot of different ones: for example, do you get the loan offer, yes or no, which would be a binary outcome; what is your APR, which would not be binary, that would be a scalar; or what are the consequences of late payment or default, like what are the fees charged to you if you're late. That's another kind of outcome you might care about. Once you've chosen an outcome, then you infer the protected class status, like you infer race. And you say, okay, let's say it's the binary outcome of did you get an offer. What's the rate of offers for Black applicants versus white applicants versus Asian applicants versus Hispanic applicants? There's a huge difference. You'll say: hey, student lenders, there's a huge difference in who gets the loan, what can you say about this? And that's when that sort of negotiation starts. They say: oh, that's because we obviously don't offer loans to people with low FICO scores; and then they have to make the case that FICO score is a legitimate factor. And if that's accepted, we redo our measurements accounting for FICO, and let's say we still find problems, and then the industry comes back saying: oh, that's because we care about the type of major you had in college. And then the question, a very important question, becomes: why is that acceptable as a legitimate factor, isn't that a proxy for race? How much more information are you getting out of that major question, considering that it is a proxy for race? You have to be getting a lot of predictive power out of this particular factor for it to be legitimate, considering that it is a proxy for race. So that's the idea. Another example would be disability insurance, a very different context, so it's going to be a different outcome of interest: claim approval, did you get approved, which is also binary, so it's not that different; but then the length of the initial claim, how many weeks of disability insurance you're being offered; or how many times you have to extend your disability claim. And then again, once you've chosen the outcome of interest, you infer the racial category, or whatever the protected class is, it could be gender.
And then you measure the outcomes for the different statuses, so you might say: okay, for women, you're approving at a much higher rate than for men. And then the answer might be: well, of course, we automatically approve maternity claims, and that explains why women get approved so much more often. And if that's true, then we might account for that by, for example, taking maternity claims out of the analysis altogether and saying: okay, now, removing maternity claims, we see the following differences in outcomes for men versus women, et cetera. And that is the general gist of the kind of thing we're talking about. I'll just finish by saying: the reason I would call this explainable fairness is that sometimes people talk about explainability in the context of algorithms. We don't think explainability is what people want; people don't want to understand machine learning. What they want to know is why it's fair to them. We think that this, even though it's kind of a mess and it's a negotiation, at the end of the day will be the answer when people say: why is this fair to me? The answer will be something like: well, once we've accounted for your FICO score and whether you're on maternity leave, we found that the outcomes are equitable by this measure. It's a mouthful, but actually it is the closest we can imagine to what fairness will actually sound like in such complex systems. The last slide I have: I just want to compare what I've just mentioned to my understanding of European regulation around AI. And I might be wrong about this stuff, so I'm willing to be corrected. There's a huge amount of privacy focus in European law, which I appreciate at some level, but the problem with that is that, for example, under GDPR it's really hard to do the second step of explainable fairness, by which I mean the step where you infer race in order to decide whether a system is treating people of different races in similar ways. It basically makes it almost impossible to infer race, in particular to tag somebody's data, including their PII, with inferred race, or with any type of race unless it is self-reported and given with consent and all this stuff, which is never true in the kind of work I do. So there are problems straight up. And then there's another problem, which is that the inference methodologies we use are all based on the US census, and the US census is not perfect but it's really, really good. So here we have enough racial data in the census to guess at somebody's race, knowing their first name, last name and address, pretty well. And I could talk about the problems that our methodology has, but the problem we have with European versions of this is that some of the censuses don't even collect race; I know in France they don't. It's really difficult to imagine how you could do an analysis like this considering that you just don't have the census data and you have these GDPR problems. Not to be completely negative, but the overall issue is that privacy, when you are as extreme on privacy issues as Europeans are, becomes an obstacle to questions of fairness.
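Pulling the student lending and disability examples together, here is a minimal, synthetic sketch of the measurement step of explainable fairness: pick an outcome of interest, attach an inferred protected class (stubbed out here; in practice this is where the census-based inference and GDPR constraints Cathy mentions come in), measure the raw gap between groups, and re-measure within strata of a factor the regulator has accepted as legitimate. All records, field names and thresholds are invented.

```python
# Synthetic sketch of the explainable-fairness measurement loop:
# 1) choose an outcome of interest, 2) infer protected class,
# 3) measure the raw gap, 4) re-measure after accounting for an
# accepted "legitimate factor" (here, a FICO-style score band).
from collections import defaultdict

applicants = [
    # inferred_race would normally come from a census-based method; here it is simply given.
    {"inferred_race": "Black", "fico_band": "low",  "offer": False},
    {"inferred_race": "Black", "fico_band": "high", "offer": True},
    {"inferred_race": "Black", "fico_band": "low",  "offer": False},
    {"inferred_race": "white", "fico_band": "high", "offer": True},
    {"inferred_race": "white", "fico_band": "high", "offer": True},
    {"inferred_race": "white", "fico_band": "low",  "offer": False},
]

def offer_rates(rows, keys):
    """Offer rate per combination of the given grouping keys."""
    counts, offers = defaultdict(int), defaultdict(int)
    for r in rows:
        k = tuple(r[key] for key in keys)
        counts[k] += 1
        offers[k] += r["offer"]
    return {k: offers[k] / counts[k] for k in counts}

# Step 3: raw gap by inferred race.
print(offer_rates(applicants, ["inferred_race"]))
# Step 4: the same comparison within each band of the accepted legitimate factor.
print(offer_rates(applicants, ["fico_band", "inferred_race"]))
```

If a gap survives the conditioning, the industry has to propose and argue for the next legitimate factor, which is exactly the negotiation described above.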
And then finally, I do want to commend the EU for really thinking through this triage point, which is the very first thing I realized when writing Weapons of Math Destruction: that we cannot care, we just don't have the energy, time or urgency to care, about every AI, or every automated decision system. There are just too many of them; we have to care about the ones that affect people the most. So this risk-based AI bill is extremely important; I really think that's the right way to do it. I just think you need to figure your way around the privacy issues, and I'll stop there, and you can feel free to correct me on any of those issues I just brought up. Thanks, Joyce. Thanks very much, Cathy. That was a really exciting introduction to explainable fairness, but as you say, it is very much interactive, isn't it: you have to know the context, you have to know the issue. And in many ways, and I think you've covered some of the issues with the EU law, what we've been focusing a lot on is so-called trustworthy AI; we've had a lot of discussion and work on that, but haven't gone into these other issues. And I just wonder how, because what you're saying makes a lot of sense: you hear people concerned about systems, whether it's medical, the criminal system, indeed facial recognition here. And it seems like a mammoth task to create a framework, which you have, that is workable within this triage or risk-based system. Yeah, well, listen, I agree, it's hard. I think the best metaphor I have is: imagine going into an airplane, and you look in the cockpit, and there's nothing there. You're like, whoa, there's nothing there. You know the dials in a cockpit, which measure wind speed and air pressure and altitude and how much fuel you have: each of them alone isn't sufficient to make you feel safe, but as a dashboard view, you can pretty much feel that if you have everything within thresholds, below the max and above the min, then we're pretty safe. The way I think of the AI systems that are currently deployed is that it's like flying without a cockpit: we haven't figured out what dials to put in, we haven't figured out what the minimums and maximums are. So basically that matrix, that big grid I mentioned, from my perspective, is the design process for a cockpit, and the red boxes that you eventually get to are the dials. To your point, though, Joyce, it's work, but we need to do that. If somebody complained, oh, it would be a lot of work to design an airplane cockpit, you'd just say: okay, but let's not fly the plane until we do it. Yeah. So that's how I kind of feel about it. I agree with that perspective, and I suppose what I'm trying to tease out with you is what would be the best way to create that understanding that you've presented so clearly. I think it's important to talk to the public, because in a lot of cases people aren't even aware those systems are there. And I just wonder, is this a phased approach, with the public, with the users, with whoever? What is it that we need to do to create awareness about these issues? Because I think your example, Cathy, about your son and the Nutella and the vegetables is very good; it hits it right on the spot.
So, what do we do? I'll ask other questions later, but just on this first thing: besides reading your book and listening to this, how do we get people, and indeed regulators, to understand that complexity that you've presented in a very clear manner? I would argue, when I talk about the need for triage, another way of saying that is: if we did this design process for the cockpit on a benign algorithm, we would not find any red boxes. We would just say: nobody cares, nobody's affected, nobody cares. And in fact, the only people who care about my son's dinner are my son and me, but the public wouldn't care, so that would never be important enough. So I guess, because my DNA is part journalist, part of it has to be: find a stakeholder and a concern that really impresses people with its scariness. With the Facebook news feed algorithm, there are many, many to choose from: genocide against the Rohingya in Myanmar is one of them, or young women killing themselves because of their body dysmorphia is another one. It's not that hard to find a stakeholder group and a concern that should be, or could be, a deal breaker. And don't we want to know the answer to that, and don't we want a dial for that in the cockpit? So my point is that the triage work you're doing should basically be: find the reason we care, the story, the narrative of: we don't know if this works for women as well as it works for men, but imagine it works badly for women and this is the story that would result; or imagine it doesn't work for people in wheelchairs, what happens to people in wheelchairs, they would never get a job, you know what I mean. You actually have to have that New York Times headline in your head that says: yeah, we can't do this without checking, we can't let this airplane up in the air if we don't know if it's going to fly. Yeah, and in a way I take that point. How do we get that? If we take one particular stakeholder group, the implication of that for the whole system is, in actual fact, isn't it, constant dialogue, interventions, clarity, both for individuals as well as groups. In a lot of the cases you've given, with insurance, say, with Black drivers or whatever, they're basing it, as I understand it, on a simple risk analysis: it looks as if this is happening and therefore we're going to do that, where very few of us understand that. So what it means is transparency, isn't it, around the values we're working on when we're measuring that. Again, I want to decouple any kind of technical conversation or jargon from this issue of what seems wrong to people. So yes, transparency on that process of figuring out what our values are, and transparency on what we're trying to accomplish with this dial. And that's why explainable fairness is phrased in completely plain English, right. I'm not saying that everyone will understand exactly what I mean by saying, accounting for FICO scores, this is a fair system, but people know what it means to account for something; that's a simple notion.
And then the idea, I guess what I'm saying, is that when people hear accounting for FICO scores, they can say: wait, that's not reasonable to me, why should you account for FICO scores, why should someone with a better FICO score pay less for car insurance? That's a reasonable pushback. So give them enough transparency so that people can air their complaints or their agreements, but not transparency at the level of code or the statistics. So it's about simple, explainable, accessible English so that we know about it. We've got a question here, Cathy, from David Low, and again keeping on the concept of explainable fairness: can you talk a little about how the concept of explainable fairness can be applied to health care and to insurance at the level of health status, as distinct from race? For example, in Ireland, we had to abandon a model where health insurance was offered at the same premium independent of age, because the system became unstable due to the free-rider problem. Is there a methodology that can be used to take the emotion out of the debate? I mean, I don't think that's emotion; I think that's the problem between individual incentives and public and group incentives. So I would say a mandate, where you're required to have health insurance, is the only way around that, and I don't know what you ended up doing. But to be clear, this is the question about insurance writ large: how do you pool insurance? You have to pool insurance sufficiently so that it remains insurance, because if only the people who are highly risky are in the pool, they're going to be paying something that is essentially self-insurance; it will be unaffordable. And especially in the age of surveillance capitalism, where we have so much information about people, in the States probably even more, we can actually infer somebody's risk pool very, very minutely, which means that if we didn't have Obamacare right now, our health insurance system would have totally failed. It's because we are required to be in the pool and because the insurers are not allowed to charge more for pre-existing conditions; those two things together make it work. It's not about emotions, it's about raw facts. Yeah, and our system, I think, is completely different really. Here a very high percentage of the population are covered by state care, in fact all of the population are covered by state care, but also a very high percentage, I think it's over 40%, have their own insurance as well. So it's quite a different system. But perhaps to go to another question, and in a way it may follow on from this, Luke Benz asks: at what stage should explainable fairness come into play? Should it be applied by looking at the results of algorithms in operation, or should it be something that is taken into account during the design phase, from the beginning? I certainly think I would prefer to fly in a plane that had a designed cockpit from the beginning, but I guess right now we're dealing with the fact that algorithms are ruling our lives and nobody knows how they work or whether they are fair. So we're thinking of it as: for existing algorithms, here's how you decide whether they're behaving acceptably well.
But of course, once those rules are in play, any company who wants to deploy a new kind of insurance algorithm will presumably have the burden of first showing that it complies with those rules. The longer-term goal is to have a kind of FDA model where, before you deploy this new system, you have to show that it's safe and effective. And in a sense, is that having a framework within which all these algorithms are actually looked at, like your matrix? Is there a matrix that you think could be established that could apply in most situations? No, that's the problem. The problem is that the only situation where you're really reusing the same matrix over and over again is that regulatory perspective, where you have a regulator keeping an eye on many, many players with the same type of algorithm. In general, if you have a new algorithm in a new context, you need a new ethical matrix. Again, you should think of the ethical matrix as how you design the cockpit, and you design the cockpit for each new type of vehicle: you're not in a train anymore, you're on an airplane, and of course you can't just translate all the knobs from the train cab to the airplane cockpit; that's not how it works, you have a new set of problems. So just like that, for an algorithm you have a new set of problems, but it's the same process, where you ask what could go wrong. Yeah. And just coming back to your concept of success, James Allen from the IIEA has asked: how can companies and regulators assess how a flawed definition of success should be fixed? How can inadvertently flawed definitions of success be distinguished from the deliberate use of algorithms to produce unfair results? That's a really great question. One of the answers I would have to that is that the ethical matrix framework insists that you answer the question: what does it mean for this to work? And what it means to work is already really an interrogation of the definition of success that you've created. I think the most compelling example of a terrible definition of success is what we call in the States recidivism risk algorithms, sometimes called crime scores. They do not measure criminality; they measure the probability that you are going to be rearrested within two years. So the definition of success there is actually rearrest within two years. But I don't think anybody really thinks that that's necessarily a good thing, first of all. It's a complicated answer, but I guess what I'm saying is that it's measuring the wrong thing. It's treated as if it's measuring criminality, but it's actually measuring who is going to be profiled by the police, and so it's a profiling tool in itself for that reason. So it's just not answering the right question. The language around its use is just so messed up, because people are contorting themselves to try to make it seem like it is measuring criminality. And it actually boils down to what we mean by a successful prison stay; we don't know what we mean by a successful prison stay. Anyway, this is a really hard question. But it's healthy, isn't it? So what you're saying is: who's defining, who's measuring, what are they measuring. And that, you know, yes.
And Joyce, that's kind of one of the reasons it's so critical in my work to bring those who are impacted directly into the conversation. Because they need to say: I don't care what the company who owns this algorithm thinks success is, I'm going to tell you what failure looks like to me, or what success looks like to me. And that voice is critically important to making sure that the overall system is balanced. And besides that, I suppose, and I think you're implying it, what impact it has on me as a person in the community or whatever, whether I'm a student, whatever race, or in hospital, all of that makes a lot of sense. And from a public policy viewpoint, how can the various stakeholders, so you're saying they need to be involved, and I think that's critical, how are their interests taken into account when algorithms are created and used? Can the subjects about whom algorithms make predictions be provided with rights to protect them? Or is it more the job of regulators or legislators to ensure fairness in algorithm design? Yeah, I think it's kind of both. My experience working with the insurance commissioners is that on the one hand there are public conversations; anybody can join them and make comments about what it means for an insurance algorithm, and a life insurance algorithm in particular, to be fair. But typically members of the public don't join that call, right? It's usually lobbyists, but there are some consumer advocates who join, so it's good to have their voices there. But at the end of the day, it is a relatively technical question, and the regulator's job is to pursue the public interest and to make sure that the insurance industry is healthy, so they have to balance their stakeholders as well. So yeah, for all these reasons it's become somewhat obscure to the average person, and that's even more reason that we need to make it as explainable as possible at that level of: what does it mean that this is working fairly for me? And do you think then, just to finalize, unfortunately time has run away from us, do you think then that the key question is that concept you have of fairness, what it means to individual stakeholders, and how that can be understood first and then used in the various designs of these algorithms? Yeah, for me, I think we can kind of count on the people who build the algorithms to make sure they're efficient and profitable. What we can't count on is that they will check that the stakeholders who care about the system overall are also being treated reasonably. We need to build that cockpit, and I fully expect it to change over time and to start out pretty minimal and insufficient. But in the next five, ten years, I expect to see a panel on racial fairness, a panel on gender fairness, a panel for each protected class, and: here are the six dials we have to make sure things are working as expected and legally. And I expect that kind of panel view to be the case for all of the regulated industries: hiring, insurance, credit and housing. And I don't see any other way of doing it if we're going to continue to use algorithms, which we will. Cathy, thank you very much for your inspiring presentation and for this very clear view. I think the idea of a panel, the cockpit, those of us who are nervous flying can immediately see the relevance.
And I hope we won't have to wait for ten years to bring you back to discuss that, and you could come back in person to talk to us in Dublin, and, I think, seriously, to look at those panels, to look at that idea of fairness and that design. And, you know, as we progress on our path, which is quite different, I think your input will in fact cause people to ask questions before undertaking some of these issues. I know questions are being asked now. I think that work has made, and will continue to make, a great impact. So thank you very much for that. Thanks for having me, Joyce. I'd be happy to come back. And thank you. And we will think about that, definitely. The same to you, Cathy O'Neil, we definitely want you back. Also, I'd like to thank our audience for their participation and for your questions; I look forward to seeing you again at our next event in early December. And I'd also like to thank our IIEA team: Mark on production, Hugh Murphy on communications, and Seamus Allen, our digital policy researcher, as well. So thank you all very much. And again, thank you for a really inspiring presentation, Cathy, really enjoyed it. Take care. Enjoy the rest of your day. Bye now.