So Matt, today we're reviewing Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O'Neil. Now, a weapon of math destruction, or WMD, is not an actual weapon; instead, it's a bad algorithm: an algorithm that impacts people negatively and often has unforeseen consequences. Cathy defines these WMDs as having three distinct characteristics. One, they're opaque: you don't know how they work inside, and they're hard to interpret. Two, they're at scale: they're widely deployed across a variety of industries and even in government. And three, of course, they're damaging: they have negative impacts on people's lives.

Before we talk about these WMDs, we should say a couple of words about the author's background. Cathy O'Neil started out as a math PhD and worked for a few years in academia before moving over to D.E. Shaw in 2007. For those of you who may not know, D.E. Shaw is sort of known as the Harvard of hedge funds. It's the company where Jeff Bezos famously got his start, studying early internet businesses and deciding that Amazon was a good idea. Since she started in 2007, the financial crisis hit very soon after. She weathered that for a little bit, then started working in tech, but really she became famous for her blog, Mathbabe.

Yes, Cathy O'Neil has built up a substantial readership through Mathbabe. This kind of makes me wish we'd named the show Future Dudes, but alas, the name of the show is Random Talkers. This voice is Adam, the other voice you hear is Matt, and this is a program where we discuss how technology is shaping our future, for better and worse. You can find us on YouTube, you can also hear us in podcast form via iTunes and SoundCloud, and we love to hear from you, so leave a comment, leave a review, suggest any books you might want us to review in the future, and make sure to subscribe too.

Now, the first WMD that Cathy O'Neil mentions in this book relates to teachers. Specifically, there was a program called IMPACT introduced in Washington, DC schools a few years ago, which essentially tried to measure the impact of teachers by assessing how much they were able to shift students' standardized test scores.

Yes, this is essentially sabermetrics for teachers. However, in this case, we're not comparing the performance of baseball players, we're looking at teachers. The way this model works is it looks at the standardized test scores of students in the previous year and creates an estimate of performance in the current year. Then, teachers who exceed this estimate stand to receive a huge bonus, and those who underperform may be fired.

So I think we can agree that trying to judge teacher quality is an admirable goal. You want to reward the really good teachers and maybe get rid of the very worst ones. The only issue here is that this model was absolutely abysmal. It was a truly horrendous model, and Cathy O'Neil details this in the book. One of the main issues was that teacher scores were essentially random from year to year. The score ran from 1 to 100, and she found that a teacher who scored a 6 one year might suddenly receive a 96 the next year. You might say it's unlikely that a teacher could go from being among the worst of the worst to suddenly the best of the best, and that was true: the score was being driven by essentially random variance, whereby a teacher might only have a class size of, say, 25 to 30 students. There are a whole lot of factors beyond the teacher's control that impact these test scores, a whole lot of variability in them, and this model was essentially reporting back noise rather than any true estimate of teacher quality.
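To make that year-to-year randomness concrete, here's a minimal sketch of our own, not from the book. It assumes every teacher is equally effective, so any score difference is pure noise, and then ranks teachers on a 1-100 scale the way a value-added rating might; the class size and distributions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers, class_size = 500, 28

# Pretend every teacher is equally effective, so any score difference is
# pure noise: student score changes are dominated by factors outside the
# teacher's control, modeled here as draws from N(0, 1).
year1 = rng.normal(0, 1, (n_teachers, class_size)).mean(axis=1)
year2 = rng.normal(0, 1, (n_teachers, class_size)).mean(axis=1)

# Rank each teacher into a 1-100 score, like IMPACT's 100-point rating.
score1 = np.argsort(np.argsort(year1)) * 100 // n_teachers + 1
score2 = np.argsort(np.argsort(year2)) * 100 // n_teachers + 1

# Near-zero correlation: this year's score tells you almost nothing about
# next year's, so a 6 one year and a 96 the next is unsurprising.
print("year-to-year correlation:", round(np.corrcoef(score1, score2)[0, 1], 3))
```

With classes of only 25 to 30 students, the averaging does very little to smooth out the noise, which is exactly why the rankings bounce around.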
Yeah, I would point out it wasn't completely random, though, Adam, because there was a huge incentive for teachers to cheat the system: teachers knew they would receive a large bonus if their students performed well. There were lots of cases found in the DC area where teachers were actually changing students' standardized test answers to be better and therefore receive a large bonus.

Yes, and if you were a particularly unfortunate teacher, you might find that your batch of students arrived having earned miraculously high test scores the year before, basically putting you in an impossible position where you're going to really struggle to raise those scores enough to get a good rating from the IMPACT system.

Now, I think this example really draws out one of the key lessons from Cathy O'Neil's book: a WMD is not necessarily bad because it's super accurate, Minority Report style. Oftentimes it's bad because it's a really poor predictor and yet is blindly taken on faith by decision makers and used to make really important choices about people's lives. If you were a teacher unfortunate enough to lose your job because of the IMPACT algorithm, a lot of the time you lost it basically because of random noise in the data.

Yeah, one thing that's really evident to me, which I don't know is called out in these words, is that models you cannot test or verify are by their nature very, very dangerous, because if you can't understand the outcome, you can't correct any errors that may exist in the model.

Right. You could ask a hundred different educators what their opinion of teacher quality is and you'd probably get a hundred different answers. And assessing causality is always going to be a tough issue, even for state-of-the-art machine learning models. Now, Cathy O'Neil goes into a number of other examples involving government and WMDs. You have things like the COMPAS algorithm, a recidivism predictor we've actually talked about on the show before, which basically aims to estimate how likely a prisoner is to reoffend once they're released from jail. And for various reasons, this algorithm has all kinds of issues too.

Yes, and it can't go unnoticed that a lot of the algorithms and models that are called out tend to be ones used by the government. I think one of the reasons why is that a key trait of a WMD is, again, that it cannot be validated. The idea that we would take a violent criminal, release them on the street, and just watch what happens to see if our model is well calibrated would have people up in arms. So it makes a lot of sense that these show up in government. I think another thing that appeals to both sides of the aisle is the idea that WMDs can be impartial. On the Republican side, they would say there's a lot of cronyism in government and it's inefficient: let's bring in this cold, hard math to make the decisions and get rid of all these government bureaucrats who perhaps pick the wrong thing.
On the other side, you're going to have Democrats who say there are all kinds of inherent biases in our thinking, and therefore a model can solve some of the problems we see in society.

While I would say that bad algorithms can be used in business too, there they usually answer to some sort of relevant metric, so in general I feel a lot more comfortable with a business using these algorithms than the government. However, the private sector does not get let off the hook in this book. There are a number of examples where private companies have introduced very damaging WMDs. One example that Cathy O'Neil discusses in detail is the advent of personality tests to judge potential applicants when large employers are looking to hire. There's a story in this book of someone who was diagnosed with bipolar disorder but was untreated for it, who then finds they have a really, really tough time getting any kind of job, even an entry-level job at a retail or grocery store, because they are unable to give good answers to these mandated personality tests. And so Cathy O'Neil argues that these are WMDs, and that they dismiss vast swathes of potentially qualified applicants for opaque, personality-based reasons that applicants simply don't understand.

Yeah, the interesting thing here that I didn't necessarily know about is that a lot of these tests seem to focus on questions that don't really have a right or wrong answer. In my familiarity with this type of question, usually you can tell the employer wants to know that you're not going to steal from them or do something else bad. But in this case, the employer will ask, well, would you rather steal or do something else? And so you're basically forced to admit to something bad on the application. I thought that was really interesting. To me, though, the question really comes down to why companies can do this, and I see the answer being the large pool of undifferentiated labor. There's this huge pool of labor out there in the United States right now, and companies kind of have their pick and can force people into taking these tests because there's no other option available.

Right. I remember back in the day taking the Walmart questionnaire in high school when I was applying, and thinking that some of the questions were really, really stupid, and that they basically ask you to choose between two bad answers: say, are you depressed or are you angry, pick one based on a particular scenario. And I'm glad to see my opinion validated that these tests really aren't all that useful, at least in Cathy O'Neil's opinion. Now, like you mentioned, there is sort of an economic driver for these tests. Companies like Walmart and Kroger have large pools of applicants for what are essentially pretty unskilled, undifferentiated jobs. For that reason, they're looking for a way to quickly cut down the pool of applicants, and a quick personality test that doesn't cost a whole lot to administer is a really, really easy way to do this.
Now, the economists, I think, would argue that if these personality tests are WMDs and they're dismissing qualified applicants, then over time that's a market inefficiency that should be exploited by a smarter employer, one that doesn't waste time delivering these tests, or delivers a better version that doesn't have anything to do with an employee's personality traits.

Yeah, right off the top I kind of thought, well, if Walmart is administering a bogus test, then Target should be able to swoop in and take lots of cheap labor off the market. One of the things I thought about a lot with this test, though, is that it's not necessarily that it's a WMD and that it's bad; it's that there's just so much labor out there that even if employers randomly throw away 25% of applicants, they still have more than enough to fill their ranks.

Right, these companies have all kinds of ways to cut down on the number of applicants they have to go through. One method is just to grab a stack of resumes, throw them down the stairs, and interview the people who reach the bottom.

Yeah, but who would want to hire an unlucky applicant?

Right, I mean, it sounds like a perfectly sound methodology. Now, Cathy O'Neil is really arguing that these WMDs are damaging because they are essentially not choosing the best possible applicant for the job, and they're discarding people for unreasonable reasons. She says the personality tests came in because things like intelligence tests were actually outlawed as discriminatory, and it does seem like these tests are of pretty dubious value. Having said that, from the perspective of the companies administering them, there should be a pretty easy way to judge if a personality test is working, which is simply to have a holdout sample: hire some employees without the personality test being considered, and compare their performance to the people hired with the test taken into account, to see if it actually makes any kind of difference whatsoever.
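A minimal sketch of that holdout check, with made-up numbers: hire one group screened by the test and one without it, then compare job performance. If the screened group isn't measurably better, the test isn't earning its keep. The rating scale and group sizes here are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical six-month performance ratings (0-100) for two groups of
# hires. Under the null hypothesis that the personality test adds nothing,
# both groups are draws from the same distribution.
screened = rng.normal(70, 10, 200)   # hired using the personality test
holdout = rng.normal(70, 10, 200)    # hired with the test ignored

t, p = stats.ttest_ind(screened, holdout)
print(f"screened mean = {screened.mean():.1f}, holdout mean = {holdout.mean():.1f}")
print(f"t = {t:.2f}, p = {p:.3f}  (a large p means no evidence the test helps)")
```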
Yeah, again, the WMD here is not so dangerous to me personally. The idea, though, that if we don't validate it, it could sort of run off the rails, I think is what you're getting at, Adam.

Right, but at least in the private sector, you would hope there'd be some way of measuring these WMDs against a business outcome that makes sense, although you could argue that the business outcome might not be good for society as a whole. But when it comes to hiring, it's hard for me to accept that these personality tests are any worse than what a company might have been doing beforehand. Cathy O'Neil mentions in her book that a lot of the time, before these tests, hiring was basically done on who you know: if you could get a referral at, say, Walmart or Kroger, that would give you a foot in the door. So it seems like hiring has always been a fairly arbitrary process, a luck of the draw depending on who your parents are, who you know, whose friend works at a store. And I don't think personality tests, though imperfect, are necessarily making any worse decisions than were being made previously.

Yeah, one thing that's also called out in the book, though, is that these WMDs tend to adversely affect the poor. The idea being that if you're wealthy, if you're in the upper tiers of society, people will look at your application individually. They'll talk to you, they'll understand you as a person. However, if you're applying for a job at Walmart, you're simply a number in the system: do you fit the criteria or not? If you don't, you're thrown out. And really, the thing that Cathy focuses a lot on is that you're never told why you failed. You're never given any reasons why the system rejected you.

Yes, if we think back to her first tenet of what WMDs are, it's that they're opaque. You might be rejected by a personality test and have absolutely no idea why. The company isn't going to get back to you and say, these are the particular questions on the test that you scored badly on; you're simply rejected and given absolutely no reason for it. And yes, like you said, Matt, there's absolutely no human you can turn to. The theme of social welfare is very prominent in this book, I would say. Thinking back to Cathy O'Neil's background, you mentioned she was working at D.E. Shaw, which is a hedge fund; in fact, after she stopped working there, she got involved in the Occupy movement around the time of the financial crisis. And she makes a strong push to say that WMDs, widespread algorithms at scale, are bad for the poor and discriminatory. They feed these pernicious feedback loops where people who are in bad situations are put in even worse situations because of an algorithm's predictions.

Yes, there are a ton of examples of WMDs disproportionately affecting the poor. Right off the top is predictive policing, the idea that if there are more crimes in a neighborhood, police are sent to that neighborhood. Unfortunately, cops oftentimes end up making a lot of minor arrests. Police are sent to a neighborhood to stop violent crime, but what they end up doing is arresting lots of people for minor drug offenses that weren't really part of the model. That's going to train the model and create a feedback loop, if you will, whereby more and more cops are sent into this neighborhood because they're making lots of these minor arrests. And in general, that's going to affect the poor disproportionately.
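Here's a toy simulation of that feedback loop, ours rather than Cathy's: patrols are allocated by last period's arrest counts, and the key assumption (the 1.2 exponent below is invented) is that more patrol presence produces slightly more than proportionally more minor arrests, because extra officers witness offenses they otherwise wouldn't.

```python
import numpy as np

# Two neighborhoods with identical true crime rates.
true_crime = np.array([1.0, 1.0])
patrols = np.array([0.55, 0.45])   # small initial imbalance in coverage

for period in range(10):
    # Assumption: arrests grow slightly faster than linearly with patrol
    # presence, because extra officers also pick up minor offenses they
    # would otherwise never witness.
    arrests = true_crime * patrols ** 1.2
    # The model reads arrest counts as "crime" and reallocates patrols.
    patrols = arrests / arrests.sum()

# The 55/45 starting split has drifted to roughly 78/22, even though the
# two neighborhoods have identical crime: the model trains on its own output.
print("final patrol split:", np.round(patrols, 3))
```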
Next, after you've been affected by predictive policing, is recidivism. As we mentioned, if you've been put in jail and we're trying to decide if you should be released, one of the things that goes into the model is what neighborhood you're from, who your friends are, who you associate with in general. And because you come from a low economic background, we may actually keep you in jail longer than someone at the upper end of society. Another one I really dislike is for-profit colleges. For-profit colleges specifically go after poor applicants who lack knowledge of the system. For example, they don't really understand that, in general, an employer considers a for-profit college degree about on the same level as a high school diploma; it doesn't really add a lot. But by delivering targeted ads to these people, for-profit colleges try to grab these applicants and bring them into the system without telling them much about what's going on. And then, finally, there's the idea of a credit report.

So everyone knows about credit reporting and a FICO score, but people who are more advantaged economically can generally correct any errors that appear on the report. They probably pull their scores regularly, or have some sort of program that monitors them through a credit card company or something like that. People at the lower end of the economic ladder don't do this at all. They don't know when there are mistakes on their credit report, or even what their credit might be. So this is another one that affects people differently based on their income level.

Right. There are all kinds of examples Cathy goes into here of how the poor are basically being screwed by these WMDs, which do not have their interests at heart and are making their lives harder. I think a fantastic example she goes into in this book comes from operations research, basically around how you schedule a workforce optimally. One of the tenets that came out of operations research is that you need very flexible scheduling: you need to be able to staff the store up during peak times and down when people aren't really visiting. For that reason, you'll have a lot of employees who are essentially on call, who have to come in to a store on extremely short notice, and who often won't have much time between shifts. There's one concept that comes up called clopening, which is when you close the store one night and then have to be back to open it the next morning. That's obviously a brutal schedule for an employee, leaving no time to sleep, to spend with their family, or anything like that. The operations research algorithm has basically decided that this is the best schedule to maximize efficiency, but it doesn't take employee welfare into account at all.

And I think what Cathy O'Neil would also say, in addition to this, is that there's a societal cost we bear from these algorithms. The scheduling algorithm is working out great for the company's interests, deploying resources efficiently as the company sees it. But by creating employees who don't have enough time to be home with their kids or get proper health care, you're imposing a burden on society, a negative externality in economic parlance, that is going to have to be paid for by society rather than the company itself. That's a call for why we need to be cognizant of these algorithms and potentially step in to make sure companies aren't exploiting their employees with these schedules.
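To see how directly the welfare issue comes down to the objective, here's a hypothetical scheduling sketch with invented shift costs: the optimizer happily assigns clopens until employee rest carries an explicit penalty in the cost function.

```python
import itertools

SHIFTS = ("open", "mid", "close")

# Hypothetical per-day cost of each shift assignment for one employee
# (lower = that shift is more needed that day). Numbers are invented.
day_cost = [
    {"open": 2, "mid": 1, "close": 0},  # Mon: closer needed
    {"open": 0, "mid": 1, "close": 2},  # Tue: opener needed
    {"open": 2, "mid": 1, "close": 0},  # Wed
    {"open": 0, "mid": 1, "close": 2},  # Thu
    {"open": 1, "mid": 0, "close": 1},  # Fri
]

def cost(week, clopen_penalty):
    """Staffing cost of a week, plus a penalty per 'clopen' (a close
    followed immediately by the next day's open)."""
    staffing = sum(day_cost[d][s] for d, s in enumerate(week))
    clopens = sum(1 for a, b in zip(week, week[1:]) if (a, b) == ("close", "open"))
    return staffing + clopen_penalty * clopens

# Brute-force the cheapest 5-day schedule, first ignoring clopens,
# then penalizing them.
for penalty in (0, 10):
    best = min(itertools.product(SHIFTS, repeat=5), key=lambda w: cost(w, penalty))
    n = sum(1 for a, b in zip(best, best[1:]) if (a, b) == ("close", "open"))
    print(f"penalty={penalty}: {best}, clopens={n}")
```

With penalty 0, the cheapest week alternates close and open, producing two clopens; with the penalty, the optimizer finds a schedule that's slightly more expensive on paper but clopen-free.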
Yeah, I agree with Cathy that things like clopening are really bad for society. I think the place I might differ a little bit is the solution. I don't necessarily agree that simply closing off this loophole in the models, or telling models they can't do this kind of thing, is a great idea. In my mind, this phenomenon occurs because there's a huge supply of undifferentiated labor. And if you were going to ask me what's been more harmful, WMDs or something like NAFTA, I would have to say NAFTA has been far more harmful to people on this economic rung.

So for me, personally, one of the ways we could get out of this is perhaps more training, things like that. I would be 100% in favor of restricting some of these practices, but I'm not sure that just putting a hard restriction on clopening is the answer, because the problem is much deeper than the model itself.

Yes. There are some other examples Cathy O'Neil gets into where I would say I'm less sympathetic to her argument that the current algorithms being used are WMDs. A lot of this has to do with looking back at history and seeing how much worse things were before any of these algorithms were invented. One of the areas she talks about is finance, insurance, and credit lines, where she discusses how companies use algorithms to judge people in certain areas as high risk and deny them loans or insurance or whatever they might be selling. Now, this practice is absolutely nothing new. It goes back to, say, the 1930s, with a practice called redlining, whereby companies would essentially draw red lines around neighborhoods they would not offer their financial services or insurance products to. The term was actually coined in the 1960s, but the practice went on for decades, whereby if you lived in, say, an African American area, you would be denied access to certain financial products. Now, how does this relate back to the present day? Cathy O'Neil basically says companies are using things like zip codes to determine someone's, say, default risk on a loan. Because of that, if you live in a certain area today, you might not be given as good a deal because of where you live, or because of other proxies a company might have about you.

Yes, I agree with you, Adam. I think one of the things we're really ignoring in this discussion is the counterfactual: what would have happened if we didn't have these models, or what actually happened before. There's no doubt in my mind that redlining was much worse than what we have today. I think we oftentimes like to focus on the individual misses and not see the improvement we've made as a whole. I would also say that, in the realm of credit scores, one thing that has actually been really good is that instead of discriminating on broad demographic targets, we can go down to the individual level and ask, is this individual right for the loan? So I think I disagree with Cathy here, mostly because I think these models are now sophisticated enough to discriminate based on an individual's behavior and make the right decision.

Right. I think a lot of the examples of harm delivered by these algorithms in the book come about because the algorithms are simply not good enough. They're still reliant on proxies like neighborhood to judge someone's credit risk, and because of that, they're necessarily passing over a bunch of people in the neighborhood who should be qualified but whom the algorithm is missing. One thing I would say here, to put on my economics hat again, and I know econ as a field gets dragged incessantly, is that we'll go back to the theory once more.
That is to say, if you reduce some of the regulation on financial products, you should get new entrants into the marketplace who can chase out the inefficiencies of traditional companies relying on these broad algorithms that basically write off entire areas of the country. For this reason, I don't think heavy regulation of financial products is necessarily a good idea, because then you're just stuck with the traditional incumbents you have today, who are never going to deliver a good deal to these people.

Yes. I think what you're talking about here is an uncanny valley, if you will, whereby the models are accurate enough to make some broad prediction, but not accurate enough at the individual level to differentiate individual people. And I think it's worth discussing what it would take to move some of these WMDs, weapons of math destruction, into something that is actually broadly beneficial, at least in Cathy O'Neil's view. The three tenets, once again, of these WMDs: they're opaque, they're at scale, and they're damaging. Now, the opaque part, I think, is always going to be difficult to fix, at least in the private sector, because companies want to hold on to their IP; if something like a personality test for employers is publicly visible, then it's easily gamed. But I do think you can do something about the third component, the damage. And Cathy makes a good point here that big data is not intrinsically good. A big data model is simply going to minimize whatever objective function you give it, and in the case of a private company, that is most likely something that's good for the company, not necessarily good for society as a whole.

Yeah, a model simply wants to minimize whatever that objective function is; it can be anything at all. And if we don't tailor it very carefully, we can wind up with a lot of systems and solutions that we don't like. So we have to think very carefully, and I agree with Cathy here that data scientists need to take a lot more responsibility for the types of models they build and the questions they're asking.
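A stylized illustration of that point, with invented policy names and numbers: the optimizer has no opinion of its own, it just returns whichever option scores best under the objective we hand it.

```python
# Each candidate policy maps to (profit to the company, harm to workers).
# The policies and numbers are invented for illustration.
policies = {
    "aggressive_scheduling": (100, 40),
    "stable_shifts": (80, 5),
    "generous_staffing": (60, 0),
}

def best(objective):
    """Return the policy that scores highest under the given objective."""
    return max(policies, key=lambda p: objective(*policies[p]))

# Objective 1: pure profit. Objective 2: profit minus the externality,
# weighted by how much we decide worker welfare counts.
print(best(lambda profit, harm: profit))             # -> aggressive_scheduling
print(best(lambda profit, harm: profit - 2 * harm))  # -> stable_shifts
```

Same model, same data, different objective, different decision: the ethical content lives in the weight on harm, which no amount of optimization can choose for you.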
Yeah, don't just say, this is a cool technique, we should use it; think about what the actual outcome is going to be for the people affected by the algorithm's decisions. But there are examples in the book which I'm somewhat dubious of, designed, I think, in Cathy's mind to illustrate just how bad these WMDs could be. One of the things she talks about is hypotheticals: what if we had had WMDs in, say, the 1960s? A WMD, for example, designed to admit college applicants to prestigious universities. Rather than admissions officers going through applications, reading personal essays, and interviewing candidates, what if we just had an algorithm do it? Cathy's contention is that you would have seen very few women ever go to prestigious universities, because the model would learn that men, as before, were the ones who went there. Gender on an application form would become the determining factor of whether or not a student was admitted. So she presents this as an example of a negative feedback loop in WMDs.

Yeah, this example confused me a lot. As a data scientist, one of the things that was glaringly obvious to me was that even if only very few women were admitted to these universities, if they performed well, or even performed better than a certain percentage of students at those universities, the model would really quickly pick up on that and begin admitting more women. So it's not necessarily true that the model is stuck in whatever rut it falls into; it's going to adapt to the data that's coming in.
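A sketch of Matt's contention, with simulated data: a model retrained on actual student outcomes learns that gender carries no signal once entering GPA is accounted for. The catch, which is Cathy's point, is that this only works if real outcomes are fed back into training, and if enough women are admitted to generate that outcome data in the first place; all distributions below are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000

# Simulated admitted cohort: gender (1 = woman) and entering GPA.
gender = rng.integers(0, 2, n)
gpa = rng.normal(3.0, 0.4, n)

# Assume actual success at the school depends only on GPA, not gender.
success = (gpa + rng.normal(0, 0.3, n) > 3.0).astype(int)

model = LogisticRegression().fit(np.column_stack([gender, gpa]), success)
print("gender coefficient:", round(model.coef_[0][0], 3))  # ~0: no signal
print("gpa coefficient:   ", round(model.coef_[0][1], 3))  # strongly positive
```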
Right. If I were defining a WMD, I would say they're at scale and they suck: they don't make good predictions, and they miss obvious causal factors that should be taken into account. In this case, a model that looked at the GPA distribution at prestigious universities would see that plenty of men perform really, really badly once they get to whatever school they're attending, and for that reason, gender should not be the single determinant of the admissions process. Moving on, there's a limit, I think, to the effectiveness any one algorithm could ever actually have. We talked about this when we reviewed The Master Algorithm by Pedro Domingos, which discussed the search for a hypothetical all-knowing algorithm that could extract the maximum amount of knowledge from any data input it was given. And I think what would disappoint a lot of people is that if you took the data feeds being used to train a lot of these WMDs and ran them through a perfect algorithm, your decisions actually wouldn't be that much better. There's only so much knowledge you can extract from, say, a person's age and various demographic variables when trying to make a credit decision; there's only so much those typical variables could ever tell you. To actually build an algorithm that made perfect predictions, that was all-knowing and could avoid discarding anyone who would have been a good borrower because of some demographic factor, you would need an incredibly personal data source. You would essentially need the applicant to have no privacy whatsoever, because you need to extract so much data from them to feed this perfect model.

Yeah, one of the realms that's called out is the NBA, or MLB, in terms of their use of sports analytics. But my thought here was that these are really constrained problems. You need to be good at basketball or good at baseball. That says nothing about whether somebody is going to be good at paying their bills on time or effective at their job. So it's very difficult for me to compare the two and say they work. The other thing I would call out is that if you ever watch the combine or the draft process, you'll realize that the professional teams interviewing potential players are extremely intrusive. They consider no question off limits, and they will ask a potential athlete absolutely anything. So in that regard, I thought it was kind of funny, and I think she may be ignoring a little bit of what goes into becoming a professional athlete.

Right. Baseball in particular is held up in this book as the gold standard of what good analytics and good algorithms should be. But like you said, baseball has all the data; teams have, for the most part, all the relevant data on a player to decide if they're going to be a good performer. And yet they still make tons of mistakes. Teams still hand out free agent contracts that end up looking absolutely abysmal a year later. You end up with situations like pitch framing, whose value went unmeasured for years and has only recently come to the forefront of analytics. There are all kinds of mistakes made even in a field like sabermetrics, where all the data should be available and the smartest minds work; Nate Silver even worked there once. They should be doing a whole lot better than you might expect.

Yeah, before the burrito bracket, he was actually doing a lot of analysis of Major League Baseball. The other thing I'll call out here is that Cathy O'Neil complains a lot about the over-generalization that happens in models, the idea that we take data and apply it to broad swaths of society. But on the other hand, she also doesn't feel comfortable with increased data collection, for example, adding a monitor to your car that gives you a potentially cheaper insurance rate at the cost of handing all your data over to the insurance company. And to that, my response really is: you can't have it both ways. Either we share more data and have less privacy, or we give up making more exact decisions. It's one or the other, and there's no real in-between.

Right. You don't want a model that discriminates on some factor like race or socioeconomic status, but I don't think many of us want to give up the amount of data we'd need to create a near-perfect algorithm either. One take I have, which I think Cathy O'Neil would probably disagree with vehemently, is that the answer to a lot of WMD-type issues is to loosen the amount of regulation in an industry rather than increase it. Look at a lot of the WMDs we've talked about: they're in areas like government or finance. In the case of government, there's essentially no competition; government decision makers can install whatever algorithm they see fit for the problem area. In the case of finance, it's a highly regulated area where it's difficult for a new entrant to come in. I think that level of regulation actually contributes to the prominence of WMDs in these cases, because you don't have competitors coming in with better, more accurate, more finely tuned models to drive out the glaring inefficiencies of the models being deployed right now.

Yeah, there are a couple of sections in here where Cathy O'Neil starts to really feel like an academic to me. In particular, there's one example where she talks about inefficiencies in car insurance rates and points out that there are lots of instances where people are penalized for their low credit ratings even though they have really good driving records.
And she uses this as an example to say, look how harmful WMDs are, and points out that the company has no incentive to change the model, because it can make all this additional profit from people who have low credit scores but are in fact good drivers. Now, the thing that jumped out in my mind when I read this was: no, that's not true. If there are good drivers out there who won't get into accidents, but other insurance companies are scoring them poorly and therefore charging them more, there's an opportunity for a company to come in and capture that additional profit. So I think this is one where, if you have an open enough marketplace, there should be some opportunity for companies to correct inaccuracies in these models.

Right. Cathy's major point here is that WMDs driving fabulous profit margins are likely to multiply across industries, because companies will see how well these opaque, scalable algorithms are doing for certain firms. But when you think about competition, I would say an algorithm that is generating high profit margins is charging customers too much, and is likely to be driven out by a better WMD that comes in and offers something better from the customer's point of view: a lower price, driving those profit margins down to zero. But to get that, you need less regulation, you need competition, you need people able to deploy these better algorithms in the field.

Yeah, one caveat I have here is that, while in some ways I agree some regulation is probably needed, I don't trust the government at all to get these regulations right. This is a really complex field. There are a lot of researchers and data scientists working in it who don't understand all the implications of their own models. So do I trust someone in Washington who has never run a model in their life to make these decisions? That's where I start to feel a little queasy about the idea that government regulation can solve this problem, because this is a very technical, abstract topic that's going to be really hard for a lot of people to get their heads around.

Yes, but having said that, I do think this book is a sterling call to data scientists, like we've mentioned, to actually think about the impact their work is having on the world. In a lot of ways, it's a call for ethics more than a call for better algorithms, because a lot of the examples mentioned in this book aren't really algorithmic at all. There's a long section, which I very much agreed with, about the unfairness of stop-and-frisk policies in New York City. Stop and frisk was basically an example of broken windows, zero tolerance policing, where cops would go around, find anyone who looked suspicious, and pat them down to see if they had drugs, weapons, anything like that. It jacked up arrest rates massively in poor areas, with the idea that if you crack down on these very minor crimes, you would also bring down the major crime rate. But what Cathy says, and I think she's right on this, is that you get a very unfair justice system, where people in rich areas are not subjected to the same searches.
And if they were subjected to the same searches, they would probably turn against stop and frisk very, very quickly. But stop and frisk itself is not an algorithm. I don't think it really qualifies as a WMD, but it's an example of something where you do need to consider this principle of fairness and ethics in your policymaking decisions.

Yeah, I totally agree. I think this is an area that is sorely lacking in data science today. I don't know about you, Adam, but I never had a single course about how to use data science ethically, though I'm sure such courses are now starting to pop up. So I think we need to make a lot of progress here as an industry and as a field, and I think that will come. The other thing you called out that I think is really important is the idea that one segment of society should not impose models and decisions on another segment of society that they themselves are not subject to. So, as the example is used in the book, you should not say that a policy like stop and frisk is a good idea unless we're also willing to do it in wealthy neighborhoods. The idea being that if you started doing this in the Gold Coast neighborhood of Chicago, you'd get lots of these minor drug arrests too; they'd just be of wealthy people instead of the poor. So we need to be very careful about how these models are applied, and whether they're applied unfairly across different segments of society.

Yeah, I think the application is the key thing for these algorithms. There's one telling example in the book where Cathy talks about a 2013 algorithm in Florida that was basically designed to find child abuse victims, kids who were having a bad time at home, maybe with terrible parents, and was used to funnel resources to those cases. Now, what's interesting here is that you might think of this as a classic WMD. It's opaque: the parents involved certainly don't know how the algorithm works or how it's targeting them. It's used at scale: it's deployed across a state. And it's certainly dangerous: in the case of, say, a false positive, you'd be taking a child out of a home where there's nothing wrong with the home situation, separating parent from child. But Cathy actually says, no, this is a good algorithm, because it's being used to funnel resources in a positive manner rather than just taking kids out of homes. And to me, what that says is that WMDs, or the definition of WMDs, is not really about the algorithm itself; it's about how the algorithm is applied and what it's used for.

Well, we can't talk about this without mentioning China's social credit score. So, Adam, I would say this is one area where it seems like you're pretty sure it's going to be used incorrectly.

Yes. Yes. I like to rail against the AI researchers who are not only much more highly paid than I am, but who also have the potential to develop these massive algorithms that will impact people's lives in potentially very dangerous ways. When we talked about China's social credit system on one of our earlier segments, I mentioned how researchers at companies like Baidu need to think about, if you're developing this amazing, say, facial recognition system, what are the surveillance applications of that? Could it be used to identify dissidents? Could it be used to put people in jail? And are you really comfortable with your work being used that way? I think it's going to become more and more important for researchers in public policy areas and in private industry to be cognizant of what they're doing. And I think you see a lot of the data science leaders now waking up, or at least becoming more vocal about ethics. There was a nice tweet earlier from Hadley Wickham asking about papers and presentations on ethics in data science, and I think it's really good that people are starting to take this issue seriously.

Yeah, I would say the ethics section is really the gem of this book. As you pointed out, Hadley Wickham is very much interested in this, and I think it has started a larger conversation. The suggestion the book gives us is that we should think about the relative harms these models create; for example, we need to weigh the false positives against the false negatives to understand the outcome. Now, it's somewhat unclear how this would work in practice, but today we're letting all the burden fall on researchers. I think the reality is that we need to place a lot more responsibility on them to do things correctly, or to have some sort of overarching industry standards for how they think about the data they're using.
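A minimal sketch of that weighing, with invented scores and costs: rather than classifying at a default 0.5 probability cutoff, pick the threshold that minimizes total expected harm given explicit costs for each kind of error. Choosing those costs, here a false positive counted as five times worse than a false negative, is exactly the ethical judgment the book is asking for.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated model scores for 1000 cases, plus ground truth (invented data:
# positives score around 0.75 on average, negatives around 0.45).
truth = rng.integers(0, 2, 1000)
scores = np.clip(truth * 0.3 + rng.normal(0.45, 0.2, 1000), 0, 1)

# Invented costs: a false positive (e.g., wrongly removing a child from a
# home) is judged five times as harmful as a missed case here.
COST_FP, COST_FN = 5.0, 1.0

def total_cost(threshold):
    pred = scores >= threshold
    fp = np.sum(pred & (truth == 0))   # flagged but actually fine
    fn = np.sum(~pred & (truth == 1))  # missed true cases
    return COST_FP * fp + COST_FN * fn

thresholds = np.linspace(0, 1, 101)
best = thresholds[np.argmin([total_cost(t) for t in thresholds])]
print("harm-minimizing threshold:", round(best, 2))  # well above 0.5 here
```

Flip the cost ratio and the optimal threshold drops below 0.5: the model is the same, but the deployment decision changes with the harms you choose to count.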
Right. So I would say that, overall, this book does a really good job of reminding data scientists that they actually need to be ethical in their practice, which is a lesson that perhaps shouldn't need to be taught, but nevertheless, I'm glad Cathy O'Neil covers it. Of the WMDs in her book, I don't think Matt and I have necessarily been convinced of the dangers of all of them, or at least of their danger relative to what came before. But I think we would agree that a model that is widely deployed, isn't accurate or verifiable, and doesn't have a good feedback loop to update itself, like we saw in the teacher example, is a really bad thing for society, and we should try to root those out. And even in the private sector, if an algorithm is, say, making people's lives miserable, like we talked about with clopening and the scheduling algorithms out there, those are a pretty bad idea too. So overall, I would highly recommend giving this book a read. It's a pretty quick read, it's a pretty easy read, but I think you'll learn a lot. This has been Random Talkers, once again available on YouTube, SoundCloud, iTunes, all kinds of places, really. Make sure to subscribe, leave a review, leave a comment. Thank you for listening, and we'll see you next time.