Thank you, I'm really psyched to be here, and I'll tell you as we go through the talk why I'm particularly happy to be talking to you today. As was mentioned in the intro, I've been at this for coming up on 30 years. I started in 1991 in El Salvador building databases, and I think, like a lot of data people, my career has gone from SQL to machine learning. That's the arc a lot of us have. In the course of working with these various partners, whose logos I'm putting up behind me, from truth commissions in places from El Salvador to South Africa to Sierra Leone to East Timor, to international criminal tribunals (and I'll talk about a few of them), to local groups, groups looking for their lost family members who've been murdered and their bodies hidden, one of the things we work on every single time is trying to figure out what the truth means. What is it that happened, and how can we use that to try to bring an end to this kind of violence?

But to do that, we get to use all these really cool tools, and I've never shown these slides before. I give talks like this a lot, but I haven't had a chance to thank the people who write the back-end stack, all the stuff that we use every day, and the front-end stack, the stuff that we use to do the analysis. Thanks so much; I really appreciate it. So if any of you have committed code to any of these projects, thank you very much. I really, really appreciate it, and so does my team.

When we do this kind of work, we're always facing people who apologize for mass violence. They tell grotesque lies that they use to attempt to excuse this violence. They deny that it happened; they blame the victims. This is common, and it is common, of course, in our world today. What human rights campaigns are able to do, the reason that we're sometimes successful, is that we speak with the moral voice of the victims. That moral voice is founded on our claim that what we're saying is true. We have to be right, and in particular, the statistics have to be right. And by statistics, I want to include machine learning, which I'll go along with some of the Stanford folks and call statistical learning. We have to get it right, and I want to explain a tiny bit about what I mean by right, and what happens when it isn't right, how that can go horribly wrong.

So I'm going to give you three examples, and let me start with the trial of a really, really terrible war criminal. I mean the one on the left in white, not the other guy. I didn't get to try him. But I did participate, as an expert witness, in the trial of the former president of Chad, Hissène Habré. What we had in that trial were thousands of documents that were originally found by a researcher from Human Rights Watch, Reed Brody, in the top photo, as a pile of trash discovered in an abandoned prison. When the documents were reviewed, they turned out to be the operational records of the secret police. So those documents were cleaned up, they were organized, and we took about 12,000 high-resolution photographs of them. What we did with the documents in particular was focus on one kind of document, the daily situation report. In the daily situation report, we had the number of prisoners held at the beginning of the day and the number held at the end of the day. The differential, the delta between those two values, is accounted for by the number of prisoners who were released, new prisoners who were received, prisoners who were transferred to other places, and prisoners who died through the course of the day. If you divide the number of people who died through the course of the day by the number alive in the morning, you get the crude mortality rate, and this turns out to be a crucial piece of the analysis.
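Just to make that bookkeeping concrete, here is a minimal sketch of the situation-report arithmetic; the counts are invented for illustration, not taken from the Habré prison documents.

```python
# Sketch of the daily situation report accounting; all numbers are hypothetical.
def crude_mortality_rate(alive_at_start, deaths):
    """Deaths per hundred prisoners per day."""
    return 100.0 * deaths / alive_at_start

alive_at_start = 400
released, received, transferred_out, deaths = 6, 10, 3, 2

# The end-of-day count is fully explained by releases, receptions,
# transfers out, and deaths; that delta is what the reports record.
alive_at_end = alive_at_start - released + received - transferred_out - deaths
assert alive_at_end == 399

print(crude_mortality_rate(alive_at_start, deaths))  # 0.5 deaths per hundred per day
```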
So first, you can just plot over time the number of prisoners held by day, and you can see that at one point there's an enormous influx of new prisoners. Most of these are prisoners of war from the war that Chad was waging against Libya in the desert. Their status as prisoners of war turns out to be critical for the legal regime that we can bring to bear in the trial of former President Habré, because the crude mortality rate that I mentioned earlier was extraordinarily high. In fact, it peaked at about 0.62 prisoners per hundred per day. That's an extraordinary mortality rate. For those of you who are not mathematical demographers, let me give you some context. That's 90 to 540 times greater than normal adult male mortality in Chad. It's worse than other prisoner contexts: for example, it's 1.3 to 4.1 times greater than the mortality rate of U.S. prisoners in Japanese custody during World War II. That's relevant because at the Tokyo Tribunal in 1948, after World War II, the treatment of U.S. prisoners by Japanese authorities was judged a war crime. It was also approximately that much greater than the mortality of German prisoners in Soviet custody during World War II. So this is very, very high mortality. And that image is a Senegalese newspaper drawing of me. I love that drawing; it's my avatar all over the net now.

The verdict in the case, handed down by the Extraordinary African Chambers, cited this evidence in particular as a way of rebutting the defense claim that mortality in the prison was somehow normal, that it was somehow reasonable. And I want to tie this back to my original concern. What we're doing in human rights data analysis is trying to push back on grotesque lies. We're trying to push back on apologies for mass violence. And in fact, the judges in the case saw precisely that usage and cited our evidence for that purpose, to reject President Habré's defense that conditions in the prison were nothing extraordinary. So that's a win. And I want to mark that as a win, because human rights doesn't get very many. Human rights is a tiny community of activists and community members and the families of the victims and the survivors of mass atrocity, pushing back on the full weight of governments. So we don't get many wins, and when we do get one, we celebrate it. It's important. This is a former head of state who will spend the rest of his life in prison. So, yay.

Now, let me give you another example. This is a project that's very much in progress. Here we're looking for fosas clandestinas, hidden graves, in Mexico. These hidden graves often contain the bodies of people who've been disappeared: people who've been kidnapped from their families and murdered, and whose bodies have been hidden. Their families would very much like to know what happened to them. So let's model it as a machine learning problem. Where are those graves? Let's figure out how we can use some kind of ML model to predict where we are likely to find those graves, in order to guide search resources, to prioritize where we are going to look for bodies.
There are 2,455 municipios, or counties, in Mexico, and each year we learn of about 75 counties in which graves are found. Now, there are lots more graves than that. But you can imagine that if people in your community are regularly being abducted and murdered and never turn up, you might be reluctant to make a complaint to the local police, whom you might suspect of being involved in those disappearances. So we have a lot of disappearances, and we have a lot of graves that go unreported. We have another 75 or so counties in which we can be sure, because of good contacts in the local communities, that there are not graves. These are peaceful places. Mexico's violence is highly heterogeneous; it's distributed very unevenly. So now we have positive cases, we have negative cases, and we have tons of independent information about counties. In an ML context, these are features; to a statistician, these are covariates. So let's just model it. We'll randomly split the cases we know about 50-50 into training and test sets, train up a model, predict the test data, and then iterate that split-train-test process a thousand times. And what we find is that, over the course of the four years we've been looking at, more than a third of the time we can perfectly predict the counties that have graves.
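For readers who want the procedure spelled out, here is a rough sketch of that repeated split-train-test loop. The feature matrix X and labels y stand in for the county-level covariates and the known positive and negative counties; the random forest, and the use of a perfect AUC as the "perfectly predicted" criterion, are illustrative choices of mine, not necessarily the exact model we used.

```python
# Sketch of the repeated 50/50 split-train-test procedure described above.
# X: one row of county-level covariates per labeled municipio;
# y: 1 = graves known, 0 = confidently no graves.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def iterate_split_train_test(X, y, n_iter=1000, seed=0):
    rng = np.random.RandomState(seed)
    aucs = []
    for _ in range(n_iter):
        # fresh random 50/50 split each iteration
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.5, stratify=y, random_state=rng)
        model = RandomForestClassifier(n_estimators=200, random_state=rng)
        model.fit(X_tr, y_tr)
        scores = model.predict_proba(X_te)[:, 1]
        aucs.append(roc_auc_score(y_te, scores))
    aucs = np.array(aucs)
    # mean AUC, and the share of iterations that separate the classes perfectly
    return aucs.mean(), (aucs == 1.0).mean()
```

That last number, the share of iterations with perfect separation of the held-out positives and negatives, is the kind of "how often do we predict perfectly" summary I quoted above.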
Now, in recent years we have two ways of learning about the discovery of graves. One is that graves are reported by family members or by the media; that's one way of identifying graves. A second way is that some prosecutors' offices, some local prosecutors, are looking for and finding graves, and we're hearing about it. So we have two different sources, and we train two different models. And I'm going to emphasize that we are talking about two different models, because now we can look at the interaction of those two models and find out that there are different kinds of patterns of graves.

Now, this is a pretty complicated slide, and this is a fairly short talk. What I want to emphasize is just the stuff on the bottom: the white dots are the counties in the test data that we strongly believe do not have graves, and indeed both models are clear that the probability of finding a grave in those counties is low. The light blue circles are held-out data from the prosecutors' offices, and the red dots are held-out data from the media and the families. In both cases, the model that is trained on that kind of data does a pretty good job of predicting that kind of grave. But you can see there are actually different kinds of graves going on here, and I want to highlight this: machine learning models are really good at predicting things that are like the things they were trained on. Machine learning is really good at assuming that the future is like the past, conditional, of course, on all the features, the covariates. But when things are really different, there are cases in which, for example, the prosecutors' offices find a grave, but the model trained on the kinds of graves that appeared in the family and media reports is not so good at predicting it. Now, for the purposes of the application we built these models for, the most interesting cases on this graph are the green dots up in the upper right. The green dots represent counties in which neither reporting mechanism reported a grave, but both models predict a high probability of finding graves. Those are places we should go search.

And when we showed those counties to people who know a lot about violence in Mexico, they were like, oh yeah, oh yeah, there are graves there for sure. So it's not surprising; why bother doing it? The reason it's worth doing, even though it's not surprising, is that this is an incredibly powerful advocacy tool. This is a way of demanding that state authorities in fact go and look for graves, because the state authorities are often reluctant to do that. They're reluctant for a variety of reasons, but the most important one is that the people who go look may themselves be at risk. If they can put a machine learning model out front, they can say, hey, look, I'm just going with the model. It's a computer; I'm just dumb. And that's a very powerful tool. We can come up with visualizations that distribute the probability of finding mass graves by county across Mexico, and this can generate press attention and then help us in the advocacy campaign to bring state authorities into the search process. So that's helpful, that's useful, that's valuable: that's machine learning contributing positively to society.

Does that mean that machine learning is necessarily positive for society? Yeah, no, no. In fact, many, many machine learning applications are terribly detrimental to human rights in society, and that's going to be my final example. I want to talk about predictive policing. Predictive policing is catastrophic. Predictive policing is the use of machine learning to predict where crime is going to happen in the next iteration of the model. And it uses police records to learn patterns about crime? No, of course not. It learns patterns about police records. Now, there's a big difference between police records and crime, and we're going to talk about that through the course of this example. Using these patterns, will the model then predict the most likely locations of future crime? No, the model is going to predict where crime will be detected in the future. So then, in the application and use of these models, additional police resources are dispatched to the locations the model has predicted, in order to prevent crime, or to do what? This is a big problem, and let me explain what happens.

So let's look at drug crimes in Oakland. I'm from San Francisco, and so we looked at the Bay Area first. On the right, this blue pattern shows you a heat map of the density of drug use in Oakland based on public health surveys, so this is completely outside the criminal justice process. As you can see, the highest use of drugs is in the corner in the far north, which is close to the University of California; perhaps not at all surprising, since university students tend to use a lot of drugs. But for the most part, drugs are ubiquitous in Oakland. I'm from the Bay Area; that's not surprising to me. Drug arrests, however, are not distributed evenly across Oakland. In fact, they're concentrated in this corridor along the western edge, and for those of you who know the Bay Area a little bit, those are the flats, that's International Boulevard, a primarily minority part of the city. So drug crimes are everywhere, but drug arrests are not.
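You can quantify that mismatch directly, before any model is involved. Here is a toy comparison with invented neighborhood-level numbers, not the actual Oakland data: survey-based use rates are roughly flat, while arrests per estimated user vary by an order of magnitude.

```python
# Toy comparison of drug-use estimates (public health surveys) versus drug
# arrests (police records), by neighborhood. All numbers are invented.
import pandas as pd

df = pd.DataFrame({
    "neighborhood": ["flats", "hills", "downtown", "north"],
    "population":   [60_000, 40_000, 30_000, 50_000],
    "est_users":    [ 6_000,  4_200,  3_100,  5_600],   # survey-based estimates
    "drug_arrests": [   900,     60,    150,     90],   # police records
})

df["use_rate"] = df["est_users"] / df["population"]
# If enforcement tracked use, arrests per estimated user would be roughly
# constant across neighborhoods; large differences signal sampling bias
# in the records any model will be trained on.
df["arrests_per_user"] = df["drug_arrests"] / df["est_users"]
print(df[["neighborhood", "use_rate", "arrests_per_user"]].round(3))
```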
What do we think the models are going to do? So what we did, and by "we" here I mean my colleagues Kristian Lum and William Isaac, is re-implement one of the most popular predictive policing algorithms, and then we started using it to predict crimes based on this data. What I'm going to do is show you that model running in a little animation on the next slide. So here we go: we're running the animation, the little dots in the grid are drug arrests, the model is being trained, and right here, bang, we turn it on and we make predictions. And hello, the predictions are precisely in the same locations where the arrests were observed. Huh, who would have guessed? Well, anyone who knows anything about ML would have guessed that, right? So what happens if the underlying data is biased? What we do, of course, is recycle that bias. Biased data leads to biased predictions. And by biased I don't necessarily mean biased in a racial sense, though that is in fact what happens here, but biased in a specifically technical sense, because bias in the technical sense means that we are over-predicting one thing and under-predicting something else. In fact, what we're under-predicting here is white crime. We're under-predicting white crime. We're teaching the police dispatchers, through this ML, that they should go to the places they went before, which, again, is what ML does: it assumes that the future is like the past, conditional on the covariates.

I'm going to skip the next slide because time is short and say, okay, that's what happens when we just deploy the model and look at it. But when police look at the model, they react to it. They don't just look at it and say, that's cool; they actually deploy more resources based on that model. So what happens if more police go to the places the model tells them to go? Well, now we've got a feedback loop. We'll do the same thing: the little green dots are again training the model, and then, when we turn the model on, bang, the same predictions, but the consequence is that targeted policing overwhelmingly hits the minority neighborhoods. The neighborhoods that were already overpoliced become more overpoliced. So machine learning in this context does not simply recycle the racial disparities in policing; ML amplifies the racial disparities in policing, and this is catastrophic. Policing is already facing a crisis of legitimacy in the United States as a consequence of decades, or some might argue centuries, of unfair policing, and ML makes it worse. This is bad. This is bad.
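If you want to see that dynamic without any real data, here is a toy simulation of the feedback loop. It is not the algorithm we re-implemented, just the mechanism, with every number invented: true crime is uniform across cells, the historical records start out skewed toward a few cells, and each round the "model" sends patrols wherever the records are thickest.

```python
# Toy simulation of the predictive-policing feedback loop; all numbers invented.
import numpy as np

rng = np.random.default_rng(0)
n_cells = 20
true_crime_rate = np.full(n_cells, 5.0)   # actual crime is uniform across cells
records = rng.poisson(true_crime_rate)    # historical arrest records ...
records[:5] += 15                         # ... start out skewed toward a few cells

for _ in range(50):
    patrolled = np.argsort(records)[-5:]            # "model": patrol the top-5 cells
    is_patrolled = np.isin(np.arange(n_cells), patrolled)
    detect_prob = np.where(is_patrolled, 0.9, 0.2)  # patrols detect far more crime
    new_crime = rng.poisson(true_crime_rate)        # same crime everywhere, every round
    records += rng.binomial(new_crime, detect_prob) # records, not crime, accumulate

# The initially over-recorded cells end up dominating the records,
# even though the underlying crime never differed across cells.
print(records)
```

Nothing in the simulation changes the underlying crime, only where it gets recorded, which is exactly the distinction between police records and crime.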
So what I'd like to talk about is the difference, the human rights difference, between these two applications: finding hidden graves and predictive policing. In grave prediction, a false positive means that we waste some search resources; we go look someplace where there probably aren't graves. A false negative means that we fail to search someplace we should have gone. Neither of these is actually worse than the status quo: we're already not searching for very many graves, so this is a cost, but it's not a terrible cost. In predictive policing, a false positive means that a neighborhood can be systematically overpoliced, contributing to the perception of the citizens in that neighborhood that they're being harassed, and that erodes trust between the police and the community. Furthermore, a false negative means that police may fail to respond quickly to real crime. And as I mentioned earlier, if predictive policing reproduces prior patterns of police deployment, police may perceive the model as accurate, so we create a confirmation bias problem. The feedback then deepens existing biases in police deployment. So on my team, when we talk about predictive policing, we generally say that we're actually predicting policing: what we're really doing is just predicting where the police are going to go.

So look, why are ML models wrong? All ML models are a little bit wrong, right? If we don't get a tiny bit of error, we've overfit the model, we've done a bad job. That's just variance; that's not a huge problem, and we can deal with a little bit of random error. But bias is a bigger problem. If we have data that is unrepresentative of the population to which we intend to apply the model, the model is unlikely to be correct; it is likely to reproduce whatever that bias is on the input side. And in real-world social data, there's almost always a relationship between the observability of a phenomenon and the question we're trying to address. Can we observe crime? We want to know where it is, but the problem is that we don't observe it all, and our pattern of observation is systematically distorted. It's not simply that we under-observe crime; we under-observe some crime at a much greater rate than other crime, and in the United States in particular, that tends to be distributed by race. Biased models then result. And I want to make sure that none of us have the illusion that data visualizations or maps are anything other than statistical conclusions. They are entirely statistical conclusions, subject to all the concerns and biases that I'm raising here.

So what's the cost of being wrong? If there's anything you take away from my talk, I really hope it's this: reason about what happens if we build an ML model and we're wrong. Do we just waste some resources, or do we affect people's lives? Do we destroy people's lives? That's critical. And more to the point, who bears the cost of being wrong? Look, if we serve a customer an ad for sneakers when she was looking for boots, it's not such a big deal, right? First of all, the customer doesn't care; she just ignores it. And second, the person or the organization that bears the cost of that tiny error is probably the client, so we haven't foisted the cost of being wrong onto the customer. But if we make systematic errors in police deployment, it's the neighborhoods and the communities that bear that cost, not the machine learning vendor, and probably not the police. So we have a real problem in how the costs are distributed. It's very difficult to assess whether or not training data is correct, and so this problem can be very hard to detect.

I want to close with a story about how valuable it can be to get this right. This guy in the upper left, his name is Edgar Fernando García. He was a Guatemalan student and labor activist who, one day in February of 1984, left his home, excuse me, left his office, and he didn't turn up at home. Now, people who were activists in Guatemala in the early 1980s had a pretty good idea of what that might have meant. So his wife, Nineth, immediately, immediately that night, started going to police stations around the city, and she said, have you arrested my husband? Do you have my husband? She went to army bases: have you arrested my husband? Do you have him? She went to embassies. She talked to international human rights groups. There were international campaigns pressuring the Guatemalan government to give information about what had happened to Mr. García. The Guatemalan government was like, no, no, I don't know where he is.
Never seen him. I don't know. Maybe the left has killed him. As I said, grotesque lies. Well, in 2006 we discovered the archives of the National Police: 70 million pages of paper, four warehouses full of paper covered in rat feces, mold, and dead insects. We cleaned that all away. My colleagues and I conducted a rolling random sample, sampling topographically from this whole huge stack of paper, in order to statistically characterize the documents. The historians reading this material found some of those documents, and they found a police sweep in exactly the area of the city where Mr. García was disappeared, on the day he was disappeared. And they identified the officers who arrested him. Those two officers, in the upper right here, were brought to court, where they said, I'm sorry, judge, we were just following orders. And the judge said, thank you very much, that is not a defense. The way we usually say that in the United States is that they were just doing their job, but it's exactly the same phrase. You're guilty; you go to jail for 40 years. And today, please, Madam Prosecutor, if they were just following orders, could you go arrest their boss? That's this guy down here: Héctor Bol de la Cruz, who was the director of the National Police at the time Mr. García was disappeared. I was an expert witness in the trial against him, and in my evidence I presented statistical analysis showing that the documents used by the historians were statistically very similar to the other documents in the archive. These were not documents that had been cherry-picked; rather, they were completely consistent with the normal bureaucratic functioning of the police. Like every bureaucracy, orders are generated through strategies, which become plans, which become orders passing down the chain. People at the operational level receive the orders, they do what they're told, they go out, they come home, they write reports, and the reports go back up the chain. The documents were precisely of that kind. And Colonel Bol de la Cruz was convicted and sentenced to 40 years in prison.

After that trial, the infant girl in her mother's arms there, she's now a grown human rights lawyer in Guatemala. And here she is embracing her grandmother, Mr. García's mother. You can feel, looking at that photograph, the relief that a family member feels when finally they learn to speak of their loved ones in the past tense. That's what it means to do good human rights work. And that's why it's so critical that we get it right, and so critical that we avoid applications of our technology that make things worse. Thank you very much for your time. Thank you.