I introduce Whitney Merrill. She is an attorney in the U.S., and she just recently, actually last week, graduated with her CS Master's from Illinois. Without further ado, Predicting Crime in a Big Data World. Hi, everyone. Thank you so much for coming. I know it's been an exhausting Congress, so I appreciate you guys coming to hear me talk about big data and crime prediction. This is kind of a hobby of mine. In my last semester at Illinois, I decided to poke around in what's currently happening, how these algorithms are being used, and kind of figure out what kind of information can be gathered. So I have about 30 minutes with you guys. I'm going to do a broad overview of the types of programs: I'm going to talk about what predictive policing is, the data used, and similar systems in other areas where predictive algorithms are trying to better society. Then, on the uses in policing, I'm going to talk a little bit about their effectiveness and give you some final thoughts. So imagine, in the very near future, a police officer is walking down the street wearing a camera on her collar. In her ear is a feed of information about the people and cars she passes, alerting her to individuals and cars that might fit a particular crime or the profile of a criminal. Earlier in the day, she examined a map highlighting hotspots for crime. In the area she's been set to patrol, the predictive policing software indicates that there's an 82% chance of burglary at 2pm, and it's currently 2:10pm. As she passes one individual, her camera captures the individual's face and runs it through a coordinated police database. All of the police departments that use this database are sharing information. Facial recognition software indicates that the person is Bobby Burglar, who was previously convicted of burglary, was recently released, and is now currently on parole. The voice in her ear whispers: 50% likely to commit a crime. Can she stop and search him? Should she chat him up? See how he acts?
Does she need additional information to stop and detain him? And does it matter that he's carrying a large duffel bag? Did the algorithm take this into account, or did it just look at his face? What information was being collected at the time the algorithm chose to say 50%, to provide that final analysis? So another thought I'm going to have you guys think about as I go through this presentation is this quote, which is more favorable towards policing algorithms: as people become data points and probability scores, law enforcement officials and politicians alike can point and say technology is void of the racist profiling bias of humans. Is that true? Well, they probably will point and say that, but is it actually void of the racist profiling bias of humans? I'm going to talk about that as well. So, predictive policing explained: who and what? First of all, predictive policing actually isn't new. All we're doing is adding technology, doing better, faster aggregation of data. Analysts in police departments have been doing this by hand for decades. These techniques are used to create profiles that accurately match likely offenders with specific past crimes. So there's individual targeting, and then we have location-based targeting. With location-based targeting, the goal is to help police forces deploy their resources in a correct, efficient manner. The systems can be as simple as recommending that general crime may happen in a particular area, or as specific as saying what type of crime will happen in a one-block radius. They take into account the time of day, the recent data collected, and when in the year it's happening, as well as weather, et cetera. So another really quick thing worth going over, because not everyone is familiar with machine learning: this is a very basic breakdown of training an algorithm on a data set. You collect data from many different sources, you put it all together, you clean it up, and you split it into three sets: a training set, a validation set, and a test set.
The training set is what develops the rules that will determine the final outcome, the validation set is used to optimize it, and finally you apply the test set to establish a confidence level. You'll set a support level, where you say you need a certain amount of data to determine whether or not the algorithm has enough information to make a prediction. Rules with a low support level are less likely to be statistically significant. And the confidence level, in the end, is basically: if there's an 85% confidence level, that means there's an 85% chance that, for example, the suspect meeting the rule in question is engaged in criminal conduct. So what does this mean? Well, it encourages collection and hoarding of data about crimes and individuals, because you want as much information as possible so that you can detect even the less likely scenarios. Information sharing is also encouraged because it's easier: it's done by third parties, or even what are called fourth parties, and shared amongst departments. And here, you know, criminal data analysis was being done by analysts in police departments for decades, but the information sharing and the amount of information they could aggregate was just significantly more limited. So these predictive policing algorithms and software, what are they doing? Are they determining guilt and innocence? Unlike thoughtcrime, they're not saying this person is guilty, this person is innocent. It's creating a probability of whether the person has likely committed a crime or will likely commit a crime. And it can only say something about the future and the past. This here is a picture from one particular piece of software, provided by HunchLab. And patterns emerge here from past crimes that can profile criminal types and associations, detect crime patterns, et cetera.
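To make the train/validation/test split and the support and confidence levels just described concrete, here is a minimal sketch in Python. Every record, feature name, and threshold here is invented for illustration; this is not real policing data or any vendor's actual system.

```python
import random

# Hypothetical toy records: (features, committed_crime flag).
# All names and numbers are illustrative assumptions.
random.seed(0)
records = [({"prior_arrests": random.randint(0, 5),
             "in_hotspot": random.random() < 0.3},
            random.random() < 0.2)
           for _ in range(1000)]

# Split into training (60%), validation (20%), and test (20%) sets.
random.shuffle(records)
n = len(records)
train = records[:int(0.6 * n)]
validation = records[int(0.6 * n):int(0.8 * n)]
test = records[int(0.8 * n):]

def support_and_confidence(data, rule):
    """Support = fraction of records matching the rule;
    confidence = fraction of matching records with a positive label."""
    matching = [label for features, label in data if rule(features)]
    support = len(matching) / len(data)
    confidence = sum(matching) / len(matching) if matching else 0.0
    return support, confidence

# A hypothetical rule: "two or more prior arrests and in a hotspot".
rule = lambda f: f["prior_arrests"] >= 2 and f["in_hotspot"]
sup, conf = support_and_confidence(train, rule)
print(f"support={sup:.3f} confidence={conf:.3f}")
```

A rule matching only a handful of training records would have low support, which is exactly the "not statistically significant" case mentioned above.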
Generally, these types of algorithms are using unsupervised learning. That means someone's not going through and labeling everything true, false, good, bad. One, there's too much information, and two, they're trying to do clustering: determine the things that are similar. So really quickly, I'm also going to talk about the data that's used. There are several different types: personal characteristics, demographic information, activities of individuals, scientific data, et cetera. This comes from all sorts of sources. One that really shocked me, and I'll talk about it a little bit later, is that the radiation detectors on New York City police officers are constantly taking in data, and they're so sensitive they can detect if you've had a recent medical treatment that involved radiation. Facial recognition and biometrics clearly play a role here. And the third-party doctrine, which basically says in the United States that you have no reasonable expectation of privacy in data you share with third parties, facilitates easy collection for police officers and government officials, because they can go and ask for the information without any sort of warrant. For a really great overview, a friend of mine, Dia, did a talk here at CCC on the architecture of a street-level panopticon. It gives a really great overview of how this type of data is collected on the streets, and it's worth checking out, because I'm going to gloss over the types of data. There is, in the United States, what they call the Multistate Anti-Terrorism Information Exchange program, which uses everything from credit history to your concealed weapons permits, aircraft pilot licenses, fishing licenses, et cetera, all searchable and shared amongst police departments and government officials. And this is just more information: if they can collect it, they will aggregate it into a database. So what are the current uses?
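The clustering idea mentioned above, grouping similar incidents without anyone labeling them, can be sketched with a tiny k-means pass over synthetic incident coordinates. The two clusters, the choice of k, and all the numbers below are invented for illustration and have nothing to do with any real product.

```python
import math
import random

# Synthetic "crime incident" coordinates: two blobs around (2, 2) and (8, 8).
random.seed(1)
incidents = ([(random.gauss(2, 0.5), random.gauss(2, 0.5)) for _ in range(50)]
             + [(random.gauss(8, 0.5), random.gauss(8, 0.5)) for _ in range(50)])

def kmeans(points, k, iterations=20):
    """Minimal k-means: alternate assigning points to the nearest
    center and moving each center to the mean of its cluster."""
    centers = random.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        centers = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centers[i]  # keep an empty cluster's old center
            for i, cl in enumerate(clusters)
        ]
    return centers

hotspots = kmeans(incidents, k=2)
print(hotspots)
```

Run on this toy data, the two resulting centers land near the two blobs: a crude stand-in for the "hotspots" that the commercial systems report.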
There are many, many different companies currently making this software and marketing it to police departments. All of them are slightly different and have different features, but currently it's a competition to get clients: police departments, et cetera. The more police departments you have, the more data sharing you can sell, saying, oh, by enrolling you'll now have access to the data of X, Y, and Z police departments, et cetera. These here are Hitachi and HunchLab. They both do hotspot targeting, not individual targeting; individual targeting is a lot rarer, and it's actually being used in my hometown, which I'll talk about in a little bit. Here, the appropriate tactics are automatically displayed for officers when they're entering a mission area. So HunchLab will tell an officer: hey, you're entering an area where there's likely to be burglary, you should keep an eye out, be aware. And this is updating in real time, and they're hoping it mitigates crime. Here are two other ones. The Domain Awareness System was created in New York City after 9/11, in conjunction with Microsoft. New York City actually makes money selling it to other cities. CCTV camera feeds are collected, and if, say, there's a man wearing a red shirt, the software will look for people wearing red shirts and alert police to people who meet this description walking in public in New York City. The other one is by IBM, and there are quite a few; it's generally another hotspot-targeting system. Each has a few different features. Worth mentioning too is the heat list. This targeted individuals. I'm from the city of Chicago, I grew up in the city. When this came out about a year ago, there were 420 names on it, of individuals who are 500 times more likely than average to be involved in violence. Individual names, passed around to each police officer in Chicago. They considered the rap sheet, disturbance calls, social network, et cetera.
One of the main things they considered in placing mainly young black individuals on this list was known acquaintances and their arrest histories. So if kids or young teenagers went to school with several people in a gang, even if they may not be involved in a gang themselves, they're more likely to appear on the list. The list has been heavily criticized for being racist, and for not giving these children or young individuals on the list a chance to change their history, because it's being decided for them. They're being told: you are likely to be a criminal, and we're going to watch you. Officers in Chicago visited these individuals and would do a knock and announce, where they knock on the door and say, hi, I'm here, just checking up, what are you up to? You don't need any special suspicion to do that, but it's, you know, kind of a harassment that might feed back into the data collected. This is Precobs. It's currently used here in Hamburg. They actually went to Chicago and visited the Chicago Police Department to learn about predictive policing tactics in Chicago, to implement this in Germany, in Hamburg and Berlin. It's used to generally forecast repeat offenses. Again, when training on data sets, you need enough data points to predict crime. So crimes that happen very rarely are much harder to predict. Crimes that aren't reported are much harder to predict. So a lot of these pieces of software rely on algorithms that hope there is a consistent pattern, so that they can predict where and when and what type of crime will happen. The name Precobs is actually a play on the precogs from the movie Minority Report; if you're familiar with it, those are the three psychics who predict crimes before they happen. So there are other similar systems in the world that are being used to predict whether or not something will happen. The first one is disease and diagnosis.
It's been found that algorithms are actually better than doctors at predicting what disease an individual has. It's kind of shocking. The other is security clearance in the United States, which allows access to classified documents. There's no automatic access in the U.S., so every person who wants to see some sort of secret, cleared document must go through this process, which vets individuals. So it's an opt-in process. But here they're trying to predict who will disclose information, who will break the clearance system. Here, they're probably much more comfortable with a high error rate, because they have so many people competing for a particular job requiring clearance that if they're wrong, and somebody actually wouldn't have disclosed information, they don't care. They would rather eliminate that person than take the risk. So, I'm an attorney in the United States, so I have this urge to talk about U.S. law. It also seems to impact a lot of people internationally. Here we're talking about the targeting of individuals, not hotspots. Targeting of individuals is not as widespread currently; however, it's happening in Chicago, other cities are considering implementing programs, and there are grants right now to encourage police departments to figure out target lists. So, in the United States, suspicion is based on the totality of the circumstances. That's the whole picture. The police officer must look at the whole picture of what's happening before they can detain an individual. It's supposed to be a balanced assessment of relative weights, meaning, you know, if you know that the person is a pastor, maybe them pacing in front of a liquor store is not as suspicious as it would be for somebody who's been convicted of three burglaries. It has to be based on specific and articulable facts. And police officers can use experience and common sense to determine whether or not there's suspicion.
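The tolerance for false positives mentioned in the clearance example above runs into base-rate arithmetic: when the thing being predicted is rare, even a small false positive rate swamps the true hits. A back-of-the-envelope sketch, where every number is an invented assumption:

```python
# Hypothetical city-scale screening numbers, for illustration only.
population = 1_000_000
actual_offenders = 1_000          # assume 0.1% truly about to offend
false_positive_rate = 0.01        # a "great" 1% error rate
true_positive_rate = 0.90         # a generous detection rate

false_alarms = (population - actual_offenders) * false_positive_rate
true_hits = actual_offenders * true_positive_rate

# Precision: of everyone the system flags, what fraction is a real hit?
precision = true_hits / (true_hits + false_alarms)

print(f"innocent people flagged: {false_alarms:.0f}")
print(f"chance a flagged person is an actual offender: {precision:.1%}")
```

Under these assumptions, nearly 10,000 innocent people get flagged, and a flagged person is an actual offender well under 10% of the time: far below even the informal 30 to 45% reasonable suspicion figure discussed below.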
Large amounts of networked data generally can provide individualized suspicion. The principal components here are, you know, the events leading up to the stop and search, what the person is doing right before they're detained, as well as the use of historical facts known about that individual, the crime, the area in which it's happening, et cetera. So it can rely on both things. No court in the United States has really put a percentage on what probable cause and reasonable suspicion mean. Probable cause is what you need to get a warrant to search and seize an individual. Reasonable suspicion is needed to do a stop and frisk in the United States: to stop an individual and question them. And this is a little bit different from what they call consensual encounters, where a police officer goes up to you and chats you up; with reasonable suspicion, you're actually detained. But I had a law professor who basically said 30 to 45% seemed like a really good number, just to show how low it really is. You don't even need to be 50% sure that somebody has committed a crime. So officers can draw from their own experience to determine probable cause. And the UK has a similar reasonable suspicion standard, which depends on the circumstances of each case. I'm not as familiar with UK law, but I believe some of the analysis around reasonable suspicion is similar. So, is this like a black box? I threw this slide in for those who are interested in comparing this to U.S. law. A dog sniff in the United States falls under a particular set of legal history, which is: a dog can go up, sniff, alert, and that is completely okay, and the police officers can use that to detain and further search an individual. So is an algorithm similar to the dog, which is kind of a black box? Information goes in. It's processed. Information comes out and a prediction is made. Police rely on good faith and the totality of the circumstances to make their decisions.
So if they're relying on the algorithm, and believe in that situation that everything's okay, we might reach a level of reasonable suspicion where the officer can now pat down the person he's spotted on the street, or whom the algorithm has alerted on. So the big question is: could the officer consult predictive software absent any individual analysis? If it says 60% likely to commit a crime, as in my hypothetical, does that mean the officer can, without looking at anything else, detain that individual? And the answer is probably not. Predictive policing algorithms just cannot take in the totality of the circumstances. They have to be frequently updated, and there are things happening that the algorithm could not possibly have taken into account. The problem here is that the prediction itself becomes part of the totality of the circumstances, which I'm going to talk about a little bit more later. But officers have to have reasonable suspicion before the stop occurs. After-the-fact justification is not sufficient. So the algorithm can't just say 60% likely, you detain the individual, and then you figure out why you've detained the person. The suspicion has to exist before the detention actually happens. And the suspicion must relate to current criminal activity: the person must be doing something to indicate criminal activity. Just the fact that an algorithm says, based on these facts, 60%, without even articulating why it has chosen that, isn't enough. Maybe you can see a gun-shaped bulge in the pocket, et cetera. So, effectiveness: can the algorithms keep up with the totality of the circumstances? Generally, probably not. There's missing data, and they're not capable of processing this data in real time. The algorithm doesn't know, and the police officer probably doesn't know, all of the facts.
So the police officer can take the algorithm into consideration, but the problem here is: did the algorithm know that the individual was active in the community, or was a politician, or was a personal friend of the officer, et cetera? It can't just be relied upon. And what if the algorithm did take into account that the individual was a pastor? Now that information is counted twice, and the balancing of the totality of the circumstances is off. Humans here must be the final decider. What are the problems? Well, there's bad underlying data. There's no transparency into what kind of data is being used, how it was collected, how old it is, how often it's been updated, or whether it's been verified. There could just be noise in the training data. And honestly, the data is biased. It was collected by individuals, and in the United States, several studies have shown that young black individuals are stopped more often than whites. This is going to cause a collection bias that is drastically disproportionate to the makeup of the population of cities. And as more data is collected on minorities, refugees, and poorer neighborhoods, it's going to feed back in: the system will of course only have data on those groups, and the feedback will say more crime is likely to happen there, because that's where the data was collected. So what's an acceptable error rate? Well, it depends on the burden of proof. Harm is different for an opt-in system. What's my harm if I don't get clearance? Well, I don't get the job, but I opted in; I asked to be considered for employment. In the U.S., what's an error? If you search and find nothing, but you thought you had reasonable suspicion based on good faith, both in the algorithm and in what you witnessed, the United States says there's no Fourth Amendment violation even though nothing was found. And a very low false positive rate in big data and machine learning generally is considered great.
Like, 1% error is fantastic. But that's pretty large given the number of individuals stopped each day, or who might be subject to these algorithms. Because even though there are only about 400 individuals on the list in Chicago, those individuals have been listed basically as targets by the Chicago Police Department. Other problems include database errors. Exclusion of evidence in the United States only happens when there's gross negligence or systematic misconduct. That's very difficult to prove, especially when a lot of people view these algorithms as a black box: data goes in, predictions come out, everyone's happy. You rely on and trust the quality of IBM, HunchLab, et cetera, to provide good software. Finally, some more concerns I have include the feedback loop, auditing and access to data and algorithms, and the prediction thresholds. How certain must the prediction be before it's reported to the police that a person might commit a crime, or that a crime might happen in a particular area? Remember, reasonable suspicion may be as low as 35%, and reasonable suspicion in the United States has been held to include: that guy drives a car that drug dealers like to drive, and he's in the DEA database as a possible drug dealer. That was enough to stop and search him. So are there positives? Well, PredPol, which is one of the services that provides predictive policing software, says that since cities have implemented it there's been a drop in crime: in LA, a 13% reduction in crime in one division, and there was even one day where they had no crime reported; in Santa Cruz, a 25 to 29% reduction, 9% in assaults, et cetera. Now, these are police departments self-reporting their successes, reiterated by the people selling the software, so take it for what it is; but perhaps it is actually reducing crime. Well, it's kind of hard to tell, because there's a feedback loop. Do we know that crime is really being reduced? Will it affect the data that is collected in the future? It's really hard to know.
Because if you send police officers into a community, it's more likely that they're going to affect that community and the data collection. Will more crimes happen because people feel like the police are harassing them? It's very likely, and it's a problem here. So, some final thoughts. Predictive policing programs are not going anywhere; they're only just getting started. And I think that more analysis, more transparency, and more access to data needs to happen around these algorithms. There needs to be regulation. Currently, a very successful way in which these companies get data is to buy it from third-party sources and then sell it to police departments. So perhaps PredPol might get information from Google, Facebook, or social media accounts, aggregate the data themselves, and then turn around and sell it, or provide access, to police departments. And generally, the courts are going to have to begin to work out how to handle this type of data. There's no case law, at least in the United States, that really establishes how predictive algorithms fit into the suspicion analysis. So there really needs to be a lot more research and thought put into this. And one of the big things, in order for this to actually be useful, right, given that this is a tactic that's been used by police departments for decades: we need to eliminate the bias in the data sets, because right now all this is doing is facilitating and perpetuating the bias in the databases. And that's incredibly difficult. It's data collected by humans, and it causes an initial selection bias, which is going to have to stop for this to be successful. And perhaps these systems can cause implicit bias or confirmation bias, because police are going to believe what they've been told. So if a police officer goes on duty to an area and an algorithm says you're 70% likely to find a burglar in this area, are they going to find a burglar because they've been told they might find a burglar? And finally, the U.S. border.
There is no Fourth Amendment protection at the U.S. border; it's an exception to the warrant requirement. This means no suspicion is needed to conduct a search. So this data is going to feed into the way you're examined when you cross the border, and aggregate data can be used to refuse you entry into the United States, et cetera. And I think that's pretty much it, so I have a few minutes for questions. Thank you. Yeah. Thanks a lot for your talk. We have about four minutes left for questions, so please line up at the microphones, and remember to ask short and easy questions. Microphone number two, please. Okay, just a comment. If I wanted to run a crime organization, I would target Precobs here in Hamburg, maybe, so I can take the crime to the scenes where Precobs doesn't suspect it. Possibly. And I think this is a big problem with making data available, in that there's a good argument for police departments to say: we don't want to tell you what our policing tactics are, because it might move crime. Do we have questions from the Internet? Yes. Then please, one question from the Internet. Is there evidence that data like the use of encrypted messaging systems, encrypted email, VPNs, or Tor, with automated requests to the ISP to obtain real names, is collected and contributes to the scoring? I'm not sure if that's being taken into account by predictive policing algorithms or by the software being used. I know that police departments do take those things into consideration, and considering that in the United States the totality of the circumstances is how you evaluate suspicion, they are going to take all of those things into account; they actually kind of have to. Okay. Microphone number one, please. In your example, you mentioned disease tracking, for example, Google Flu Trends.
As a counterpart to predictive policing, are there any examples where, instead of increasing policing in the lives of communities, sociologists or social workers are called in to use predictive tools, instead of more criminalization? I'm not aware of police departments sending social workers instead of police officers, but it wouldn't surprise me, because algorithms are being used to predict suspected child abuse, and in the U.S. they'll send a social worker in that regard. So I would not be surprised if that's also being considered, since that's part of the resources. Okay. If you have a really short question, then microphone number two, please. Last question. Okay, thank you for the talk. This talk, as well as a few others, brought up the thought and the debate about the fine-tuning that is required between false positives and preventing crime or terror. Now, it's a different situation if the system is predicting somebody stealing a piece of paper from someone versus someone committing a terror attack, and the justification for preventing it at the expense of false positives is different in these cases. How do we make sure that the decision about this fine-tuning is not buried deep down in the algorithm, made by the programmers, but rather made by the customer: the police or the authorities? I can imagine the police officers are using common sense, and their knowledge about the situation, alongside what they're being told by the algorithm. You hope, and they probably will, that they're going to treat terrorism differently from a common burglary, or the stealing of a piece of paper, or a nonviolent crime. And that fine-tuning is probably done on a police-department-by-police-department basis. Thank you. This was Whitney Merrill. Give her a warm round of applause, please. Thank you.