Okay, so I'm going to talk about a new approach to early intervention systems. The center I work in, the Center for Data Science and Public Policy, is an interdisciplinary institute, as you might guess from the fact that I'm originally an astrophysicist: it has people from social science and people from data and tech, which is my side of things. The goal is to take the methods commonly used by, say, Google to predict which ads you should see, or by Amazon to predict which books you might like, and bring those more sophisticated data techniques into the public policy sphere. We've done a lot of work in the past on different kinds of early intervention systems. One example, in the city of Chicago, is predicting which homes will have both lead paint and children in them, so that inspectors can be deployed to those places before any children are poisoned. We also have projects in education and, most recently, in policing.

We got involved through the White House Police Data Initiative, which, as many of you know, has two parts: one is open data, and the other is developing, or improving, early intervention systems. To be clear, those are two separate programs. The data used for the work I'm talking about is not public; we have a data-sharing agreement with our partner, the Charlotte-Mecklenburg Police Department (CMPD). The goal is to show how police practices can be improved through the use of data: when the data is clean, in good condition, and available at the incident level, more sophisticated techniques can be used. These early intervention systems are meant to prevent adverse interactions, and we developed our definition of an adverse interaction in collaboration with CMPD, who told us which events they really want to prevent.
These are events that can be triggered by a citizen or an officer complaining, and there are also non-complaint incidents, such as use of force. All of those events go through CMPD's investigation process: when there's a use of force, it gets investigated, and Internal Affairs (IA) decides whether the force was justified, or whether a complaint is founded. We don't want to predict spurious complaints; we want to predict the complaints where there really was some underlying issue. Those are the events we want to predict.

CMPD already had an early intervention system, as many agencies do. It was based on a series of about ten thresholds that encapsulate officer intuition about what might be a sign of trouble, things like having a certain number of complaints within 180 days. There is some smart stuff in there: if an officer has had a lot of pursuits recently, or there have been a lot of injuries, it makes sense that that might be predictive of future problems. While this approach was certainly state of the art at the time, and a good version one, it has some problems. One is that a lot of officers tend to get flagged by these systems; as you saw on the prior slide, even sick leave and vacation time count toward the thresholds. Every time an officer is flagged, someone has to go into the system, check whether something is actually going on, and turn the flag off, and 40% of officers being flagged means hundreds and hundreds of officers in a one-year period. Another problem is that many officers who go on to have an adverse interaction with the public are not flagged by the system at all, and that's part of what we would like to prevent. A further issue is that these systems are inflexible.
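A threshold rule of that kind can be sketched in a few lines. This is a minimal illustration, not CMPD's actual implementation; the function name, the field layout, and the specific numbers are hypothetical, though "a certain number of complaints in 180 days" is the kind of rule described above.

```python
# Sketch of a single threshold-style EIS rule (hypothetical names/values):
# flag an officer who has 3 or more complaints within a 180-day window.
from datetime import date, timedelta

def flag_by_threshold(complaint_dates, as_of, window_days=180, threshold=3):
    """Return True if the officer has >= threshold complaints in the window."""
    window_start = as_of - timedelta(days=window_days)
    recent = [d for d in complaint_dates if window_start <= d <= as_of]
    return len(recent) >= threshold

# Example: three complaints inside the window trigger a flag.
history = [date(2015, 1, 10), date(2015, 3, 2), date(2015, 5, 20)]
flag_by_threshold(history, as_of=date(2015, 6, 1))  # True
flag_by_threshold(history, as_of=date(2016, 6, 1))  # False: window has moved on
```

A real threshold system is roughly a dozen rules of this shape, OR-ed together, which is exactly why so many officers end up flagged.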
Year-to-year conditions may change, and these thresholds may need to be adjusted to really find the people you want to target interventions to. But we know that at least one vendor we talked to hard-codes these thresholds and indicators directly into the system, which means that if you want to change them, it's really hard to do. The second issue is gaming, which is something CMPD told us they are concerned about: officers know about the thresholds, they know about the system, and they may be gaming it. If they know they have too many complaints in a given period, maybe they hold back a little so they won't get flagged, and stay under the radar that way. There's a constant tension in all of these systems between transparency, which is extremely important for a system like this, and the fact that the more transparent the system, the more gameable it is. That's constantly an issue.

Another feature of these older threshold systems is that they only flag "there might be a problem here" or "there probably isn't a problem here"; they don't produce a continuous risk score. In an ideal world, you would have every officer in the department, or every officer in a division, prioritized by risk, so you could say: take the top 10% and have them go for additional training or counseling. That's what the system we developed is able to do. The nice thing is that if there are additional training or counseling resources available in a given budget period, you can simply go further down the list.
So you can trade off between the true positives, the officers where there really is a problem, and the false positives. In any system there will be false positives; even if you make it very, very accurate, any individual flag could be an error, so there has to be some data due process there.

Our approach is to use the data science methods that are common in many other industries to predict future problems. We use historical data to build predictive models that estimate, for each officer, the probability of an adverse incident with the public over the next year. The way this works is that we start with human experts, in this case mostly police officers. We held two focus groups with officers at CMPD: the first was to tell them about the project and address privacy and security concerns with the data, and the second was to capture officer intuition. I'm not a police officer, so getting some idea of what police officers believe might be predictive of future problems is really useful, and we can build that into the system. The approach is that we develop these seed indicators, and then algorithms sort through the data to find which of those indicators are indeed predictive of future problems.

One of the reasons we partnered with CMPD is that they have quite clean data: it goes back over ten years, and they have incident-level data linked to individual officers. So we can look at an officer and see the arrests they've made and the field interviews (FIs) they're on, and link this across all the different data sets. The process of doing this, as you can imagine, was not trivial. We got all the data from the department, cleaned it all, and put it into a centralized database, and all of this was done with anonymized data.
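That anonymization step can be sketched as a keyed hash: names are replaced with opaque, stable pseudonyms before the data leaves the department, so the same officer can still be linked across data sets without the modeling side ever seeing a name. This is an illustration of the idea, not CMPD's actual mechanism; the key and names are invented.

```python
# Sketch of anonymization via keyed hashing (HMAC). The department keeps
# the key, so only it can map hashes back to names to target interventions.
import hashlib
import hmac

DEPARTMENT_KEY = b"kept-by-the-department"  # hypothetical; never shared

def anonymize(officer_name):
    """Map an officer name to a stable, non-reversible pseudonym."""
    digest = hmac.new(DEPARTMENT_KEY, officer_name.encode(), hashlib.sha256)
    return digest.hexdigest()[:8]

# The same name always yields the same hash, so records stay linkable
# across arrests, FIs, complaints, etc., without revealing identity.
anonymize("J. Smith") == anonymize("J. Smith")  # True
```

Using a secret key (rather than a plain hash) matters because officer rosters are small: an unkeyed hash of known names could be reversed by brute force.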
At no point did we know the names of police officers. When we came up with a risk score, we gave it back to the department in the form "officer hash ABF has a high risk score", and they could de-anonymize it on their side to target interventions. Again, all of this is closed data, except for some data we use from the US Census.

To build the system, we go back in time, since we have this nice long history. We pretend it's the end of 2009 and use only the data available up to that day. We train the algorithm on the seed indicators and see how well we predict the following year. Then we can step through time and see what the performance would have been if the system had existed at each point. We also have all of the historical flags from the existing system, so we can compare: how did we do against the system they already had? Did we do any better?

The preliminary result is that the system can reduce false positives by about 30%, which at CMPD corresponds to hundreds of police officers. If deployed, that means a lot of time saved: hundreds of cases where a supervisor would otherwise have to go into the system, look for a problem, and decide whether an intervention is necessary and, if so, which one. We're also able to increase accuracy, so we capture more of the officers where there really is a problem and additional counseling or training is warranted. And just to emphasize again: a system like this is not a disciplinary system, it's not a punitive system, and it shouldn't be treated as such. It's really there to find where the risks are, and to minimize them by making sure those people get the right training and counseling. The first question people have is: which of these seed indicators end up helping the predictive accuracy of the system?
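The "pretend it's the end of 2009" evaluation can be sketched as a temporal split: build features only from data before a cutoff, label each officer by whether an adverse incident occurs in the following year, and then step the cutoff forward. The data structures and field names here are hypothetical; the mechanism is the back-testing setup described above.

```python
# Sketch of the temporal train/test split used for back-testing.
# `incidents` maps an (anonymized) officer ID to adverse-incident dates.
from datetime import date, timedelta

def temporal_split(incidents, cutoff, horizon_days=365):
    """Split each officer's incidents into pre-cutoff history and a
    next-year outcome label, using only what was known at `cutoff`."""
    horizon_end = cutoff + timedelta(days=horizon_days)
    train, labels = {}, {}
    for officer, dates in incidents.items():
        train[officer] = [d for d in dates if d <= cutoff]           # features come from here
        labels[officer] = any(cutoff < d <= horizon_end for d in dates)  # outcome to predict
    return train, labels

incidents = {"officer_1": [date(2008, 5, 1), date(2010, 3, 2)],
             "officer_2": [date(2007, 7, 7)]}
train, labels = temporal_split(incidents, cutoff=date(2009, 12, 31))
# labels["officer_1"] is True: an incident falls within the following year.
# labels["officer_2"] is False: no incident after the cutoff.
```

Stepping the cutoff through 2010, 2011, and so on gives the year-by-year performance comparison against the department's historical flags.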
The first is pretty obvious, but it's something not all of the threshold systems include: any feature relating to prior history of adverse incidents. If there have been sustained complaints in the past, or use of force without justification in the past, and (when we went into the IA records) even comments noting tactics issues or communication concerns, those things are, as you can imagine, very useful in predicting future problems. We also found that responding to a lot of stressful incidents increases an officer's risk of going on to have an adverse incident. Specifically, officers going to a lot of suicide events or domestic-violence-related events, as well as events involving children, have an increased risk of adverse incidents. In that situation, you might imagine that some counseling, or some time out of that area, may be appropriate; devising the interventions is something CMPD will be doing. We've also found that some trainings appear to significantly decrease risk: officers who had had less-than-lethal weapons training, which for CMPD was Taser training, had a much lower risk.

There are three ways risk scores like these might be used. One is at the officer level, as I've been describing: we've shown we can significantly improve the accuracy of some of these first-generation EIS, and the next step is to work with additional agencies and to determine the most effective interventions. Another possibility is risk scoring on dispatch. In that case, the problem would be to predict the likelihood that a dispatch will result in an adverse incident, and you could imagine officer allocation being done based on those risk scores.
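Features like "a lot of suicide or domestic-violence dispatches recently" can be sketched as simple windowed counts per officer. The event-type codes and data layout here are hypothetical, but the categories are the ones named above.

```python
# Sketch of building "stressful-incident exposure" features for one officer:
# counts of suicide, domestic-violence, and child-involved dispatches in a
# trailing window. Event-type codes are hypothetical.
from collections import Counter
from datetime import date, timedelta

STRESSFUL = {"suicide", "domestic_violence", "child_involved"}

def exposure_features(dispatches, as_of, window_days=365):
    """Count an officer's recent dispatches by stressful event type."""
    start = as_of - timedelta(days=window_days)
    counts = Counter(kind for d, kind in dispatches
                     if start <= d <= as_of and kind in STRESSFUL)
    return {k: counts.get(k, 0) for k in sorted(STRESSFUL)}

dispatches = [(date(2015, 2, 1), "suicide"),
              (date(2015, 4, 9), "domestic_violence"),
              (date(2015, 4, 20), "domestic_violence"),
              (date(2013, 1, 1), "suicide"),          # outside the window
              (date(2015, 5, 5), "noise_complaint")]  # not a stressful type

exposure_features(dispatches, as_of=date(2015, 6, 1))
# {'child_involved': 0, 'domestic_violence': 2, 'suicide': 1}
```

Counts like these become columns in the training data, and the model learns how strongly each is associated with an adverse incident in the following year.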
That's another prediction problem we've done some initial work on. You could also do this at the group level and target interventions to groups of individuals. I hope this has been useful in showing what you can do once you have a lot of data; there's really a great deal that can be done when clean, centralized data is available for police oversight and monitoring. If you're an agency interested in developing or improving an EIS, you're welcome to contact me or my project manager. Thank you.