Hi, I'm Rajiv, one of the co-founders of Public Data Works, or PDW. We are an engineering and design studio focused on building tools that make data useful to the public. Sukari and I started out embedded inside the Invisible Institute, a journalism production company on the South Side of Chicago with a focus on police misconduct. That's where we really began our deep investigation into policing in the U.S. In 2015, we launched CPDP.co, the Citizens Police Data Project. Since its launch, CPDP has become a hub for community organizers, academic research, investigative journalism, and civil rights litigation. In fact, CPDP's data formed the initial basis for a Justice Department consent decree in Chicago. Next, I'll pass it to Ayub to give a little introduction to our partner on this project, the Innocence Project New Orleans.
Thank you. Hi, everyone. I'm Ayub Ibrahim, and I work for the Innocence Project New Orleans, or IPNO. IPNO is a law firm that specializes in freeing people who are on death row or serving life sentences but who are actually innocent. They also work to free people serving unjust sentences. Since 2001, IPNO has freed 43 people from life sentences and freed 16 people who were serving unjust sentences. We partnered with Public Data Works in November of 2019 to create the Louisiana Law Enforcement Accountability Database, which we'll touch on in a moment.
Today, we're hoping to cover three themes across our work, but we're especially looking forward to the Q&A. So if there's anything that we don't get into enough detail about, please do reach out to us during the Q&A, or find us outside after the talk. First, though, let me give a little contextual information. This work is in the U.S., and in the U.S. there are public records laws that govern, at the state level, what information government agencies need to make available to the public on request.
Unfortunately, those laws are not consistent from state to state, and many government agencies are not very good at responding to public records requests. Police departments are especially bad about it, probably because they don't like to be held accountable for excessive use of force, for the complaints they receive from the public, and for the investigations that they perform, or more likely fail to perform, into those complaints.
Today, we're talking about LLEAD, the Louisiana Law Enforcement Accountability Database. We've worked together with IPNO on this project to build a statewide database of police misconduct records. The data and documents are brought together in one place where they can be easily searched and browsed. You can see officer profiles, and you can dig into the details of individual complaints and the documents behind them when they're available. We've also worked on a variety of investigations, and one of the things we've really focused on for the front page is a migratory map that shows the transitions individual officers have made as they've quit one police department and joined another. This is a really problematic pattern that we've noticed in a lot of places: a police officer will sometimes commit an egregious crime, a complaint comes in, and while the investigation is underway, that officer decides to resign, and then the investigation never happens. As a result, they can leave, move to another city, and get another job as a police officer, still with a badge and a gun, and the pattern unfortunately persists. For that reason, one of the features we're really emphasizing on the front page is what we call a migratory map.
It shows individual officers moving from one police department to another, and where it's available, we include information about the circumstances under which they made that move. I'll also mention that when Sukari and I were first approached by IPNO to work on this project, we were really excited, because for us this is a big expansion of the work that we started in Chicago, where we were working with just one police department, the Chicago Police Department. With the Louisiana project, we're working across an entire state, with hundreds of police departments and other agencies, each of which, as Ayub will tell you, provides a different kind of data and a different kind of document, and has a different process for getting that information in the first place. More generally, because these are public servants, they should be accountable to the communities they're supposed to be serving, and that's ultimately what LLEAD seeks to accomplish: to increase transparency by enabling access to public law enforcement data, and to create a tool that can be used for accountability and justice. I'll pass it to Sukari to give a little more context on how we've approached the design process for this project.
Thank you. Hi everyone. My name is Sukari, and I focused on the design side of this project. As Rajiv mentioned, IPNO approached us with this idea, and we started out by building an internal tool for their wrongful conviction workflows. But we quickly began to expand the user base for LLEAD by building a coalition of local partners across the state of Louisiana who are best positioned to put this data to work. That included public defenders, civil rights attorneys, community members, organizers and activists, and investigative journalists.
We really sought to build the tool around the needs and use cases of these different audience groups, acknowledging that users in these groups have varying technical capacity and varying needs when it comes to putting this data to use. So we sought to go beyond simply publishing the raw data as CSVs or in tabular form, and to pull this data out of the raw documents and tables into scannable timelines and a searchable data tool that's useful on mobile as well as on desktop. We learned a lot when building our tool in Chicago, CPDP, another project that involved accommodating many different user types. But in transitioning from a city use case to a statewide use case, we had to adjust the requirements and engage with a lot of the opportunities and challenges that come from gathering, consolidating, processing, and publishing data at this larger scale. As Rajiv and Ayub will touch on, we received data in varying formats, types, and quality from the various departments. So on the design side, it was really important for us to think about how to accommodate and represent departments with data across a wide spectrum of completeness. Our design process involved a lot of sketching, wireframing, and iterative feedback with the users we were building for, including working closely with wrongful conviction attorneys and investigators directly at IPNO to build this around their use case as well. We don't have time to give a full, in-depth walkthrough of LLEAD today, but if you're interested, I hope you'll check it out on your own, or reach out to us and we'll be happy to walk you through it. There are a lot of different components, but I think we want to move on and focus a little more on the data collection and data processing side. So I'll pass it back to you all.
Cool. Thanks, Sukari.
So I'll briefly go through the data collection process. As you can imagine, it can get overwhelming working with over 500 agencies across the state. When I started this project, we had received cuts. In Louisiana, a public record is any document that's produced by a government agency. There are some minor privacy limitations, but on the whole, if a government agency produces a document, a video, or data of any type, it is considered a public record. One of the main hurdles in collecting data for this project was figuring out, one, which government official or department we need to contact in order to submit a request for this data, and two, what type of data actually exists and what types of documents they are producing. Going into this project, we were only aware of a very limited number of document types, mainly the documents that IPNO had collected in the past, which were useful to their wrongful conviction research. But because we wanted to design and create a database that was useful for the community as a whole and for different practitioners, we really needed to collect a vast range of data and document types. That leads me to the two different data types that we were focusing on: structured and unstructured. If you go to LLEAD right now, you'll see that most of the data that's actually in LLEAD we received in the form of a table or a form, so it was structured. A lot of the smaller agencies just don't have the ability to transform narrative-type documents into structured information. So one of the things that we're working on over the next year is transforming a lot of this unstructured data into structured information, so that journalists, lawyers within IPNO, and people in the larger community can use the data in LLEAD to efficiently understand how police misconduct is playing out within Louisiana. And that's a brief overview of the data collection side of things.
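As a rough illustration of what turning a narrative document into structured information can mean, here is a minimal sketch that pulls two fields out of free text with regular expressions. It is not the project's method, which the talk does not describe; the field names, the disposition vocabulary, and the month/day/year date format are all assumptions made for the example.

```python
import re

# Hypothetical patterns: a US-style M/D/YYYY date and a small, assumed
# vocabulary of complaint dispositions.
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")
DISPOSITION = re.compile(
    r"\b(not sustained|sustained|unfounded|exonerated)\b", re.IGNORECASE
)


def extract_fields(narrative):
    """Return a dict of structured fields found in a free-text narrative.

    Fields that cannot be found are set to None rather than guessed.
    """
    date = DATE.search(narrative)
    disposition = DISPOSITION.search(narrative)
    incident_date = None
    if date:
        month, day, year = date.groups()
        # Normalize to ISO 8601 (YYYY-MM-DD), assuming month comes first.
        incident_date = f"{year}-{int(month):02d}-{int(day):02d}"
    return {
        "incident_date": incident_date,
        "disposition": disposition.group(1).lower() if disposition else None,
    }
```

Real narratives vary far more than two patterns can capture, so a sketch like this would only be a starting point that leaves missing fields empty for human review.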
If you have more questions, please feel free to reach out to me afterward and we can discuss further. I'll pass it to Rajiv next.
I'm just going to give a quick overview of the actual pipeline that we use. This is not as comprehensive as I'd like it to be, but I'll start with the requirements. We have a lot of different types of data sets coming into this system. They come in a variety of formats, and they come in over time, so they need to be updated, and we need to be able to do that in a way that's non-destructive. So what happens is we ingest new data. Ayub will usually be responsible for producing the processing code for that data. It then automatically runs through a series of GitHub Actions that are triggered by committing the new processing code. We run the entire processing pipeline, create a preview table, do automatic validation on it, and upload it into a data version control system. At that point we can preview it, and if the validation passes, we upsert the updated or new records into the production database and refresh the search indexes. Now LLEAD is up to date. That was a really quick overview. There are some important parts of this that I wanted to get into in more detail, especially how we handle the fact that police officers move from department to department and appear in different data sets. We need to be able to match those records together into a single profile so we can show that the migration happened and create a consolidated profile of their history. But maybe we'll talk about that later, or you can reach out to us after the presentation and we can talk more about it then. With that, this is a quick overview of what the data structure actually looks like. Please reach out to us later if you'd like to learn more about how that's set up.
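The "validate, then upsert" step of the pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: the `officers` table, its three columns, and the function names are hypothetical, and SQLite stands in for whatever production database is really used. The `ON CONFLICT ... DO UPDATE` clause requires SQLite 3.24 or newer.

```python
import sqlite3

# Hypothetical schema: every record needs these fields to pass validation.
REQUIRED = ("uid", "agency", "last_name")


def validate(rows):
    """Return a list of problems found in a preview table; empty means pass."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        for field in REQUIRED:
            if not row.get(field):
                problems.append(f"row {i}: missing {field}")
        uid = row.get("uid")
        if uid and uid in seen:
            problems.append(f"row {i}: duplicate uid {uid}")
        seen.add(uid)
    return problems


def upsert(conn, rows):
    """Insert new records and update existing ones without deleting anything."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS officers ("
        "uid TEXT PRIMARY KEY, agency TEXT, last_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO officers (uid, agency, last_name) "
        "VALUES (:uid, :agency, :last_name) "
        "ON CONFLICT(uid) DO UPDATE SET "
        "agency = excluded.agency, last_name = excluded.last_name",
        rows,
    )
    conn.commit()


def publish(conn, rows):
    """Upsert only if validation passes; otherwise return the problems."""
    problems = validate(rows)
    if problems:
        return problems
    upsert(conn, rows)
    return []
```

Keying the upsert on a stable identifier is what makes a re-run non-destructive: records already in the table are updated in place, new ones are added, and nothing else is touched.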
And I'm going to pass it over to Ayub again to talk more about impact.
Awesome. Yeah, I'll briefly touch on impact. We launched LLEAD in October of last year. Since then, journalists have been using LLEAD to expose some of the police misconduct that happens specifically within the New Orleans area. In 2011, the federal government stepped in and basically imposed restrictions on how the New Orleans Police Department can act, introducing new policies that the department has to adopt. Over the past 10 years, there's been little understanding of how the New Orleans Police Department is actually progressing on these goals. With LLEAD, for the first time, journalists have a central repository of data on Louisiana police departments that they can use to track the progress of the New Orleans Police Department. And so, as you'll see, since early this year journalists have been using LLEAD to determine whether or not the New Orleans Police Department is making progress on the goals that they've set out for themselves, which are really imposed on them by the federal government. Can you please go to the next slide? One of the biggest projects in which LLEAD has been used so far is called the Police Sexual Violence in New Orleans project. It was released by the Umbrella Coalition, which is a group of local community organizers in New Orleans who are focused on police reform. What they found is that from 2014 through 2020, the New Orleans Police Department aggressively ignored a lot of claims of sexual violence perpetrated by its own officers. I can send a link to this report later on, and we can discuss it in further detail. But for now, can you please go to the next slide? Great. And so we're also working on, well, actually, I'm going to pass it to Rajiv.
Sure. So quickly I'll just mention we're working on three main investigations right now.
Ayub just mentioned police sexual violence, which is a really important theme and one that we're working on with a variety of collaborators. This work focuses on identifying individual officers and looking through their history to find especially egregious patterns of police sexual violence. That flows into our next topic, which is network investigations. We're looking at groups of officers who've been repeatedly co-accused on misconduct complaints and who've often covered for each other or been supervisors for each other in various contexts. We're using that to identify, one, those groups of officers, and two, the patterns that can help us seed new investigations and identify other officers who may be part of similar networks. Beyond networks and police sexual violence, the third major investigation for us is wrongful convictions. This flows naturally from our partners at the Innocence Project New Orleans, whose work is wrongful convictions. One of the reasons this project was so important to them is the potential to identify individual officers who've been repeatedly involved in, or are connected with, wrongful convictions one way or another. That often means framing people or planting evidence, in many cases resulting in people ending up on death row in Louisiana. So what we're helping them do now is identify some of the officers with these records who have been involved in cases of wrongful conviction, and understand the patterns associated with them: one, so there can be more proactive prevention, and two, so that IPNO can be more effective at screening new wrongful conviction cases that come in and investigating them further.
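To make the network investigation idea concrete, finding officers who are repeatedly co-accused comes down to counting how often each pair of officers appears on the same complaint. The sketch below is one simple way to do that, assuming a hypothetical mapping from complaint IDs to officer IDs; it is an illustration, not the analysis the team actually runs.

```python
from collections import Counter
from itertools import combinations


def co_accusal_counts(complaints):
    """Count how many complaints each pair of officers shares.

    `complaints` maps a complaint id to the officer ids named on it
    (a hypothetical input shape chosen for this example).
    """
    pairs = Counter()
    for officers in complaints.values():
        # Sort and de-duplicate so (a, b) and (b, a) count as the same pair.
        for a, b in combinations(sorted(set(officers)), 2):
            pairs[(a, b)] += 1
    return pairs


def repeated_pairs(complaints, min_shared=2):
    """Keep only pairs co-accused on at least `min_shared` complaints."""
    counts = co_accusal_counts(complaints)
    return {pair: n for pair, n in counts.items() if n >= min_shared}
```

Pairs that clear the threshold can be treated as edges of a graph, and larger groups of officers then show up as connected clusters in it.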
Beyond those three investigations, I'll mention that we're hoping to launch a new findings component of our site that covers these investigations, where we can make them available to the public and use them as a way to contextualize the data that we have. That's going to be launching very soon, but we're still gathering all of the data for it. I'm guessing that means that we're done. Hello. Okay. Well, thanks everyone. If you have any questions, we're here. We're also around over the next day or two, so please do reach out.
Thank you all for that great talk. We do have time for questions. I think we'll have a lot. All right.
Yeah. Thank you. First of all, great project. And of course, I'm not surprised; I'm a big fan of the Citizens Police Data Project. I'm from Albuquerque, New Mexico, which has one of the deadliest police forces in the United States right now, and we've been doing some work on a similar topic. With this kind of migratory issue, which is a similar problem in our area, we've had a big problem with the heterogeneous nature of the data between different departments in different jurisdictions. Do you have any suggestions or recommendations, from what you've learned, of policy changes that can help make this a bit easier? Because we've had this huge blocker in my community as well.
Yeah. That's a really good point around policy changes in particular. I can't say that we've had enormous success there. But one thing we've leaned on quite heavily is state-level data that's gathered by what's usually called POST, or P.O.S.T., in each state, which is an organization that's responsible for certifying individual officers and, unfortunately not very consistently, for decertifying officers. I wish that information were more reliable, but it's definitely been a useful tool for us because it operates at the state level.
In terms of legislation, I guess you could say that in Louisiana, the consent decree for New Orleans had some impact because it forced the New Orleans Police Department to be a lot more proactive, and that sort of sets an example for other police departments. But I can't say that we've been involved in any legislation that's had a meaningful impact on improving consistency. I'll just say it's a lot of work, unfortunately not work that I'm able to contribute to, but something that Ayub works on a lot. In terms of...
Normalizing data sets.
Oh, sure. Well, as Rajiv said, I think the most important part is to get a reference point. At the state level, we have the POST agency that Rajiv mentioned, and that serves as the reference point through which we can connect a lot of these data sets. We treat it as our gold standard. If you can find a gold standard within your state, and the same agency likely exists in New Mexico, I would just go from there.
Could you put up the links for both your organizations, just for future reference?
Yeah. That's a great idea. I'm going to come back to that and ask, how can people help you or how can they contact you?
Reach out to us. We're going to add a slide in a second that's got our website and email address. The simple answer is pdw.co, and from there you'll be able to link out to everything else. There's a contact button there that you can use to email us and our team, and we'll get back to you with, I'm sure, a very eager response so that we can follow up and have a further conversation soon. And thanks so much for the excellent question.
You mentioned the impact in the community. I was wondering, how has your effort been received by the police departments? Is this something desirable, or do they see it as too exposing or too revealing of their own internals?
I'll speak to our experience in Chicago, where the police department has been litigious about it.
They've made many loud complaints and said that their union in particular is going to sue us or do other not-nice things. But ultimately, those have been threats, and we've been able to work with really strong legal partners who cover us and who make sure that everything we do uses public information. So it's very unambiguous that all the records we make available to the public are public records, and if they have a problem with that, then that's really on the government agencies that produced those documents in the first place, or that failed to maintain them accurately, which is a legitimate problem in some cases. Do you have any comments about feedback in New Orleans or Louisiana?
I think it's a bit too early for us to say because we just launched in October, but for now there's been no negative reception, which is a great thing.
I will also mention we have a live chat on CPDP in particular, where people click on it and send messages. Sometimes they come straight to my phone. Some people are really grateful. There have been some police officers who say, "I genuinely want you to correct this mistake," or "You didn't talk enough about how many awards I got." And then there are police officers who are very angry and just say, "I declare that you are not allowed to do this, so stop." There are also members of the public who reach out to us, and we have to clarify that we're not a government agency, and then we usually point them toward the civilian oversight agency in Chicago that's responsible for taking complaints. But a lot of people come to the site just by searching for an officer's name, and this is one of the first sites that comes up. They're able to see that other people have filed complaints, and that's a really validating experience. It can really trigger the next step of thinking that it's worth pursuing a complaint further.
I've got a question, but I want to give y'all a chance.
Hi, thank you for your presentation. Do you plan on keeping this going and making freedom of information requests, or what is the next step?
Yeah, that's an excellent question and something that's on our minds all the time, honestly. As we mentioned earlier, one of the next steps for us is going deeper: doing more investigations into patterns in the information that we've received, and also extracting more information that we couldn't receive as tables because it's buried inside documents. That means a pretty deep investigation on our part into the methods we can use for extracting structured information out of raw or free-form text. That's a really challenging thing that we've been able to make a lot of progress on and that we're continuing to work really hard at. So going deeper is definitely one thing. I think your question was also asking a bit about updates, and that's definitely been a challenge. It's one that I think we've gotten a lot further on for Louisiana than we have in Chicago, to be honest.
In Chicago there's been a lot of back and forth, where the civilian oversight agencies changed their data systems over and over again, which has made data processing very challenging, and we also want to make sure we have consistency between the old data and the new data. I won't comment too much about Chicago right now, but in Louisiana we started this project with a really clear idea that no matter how the data comes to us, we want to make sure we can keep it up to date. We've put a lot of thought into how we've built the processing scripts so that we can run them on a more regularly recurring basis when we receive data in a similar format. And when we don't, we've built a framework that lets us load in different individual data sets and merge them all together in a way that's efficient, non-destructive, and de-duplicated. So it's a very hard problem, and I'm glad you asked about it. In Louisiana in particular, because that project started from the beginning with a variety of data sets from a variety of different sources, I think we've gotten a better start on a framework that lets us do that efficiently.
I'm going to ask a question: are you hosting this data publicly somewhere where other people can come in and build their own tools or analysis with it?
Yeah, I'm so glad you asked, and I regret not having mentioned this earlier. Yes, all of the raw underlying data, including the source material in almost 100% of cases, is available publicly. There are GitHub repos for both projects, the Chicago project and LLEAD. The LLEAD GitHub repo is a lot more advanced and a lot more comprehensive, and you'll be able to find links at llead.co or by emailing us through pdw.co. Those public GitHub repos have the entire data processing pipeline. It's really important to us to be transparent about that, because that's part of how we can be auditable, and a really important aspect of this project is making sure that people trust the information that we're making available: from the raw source material that comes from the government and the original request that helped us get it, all the way through to the output data that's available on the website.
Awesome, thank you so much. Can we get another round of applause?