Good afternoon to everyone there. My understanding is I've got about 10 minutes here. This is a very large project that could consume far more than 10 minutes, so I'm just going to touch on some key points of the research, and I'm very, very happy to discuss it further later with anyone interested. So this is a project that takes a step back and asks: are there particular barriers to innovations in peacekeeping and in studying political violence more generally? And it asks the question, how do we understand patterns of violent unrest around the world? One way that we do that is through the compilation of news reports, and so what this project really seeks to understand is the effect of particular news reporting patterns on how we then understand global unrest, whether that's political violence, protests, riots, and so on and so forth. This is a large project with a number of co-authors, including Michael Weintraub, who may be in the audience or at least is there with you all in Helsinki for the conference.
So, right, the starting point of the research is the same reason we're all assembled, which is that we care and we're concerned about political violence, understanding its causes and its consequences and how to mitigate it. And one way we've done that, as I mentioned, is to take news reports. We, meaning the broader academic and NGO community, take incidents of violence reported often in the news, not exclusively, but often in the news, and then compile those into very, very large data sets that we can then use as quantitative scholars to try to understand patterns of political violence, whether comparing across countries or within countries. So, there are a variety of efforts to do this.
There's the GED data set, the Georeferenced Event Dataset out of the Uppsala Conflict Data Program; PRIO has one such data set; the Global Terrorism Database out of the University of Maryland; ACLED, the Armed Conflict Location & Event Data Project; the Social Conflict Analysis Database, which previously was known as the Social Conflict in Africa Database; ICEWS, the Integrated Crisis Early Warning System, just to name a few. And again, these don't rely entirely on news reports, but the preponderance of the material, at least for many country cases, does come from news reports. And these news-report-based data are used to study a wide variety of political violence. I don't need to rattle these off, but everything from terrorist activity to how we understand electoral violence to violence against peacekeeping forces. Often it's news media that is the primary source of our understanding of political violence.
And who uses these data? Many of us in academia use these data, and have used them to publish in a number of leading economics, political science, and other journals. But it doesn't stop there, right? A variety of US and other government organizations use these data, and not only use them but indeed fund them. So a lot of funding for some of these organizations comes from different government partners, and a number of think tanks, NGOs, and international organizations use them as well. So there are implications beyond the academic inferences that we derive from these data. I think there are implications for government and other users of them as well, as members of the policy and programming communities attempt to craft solutions, again, to political violence. And so it raises this really fundamental question about media reports on conflict or other forms of unrest when aggregated.
Do they provide a generally accurate picture of political violence? What do we mean by accuracy? We could mean: is the place generally accurately depicted, the timing, the motivation, the target, the weaponry employed, the group implicated or affected, the outcome? Right, we can think about this across a wide variety of dimensions. And I'm going to just start with this quote from a current staff writer at the New York Times whom we interviewed for the project, and who basically said to us, there's a lot of violence that happens, and it can't all be written about, nor should it all be written about. I'm paraphrasing. We are not just a chronicle of all violent events that take place over the course of a day. So the starting point here is that if you talk to the news media, the general response is, no, we are not a chronicle of all events. And so if you aggregate their reports, that raises some questions about whether certain types of events are being systematically missed.
So what are the findings of our project, in brief? We estimate that the media-based data sets, at least those that we study in this project, significantly and systematically underreport particular types of violent events. And when we use them in what we call a reverse replication exercise, basically asking whether the results of a number of existing academic studies can be recovered using media-based data, we find that in the majority of cases they cannot. Clearly there are, I think, a number of reasons to be concerned about cross-country comparisons using media-based data. And this quote from one of our interviewees, I think, really brings this home. Right.
So this is an individual working for one of the leading wire services in the world, talking about working on Yemen for 18 months and during that period never being able to get a visa to even go to the country in the first place. And this was during the Saudi intervention, when a lot of violence was taking place. And he then contrasts that with Ukraine, right, where he says there are probably 3,000 foreign journalists there now: I could go tomorrow, get on a flight to Poland, and cross over the border relatively easily. So right away we can see that just the restrictive nature of government policies leads to situations in which some countries are going to get a lot of reporting and some are not.
But what about within-country comparisons? Right, if we say, well, let's hold the country fixed, let's only look at violence in Eritrea, let's only look at violence in the Democratic Republic of Congo, let's only look at violence in Afghanistan. Do these data sets do better? And here a former New York Times reporter, I think, gave a quote that really illustrates the concern. He worked in Baghdad, and he basically said to us that if one Iraqi were killed, it wasn't a story. If five were killed, it wasn't a story. If 10 were killed, maybe it was a story. If 20 or more were killed, that was a story. In contrast, if a single American soldier was killed, that was a story. So right away we see what my coauthors and I describe as an editorial bias: the tendency for news media to simply decide that certain stories get more coverage than others. Right. And so if certain stories are being systematically omitted, then that would be a source of bias. So that's what we call the class of editorial biases. We come up with a number of these, which I'm happy to go through later.
The other way we think about this is as a capability bias: simply, not all violence can be covered, because journalists face a number of restrictions on their reporting. So, for example, one interviewee talked about certain places in Colombia where their teams simply couldn't go because it was too unsafe to cover violence and other issues. Or, to go back to the previous interviewee, he talked about Mosul in particular really being unreachable during the Iraq War, and so you could imagine a limit on or a lack of coverage of that part of the country for lack of safety, and so on and so forth.
And so how are we going to carry out our study? What we're going to do is take the media-based data sets and compare them to different administrative data collected by different government forces in different campaigns. We're going to use data supplied by the US Defense Department, for basically 2016 and early 2017, on its air strikes against ISIS targets in Iraq and Syria. We're going to get violence recorded by US and partner forces in Afghanistan during the US invasion of Afghanistan, and similarly for Iraq. Then we're going to use data from the Philippines released by the Philippine military and police forces, roughly 25,000 observations, and then finally we're going to use some data released on protest and riot activity by the South African Police Service. Basically, the reason we're going to do this is that we first note that there are many ongoing collection efforts that are incredibly detailed. And so here we can see, for example, just from the SIGACTS data from Afghanistan, how detailed the data are for the Helmand region, a very violent area of Afghanistan during the war. It was so violent, and the data are so detailed, that you can actually see the outline of the road network here even though there's no road map superimposed in the background. So these are incredibly detailed data, but why trust them, right? Why trust the records?
And the trust comes from a couple of sources. First, in the case of the data from Iraq, Afghanistan, and the Philippines, and then Iraq and Syria for the campaign against ISIS, these were data constructed by militaries following very specific military protocols, using advanced technologies that allowed them to get a high degree of spatial and temporal accuracy. Second, they were collected with standard operating procedures that raise the confidence that events were reported regardless of when and where they took place. A big part of this is that these data were generated for internal consumption. So the data released by the US Defense Department, for example, were previously classified, meaning that they were constructed in such a way that they were purposely not released to the public. And then it was only through the declassification process that we were able to get these data. And one last point I'll make is with respect to the campaign against ISIS. Each of these bombings that we see in the data, you know, cost anywhere between tens and hundreds of thousands of dollars. And so a final assumption that we make, at least in comparing those data, is that these are incredibly expensive attacks, which I think decreases the likelihood that things are being either omitted or fabricated, right, because you'd have to figure out a way to do so in internal documents, in a way where you're omitting attacks that are costing the government hundreds of thousands of dollars in some cases.
Andrew, can you hear me okay? Can I ask you to wrap up?
Yep, totally. So what we're going to do is, as I mentioned, carry out a series of reverse replication exercises, where basically we're going to take studies published in the American Political Science Review and other studies.
And we're basically going to then look at the extent to which we are able to recover the results using these media-based data sets. We find that in roughly 70% of cases you can't. And then we're going to look for a series of specific sources of missingness. I'm happy to get into those in the question and answer session if you're interested, but I'll leave it there for the sake of time and for the next presentation. Thank you so much.
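The comparison at the heart of the study, media-based records set against administrative records for the same places and periods, can be illustrated with a toy calculation. This is only a minimal sketch under stated assumptions, not the project's actual method: the function name `coverage_by_cell`, the `district` and `month` fields, and the records themselves are all invented for illustration, and comparing counts per cell is a simplification of matching individual events.

```python
from collections import Counter


def coverage_by_cell(admin_events, media_events):
    """Return, for each (district, month) cell in the administrative data,
    the share of administratively recorded events that the media data
    could account for, based on counts alone (a simplifying assumption)."""
    admin_counts = Counter((e["district"], e["month"]) for e in admin_events)
    media_counts = Counter((e["district"], e["month"]) for e in media_events)

    coverage = {}
    for cell, n_admin in admin_counts.items():
        # Cap at the administrative count so coverage never exceeds 1.0.
        n_media = min(media_counts.get(cell, 0), n_admin)
        coverage[cell] = n_media / n_admin
    return coverage


# Invented toy records: four administrative events in district A, two in B,
# and a single media-reported event in district A.
admin = (
    [{"district": "A", "month": "2016-01"}] * 4
    + [{"district": "B", "month": "2016-01"}] * 2
)
media = [{"district": "A", "month": "2016-01"}]

rates = coverage_by_cell(admin, media)
for cell, rate in sorted(rates.items()):
    print(cell, rate)
```

In this toy example district B has no media coverage at all, which is the kind of systematic gap that a capability bias, such as journalists being unable to reach an area, would produce.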