Hi, this is Philip Cohen here with a short informal talk about COVID-19 preprints and the information ecosystem. You can see my contact information there. I'm happy to discuss this with you if you'd like to get in touch with me anytime. I like this cartoon because it depicts how we informally judge the seriousness of what's going on. If they're talking to a volcanologist on the news, that's pretty bad. If they're talking to an archaeologist, probably not too much to worry about. You can see that, as far as COVID-19 goes, the virologist is pretty far over there towards the volcanologist. So that's potential trouble, and that's what we're talking about today. Just a little bit of background on the information ecosystem, the analogy that we use from ecological studies to talk about how information works through society. We start at a foundational level with some key structures in the information landscape. Those are things like government agencies, news organizations, the publishers that publish scientific research. Those are the structures that information is built on. And then there's a system of access for how we're going to get those things. We use the internet, we use devices, we have subscriptions and there are paywalls, and of course it has to be in a language you speak and so on. Then there's a marketplace layer on top of that, marketized to varying degrees, where we generate the demand side. There are the information needs of people searching for information, people wanting information about things like public health, consumer information. There's the production and movement of information that goes on in that marketplace, where providers generate and disseminate information using their tools and platforms. And then there's the actual use of the information by the people who are on the receiving end.
And then it flows around in the system, through a network of influencers, and that's partly based on trust, on people's decisions at the individual or organizational level about who they're going to trust, what information they're going to use to serve their needs and to try to provide information to the people that might be relying on them. And finally there's the impact level. So altogether this is a large, complicated, dynamic system. It's hard to get a grasp on, but I wanted to give a sense of the layers that we're looking at in this system, and then I'll move on to talk about what's going on with the pandemic and science now. It's important to realize, when we talk about peer-reviewed science and the way we assume science works, that it has not always been this way. There's a famous interaction between Albert Einstein and the journal Physical Review, where he was extremely distressed to discover the editor of the journal had sent his paper to somebody else to review before deciding to publish it. It turned out this particular paper was wrong and had to be corrected, but Einstein was offended and said, on the basis of this incident, I prefer to take my paper elsewhere. He hadn't authorized it to be shown to anybody else. So that's 1936. Peer review, especially anonymous peer review, was completely offensive to Einstein. Of course, he's Einstein. But the general case is that before the middle of the 20th century, journal editors were the ones who decided what to publish, and they consulted with experts at their own discretion. That doesn't mean there was no gatekeeping, and it doesn't mean there was no authority and no expert review, but it was editors who were in charge of it and they decided who would review what. They were the gatekeepers.
Then after World War II, as there was the threat of the constriction of research funding, academics realized that they could lose control of this process if it was handled instead by government agencies and so on, without their direct influence. And that's really when we see the mystique of peer review come up, where academics insisted that only academics could decide what was trustworthy and valuable. That's when we got this idea of the anonymous peer review system, and especially the idea that this was essential for science to be considered trustworthy. A good example is the Watson and Crick paper on the double helix structure of DNA: not peer reviewed. The editor just decided to publish it. Okay, so let's look at the process of science working, and then we'll break out the communication aspect of that. We have this body of published results, and the scientists who are getting ready to do their research go through a search and discovery process where they find the relevant information from previous research in their area of interest. They develop their own research ideas, design projects, collect or gather the data that they need, analyze it, and then publish results again. What they publish then goes back into the pool of established research, and this cycle goes on. Now, if you think about how this works, it doesn't mean that everything published is always right forever. Things are superseded, things are added to, and the process is not really cyclical; it expands as it goes. Now, the peer review process as we know it in journal publishing happens at that point between the analysis of the data and the publication of the results. That's when we bring in these expert peers, sometimes anonymous. This effort is usually coordinated by publishers, especially journal publishers. And they're going by their own status criteria: who are these researchers? Are they reputable people? Where do they work?
And then they're also consulting their peer reviewers. The process is very much a black box. We don't know exactly how it works; it all happens in secret. Now, there are other places where evaluation goes on in the system. We don't think of it as peer review in that specific sense. But after results are published, they can be elevated. For example, they can be given awards. We have grant funding agencies that evaluate research ideas and decide whether or not to fund them. We have people who decide who's going to get jobs and who's going to have important jobs and so on. So there's a lot of peer evaluation that goes on in the system besides that moment of publication that we tend to focus on. Now, when we talk about open science, and we're moving into what's happening now with preprints, open science advocates have seized on a few opportunities in this cycle to open and accelerate the scientific process, to make it more efficient and more transparent. So for example, study designs can be peer reviewed in a process called study preregistration, where people can evaluate the design before they see the results and decide if this looks like a credible study. We have a lot of efforts going on to share materials and methods in repositories and so on that other people can use to enhance their own work. And then the publishing itself doesn't just happen in academic journals but can include preprints, which are more rapidly disseminated and which we'll talk about next, but also news media and social media and all the other ways that scientists disseminate their work. So let's talk about preprints. What are they? There are two main definitions of preprints that are relevant. One is a finished draft of scholarly work that is not yet peer reviewed. So think of the version of the work that scientists are sending off to a journal to be reviewed.
Another definition is the version after it has been reviewed and accepted but before it has been typeset and published by the journal. People sometimes share that version, and that can also be called a preprint. So preprints are essentially anything before the typeset version that comes from a journal, in some stage of development. You see some of the major preprint servers here. arXiv is the first big one; it comes from math and physics and started in the 1990s, and it now has well over a million papers. bioRxiv and then medRxiv came along in life sciences, biology, and medicine. And then in social sciences we have SocArXiv; I'm the director of that. There's PsyArXiv and a number of other discipline-specific archives that all work on similar principles of taking preprints, archiving them, and disseminating them in a more rapid, more open way than traditional journals. Now, in the current pandemic environment there's been an explosion of research, much of it in preprints. This paper by Fraser et al. finds that in the first four months, from January through April, 16,000 papers were published having to do with COVID-19. 6,000 of those are preprints. So that's a huge volume of work, pre-peer review, coming out and available to the public. They also did some follow-up and found that a lot of these studies are published quite quickly in journals with few substantive changes, indicating the preprints were in pretty good shape when they were posted. Some of these preprints have been shared thousands or tens of thousands of times and really helped shape the discourse on COVID-19. Some of them have turned out to be wrong, and there's been controversy with some of them, and that's been part of what's raised the issue of whether we should be reading preprints and talking about them and sharing them and so on.
Now, because we're dealing with medicine, people are very sensitive: you're going to give clinical advice, you're going to talk about specific medications and so on, and doctors might be looking at these. medRxiv and bioRxiv, which are run by the same organization, are putting this caution on their home page now: Caution, preprints are preliminary reports that are not certified. They should not be relied on to guide clinical practice and should not be reported as established information. But preprints are all over the news, and so that's happening. How is it happening? Well, journalists and news organizations are using their own indicators of reliability. Who are these researchers? Where do they work? We'll talk about the Columbia study. They might talk to other experts who are not involved in the research, or not directly involved, and say, what do you think? Is this reasonable? Should we write a story about this? They have their own assessment: science journalists and reporters and editors have their own expertise, and they can read and decide what to trust. And then there's the newsworthiness, spelled wrong. How important is it? Do we need to get this out there right now instead of waiting weeks or months for it to be published in a journal? So some of the things you've probably heard that are now established knowledge in the public sphere around COVID-19 have come from preprints. For example, the estimates that the number of deaths going on in the country far exceeds the number counted by the official statistics on COVID-19, indicating we're undercounting infection-related deaths. Probably page one of the Washington Post. Page one of the New York Times: if we had done the lockdown sooner, tens of thousands of lives would have been saved. That's also from a preprint published on medRxiv. So these things have been tremendously influential. Lots of people know about them.
Here's another one: men are doing more housework, say researchers. Well, this one is not on a preprint server; it's published by the Council on Contemporary Families, which does its own vetting of pre-published research. They publish it themselves, but not in a journal with a formal peer review process. Now, some of these preprints have been wrong, but then again, some of the published articles have been wrong. Here are some papers from major journals like the New England Journal of Medicine, The Lancet, and Annals of Internal Medicine, which turned out to be wrong and had to be retracted after getting a lot of attention. I also include here one bioRxiv preprint that was withdrawn after suggesting that the coronavirus was made by humans, because of a faulty analysis of the genetics of the virus. So there have been errors among preprints and among published papers, and we have different systems of trust and authority, and the whole thing is moving very rapidly. So what do we trust? What do we believe? We know this whole system runs on trust. Normally it's formalized, and we have authorities that are predetermined to be things we should trust: Science and Nature and other important journals, big universities, and so on. And so we use the status and the reputation and legitimacy of the people who did the study, the people who published the study, and the people who report on the study. There's a constant churning process of deciding who we're going to believe. Peer review is just one part of that, and this crisis has really shown that. The preprints that have come out have been tremendously successful at accelerating the pace of research and the efficiency of getting that knowledge out there. And that has really taxed the system of trust and made explicit part of what's usually implicit. One of the things that we now look at is transparency.
One of the ways that we evaluate whether or not we should believe a piece of research is whether they are sharing their data and code and so on. Have they made their process transparent? If they have, that indicates that they're more trustworthy. Now, I want to talk a little bit at the end here about what you can do if you are a researcher who works not just on COVID-related topics but on anything. One thing you can do is take steps to publicly endorse work and make sure you only share work in your networks that you think is reliable. So for example, there's the Plaudit tool, which I really like. It's a browser extension you can install (I use Chrome) where, if you have an ORCID ID, you can endorse any piece of scholarship. You can see here I have endorsed the Watson and Crick article. Also, if I'm going to share a paper on Twitter, for example, here's a paper I shared about dog adoptions during the pandemic, I'm giving it my endorsement. If you already trust me for other reasons, then my tweet might mean something to you. So if you're a researcher, I just urge you to spend some of your time doing the work of publicly endorsing things that you think are good. I also think it's important for us to try to understand this ecosystem that we work in, and then to make decisions about how we allocate our own labor: to support those organizations and institutions that support open science and support our values, to practice open science ourselves, to publish preprints when they're ready to help accelerate and transmit knowledge out to other researchers and the public, and to decide which journals we're going to edit, review, and write for. We should think of ourselves as part of this system and try to push the system in a positive direction while we do our work. Just a couple of final readings I can suggest for the future. For researchers in general, there's the Open Science by Design report from the National Academies.
For journalists, this is a nice resource on when and how to cover preprints. For people in social science, and sociology in particular, I've written this report, Scholarly Communication, that maps out this system more from the point of view of sociology. So that's a quick overview: preprints, COVID-19, the information ecosystem. I hope it was helpful. Bye.