Next up we have Reese Richardson, who's going to talk about journal hopping by research paper mills after a preferred journal is de-indexed.
Okay, hi everyone. It is wonderful to see everybody here today, and thank you very much to the organizers for the invitation and for putting this all together. My name is Reese Richardson. I am a PhD candidate at Northwestern University in the laboratory of Luís Amaral, and I'm here to talk to you today about paper mills and how they evade science integrity measures.
So there are three types, as we see it, of irreproducibility in science. There is what we've been talking about this whole conference: stuff like lack of data availability, incomplete methodology, bad documentation, overinterpretation of results, and causal claims. Generally, this category covers things that appear in legitimate research but aren't done in a deliberate attempt to mislead.
And then there's what we call small-scale fraud. This is when a single scientist or their laboratory decides: we're going to fudge the numbers in this paper, we're going to manipulate these images, we're going to cut up these Western blots like this cancer research group did in this figure that you see up here, we're going to pretend we had all these participants when we didn't, et cetera. This category covers deliberate attempts to mislead, but these papers generally enter the literature only one at a time from the same source.
And then there's what we call systematic fraud, where deliberately fraudulent research enters the literature hundreds or thousands of papers at a time. This type of fraud is driven by research paper mills, which we heard about earlier. These are organizations that write and sell fake manuscripts to their clientele and get them quickly published. Paper mills can offer a lot of different services.
They can sell manuscripts, they can sell fake data, they can sell individual authorships on pre-written manuscripts, they can sell undeclared editing services, and they can sell peer review, getting you through the peer review of the journal. Here's an ad that we found from a paper mill on Facebook. They're selling authorship slots on a set of publications that will go on to be published in a Scopus-indexed journal.
So to investigate this problem in a numerical way, we collected a corpus of suspected paper mill products from lists compiled by experts on paper mills, mostly science integrity researchers. I don't have time to get into exact sources, but around 17,000 publications are in this corpus, having been labeled by experts as having likely paper mill provenance. We compared this list of suspected paper mill products to the size of the scientific literature at large using OpenAlex. This graph shows the size of those corpora on a log scale by publication year. You can see that suspected paper mill products are multiplying at a much faster rate than science overall is growing, and notably, suspected paper mill products now outpace retractions. And remember, this is just detected paper mill products. The true number is likely much larger. There's no telling how many we're missing. We truly don't know what we don't know.
If we return to advertisements from paper mills, you'll notice a common theme in a lot of them, which is these aggregation services or indices, stuff like PubMed, Scopus, and Web of Science, places where people go to get their literature. The paper mills want you to know that they can get you published in one of these indices, and in other indices as well, like the Nature Index, the UGC-CARE list, et cetera. This is because for many paper mill clientele, indexing is very important. Often your employer requires it, and all sorts of metrics used in promotion decisions, like h-index and citation count, are obtained from these indices.
And sometimes employers will only count papers that you've published if they're in Scopus-indexed or Web of Science-indexed journals. So based on this information, one would assume that de-indexing, that is, removing journals that have been compromised by paper mills from these indexing services, would be an effective way to impede paper mill operations, to stop them from growing and spreading.
And de-indexing has been used to great effect before. I'll highlight the case of Oncotarget up here. This is the number of publications that appeared in Oncotarget each year. It was a very popular cancer journal. Around 2013 or 2014, paper mills started to invade. There are something like 100 suspected paper mill products and many more that we believe have just gone undetected. And concurrent with that, this journal exploded. It grew by a factor of 100 in the span of just a couple of years. So folks at Clarivate and the National Library of Medicine noticed these trends and noticed the paper mill activity. In 2017, Oncotarget was de-indexed by MEDLINE and PubMed, and in 2018 it was de-indexed by Web of Science. Immediately, the journal collapsed in size, taking the paper mills that had invaded with it.
Oncotarget is the sixth-largest journal that's been de-indexed by one of these three services. And if you look at the other members of the top 10, you see the same pattern: there is a rapid expansion, the journal is de-indexed, and then there's a mass exodus from that journal. After we observed this sequence of expansion, de-indexing, and then de-growth for a few different related journals, one right after the other, we thought: when paper mills suddenly invade a journal, they have to have come from somewhere. And when paper mills leave a journal, they have to go somewhere. The goal of a business is to continue to stay in business.
So this led us to the question: if a journal gets de-indexed, can the paper mills just continue operations somewhere else? We call this concept journal hopping, and we started looking for examples of it. We found lots of instances of rapid growth and rapid de-growth in journals that we suspect were the result of paper mill journal hopping. But we knew that the most compelling evidence that paper mills use journal hopping to evade science integrity measures would be seeing a paper mill doing it in real time.
We were lucky enough to find one on the regular internet, and unlike other paper mills, this paper mill lists the journals that it operates in on its website. It advertises which journals it can get you published in. And importantly, this website has been crawled by the Internet Archive 18 times since October 2020. And here's that crawl. It's a little bit hard to see, but what you should see here are all the dates of crawling on the bottom. Each of the lines in this plot is a journal, extending from the time it first appeared on the paper mill's website to when it last appeared on the paper mill's website. And the dashed lines indicate when it was de-indexed. None of these journals were de-indexed by Web of Science or PubMed. The only ones that have been de-indexed have been de-indexed by Scopus.
There are 143 journals here, and there is clear evidence of journal hopping. This list of journals rotates over time. I'll direct your attention over here to the top. All of these journals were on the website. All of them were listed as being indexed by Scopus. And then sometime in 2020 or 2021, a bunch of these journals got de-indexed. And then in April 2021, they got removed and replaced by new journals. The effect of that is a constantly rotating menu of journals, of more or less constant size, that you can pick from on their website.
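[Editorial sketch] The crawl analysis described above can be illustrated roughly as follows. This is a minimal sketch and not the speaker's actual code. It assumes each archived crawl has already been parsed into a set of advertised journal names. The journal names, crawl dates, de-indexing dates, and the helper names `listing_intervals` and `dropped_after_deindexing` are all hypothetical, chosen only for illustration.

```python
from datetime import date

# Hypothetical input: crawl date -> set of journals advertised at that crawl.
snapshots = {
    date(2020, 10, 1): {"Journal A", "Journal B"},
    date(2021, 1, 15): {"Journal A", "Journal B"},
    date(2021, 4, 10): {"Journal B", "Journal C"},
}

# Hypothetical input: journal -> date it was de-indexed.
deindexed = {"Journal A": date(2020, 12, 1)}


def listing_intervals(snapshots):
    """Return {journal: (first_seen, last_seen)} over all crawls."""
    intervals = {}
    for crawl_date in sorted(snapshots):
        for journal in snapshots[crawl_date]:
            # Keep the earliest sighting, update the latest one.
            first_seen = intervals.get(journal, (crawl_date, crawl_date))[0]
            intervals[journal] = (first_seen, crawl_date)
    return intervals


def dropped_after_deindexing(intervals, deindexed, final_crawl):
    """De-indexed journals that no longer appear in the final crawl."""
    return sorted(
        journal
        for journal, (_, last_seen) in intervals.items()
        if journal in deindexed and last_seen < final_crawl
    )


intervals = listing_intervals(snapshots)
final_crawl = max(snapshots)
dropped = dropped_after_deindexing(intervals, deindexed, final_crawl)

# Size of the advertised "menu" at each crawl.
menu_sizes = {d: len(journals) for d, journals in snapshots.items()}

for journal, (first_seen, last_seen) in sorted(intervals.items()):
    print(journal, first_seen, last_seen)
print("Dropped after de-indexing:", dropped)
print("Menu sizes:", menu_sizes)
```

One simplification to note: a journal that vanishes from an intermediate crawl and later returns is treated here as continuously listed, since only its first and last sightings are kept.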
One thing that makes journal hopping very easy for paper mills is that de-indexing is a very infrequent event. Here I have traces, in log scale, of all of the active journals in each of these three indices: Web of Science, Scopus, and PubMed. And then here we have all of the de-indexing events for each of these indices. So whereas each of these services has something on the order of 10,000 active journals and growing, the number of journals that get de-indexed is more or less constant at around 100 every year. Moreover, the number of journals with suspected paper mill products is much larger than the number that get de-indexed. And again, this is what we detect. Those detection rates are low and very heterogeneous.
So in summary, paper mills are businesses, and like all businesses, they want to stay in business and they want to grow. The measures that are commonly used to keep them contained are rate-limited and can be easily evaded. So if that's the case, we should expect that the paper mill problem will only continue to get worse. And because of that, we should critically reevaluate the existing tools that we have to fight paper mills and quickly develop new ones. We need to adapt faster than paper mills can. And with that, I'd like to thank you all for listening. My name is Reese, and you can find me afterward. Thank you to my lab members, our collaborators, and the folks that funded this research.
And we have about a minute 45 for questions.
Hey. Thank you so much, fascinating work. You mentioned that journals were de-indexed. One can view that as some kind of sanction. Have there been sanctions imposed on authors who were involved in those papers?
Very, very infrequently. There have been a lot of sanctions by the Chinese Ministry of Science and Technology very recently, due to the work of stewards of science who are oftentimes volunteers. But these events are very infrequent.
For instance, a lot of paper mills are coming from Pakistan. And there are very few sanctions or punitive measures at all placed on individual researchers in Pakistan and Iraq and Iran. So again, very infrequent, very heterogeneous.
Janager from Berlin. Just to add to your story, two days ago there was a paper posted on medRxiv by a German group; first author, Bernhard Sabel. There's a commentary, I think today, in Science about it, which delivers some numbers. And it seems that the problem you just exposed is much bigger than we think, on the order of 10 to 20% of the literature by now being fake, produced by paper mills. And a second comment would be that I'm not sure whether just exposing them really helps, because we will get into an arms race between technology that produces fake papers and technology that detects fake papers. So we'd rather, I think, go after the mechanisms. And I think the two main mechanisms, of course, are publish or perish on the author side, and on the publisher side it's the APCs. So we have to get at the business model of the APCs. That's what makes these mills run. So if we dry up the APC route, we dry up those paper mills.
15 seconds.
You're exactly right. Expository and punitive measures, we know they only do so much, and as I showed, they're rate-limited. We need to be working on the things that drive people towards paper mills, because as long as there are clients, there's money to be made. Thank you very much for listening.
Thank you.