All right. Thanks, everyone, for being here. Welcome to our event on open social science and public engagement. As I mentioned, it was organized by the New York City chapter of the Scholars Strategy Network. The Scholars Strategy Network connects scholars to policymakers, civic leaders, the media, and the public, and we put on trainings and workshops for researchers to increase the reach and the impact of our scholarly work. This is one such workshop. Welcome. SSN is organized into chapters. I think there are 37 or 38 chapters across the U.S. If you're not a member and you're interested, you can apply on our website. I'll put it in the chat again.

Our event today is going to help us understand how and why we want to participate in the movement for open science. How can we better communicate our research among the scholarly community, but also engage with a public audience? We are so excited to have with us Dr. Philip Cohen, who is a professor of sociology and a demographer at the University of Maryland, College Park. He's the founding director of SocArXiv, an open archive of the social sciences, and generally an advocate for open science in the research community. Dr. Cohen has a long and brilliant CV of publications, primarily about family and inequality, and the popular textbook some of you will know called The Family: Diversity, Inequality, and Social Change. And some of you may also remember Dr. Cohen's lawsuit against Trump for blocking him on Twitter. I wanted to make sure to mention that. And with this short introduction, I'll turn it over to Dr. Cohen.

Thank you very much, Sophia, and the SSN chapter, New York State chapter, for inviting me. It's great to have the chance to come talk, and also nice to have a local chapter but an audience from anywhere. So that's great. I'll start sharing slides, and so I probably won't be able to see your chats or hands raised. We're scheduled for 90 minutes.
I'm going to stop before the top of the hour, and we'll take questions and discussion. If you have questions as we go, it's fine with me if you shout them out, if they're clarifying or whatever. If you put them in the chat, I might not see them. But I'll do my best, and I'll leave time at the end, because I know people might have practical, technical, political, any kind of questions, and it's hard to anticipate. So I'll just try to make sure I leave time. Okay. Let's share this. How's that? Shared? Good. Thanks.

Okay. So that's me. I'm Philip Cohen. I'm a sociologist at the University of Maryland. There's my context. SocArXiv is an archive that I direct. It is now run out of the University of Maryland Libraries, which means I serve at the pleasure of the dean of the libraries, basically. And that means they pay for it, which is not that much. I don't mean to diminish their contribution, but it's a low-budget operation. And I'll talk more about that.

So if you're here, you're probably somewhere in this category: people who want to have some influence with your work as a social scientist, on policy and politics, or on other social scientists, or on the public, in the context of the enrollment crisis, budget problems, the crisis in democracy represented by that icon, et cetera. So that's what we're trying to handle. And my basic pitch on that is that the way for us as scholars to address these goals is to combine being public in our intellectual life and intellectual in our public life. I am looking for ways to integrate these two aspects in ways that strengthen them both. So that's my overarching mission in this work and in a talk like this.

From the point of view of social science, open science, we have to have a level of accountability in all of the communication that we do. And the key tool or principle for accountability of science and social science is openness. I'll talk about the parameters of that a little bit.
It doesn't mean everything is always open to everybody, but openness is a key aspect of our accountability and the trust that we ask people to put in us.

I have a few contextualizing slides on some of the challenges that we face. Our work takes a long time to come out. This is the average time from submission to publication in the journals of the American Sociological Association. So, up to 20 months. I ballpark this as the first-round review time, plus one revision review time, plus the production lag. If you take time in between to do the revisions, you have to add that time here. This assumes your revision is ready right away. So 20 months for Sociological Theory at the slowest, and around seven months at ASR, Soc of Ed, or Social Psychology Quarterly, and Socius, of course, is faster, but an average of nine months. So it's taking a long time, at the pace of the discourse, to get our work out in peer-reviewed journals.

When it does come out in peer-reviewed journals, we have the paywall problem, that is, who can see the work. That's an economic problem, and it's also a problem of just increasing friction. It's slowing everything down, so that when we share our work, not everybody can read it. Then we're sharing PDFs, then we're getting temporary logins, or we're paying extra to have an open access version, etc. You're familiar with this, probably. That's probably partly why you're here.

And I'm sorry, can I jump in for a second? Do you mind changing your slides to where we just see the slide and not the preview, so that it's easier for us to see? Sorry to interrupt.

No, no, that's okay. I thought I had done that already. So let me reshare. You're not supposed to see the preview. That's totally wrong.

Yeah, we saw the preview, but it's mostly about the size.

Is that better?

Yes. Thank you so much.

Okay, good. Thank you for pointing that out.
I can go back and edit in the introduction again for the recorded version.

Okay, so the paywall is a big problem of introducing friction, slowing everything down, and reducing equity and access. The potential for changing this system is impeded by the monopoly conditions that we're working under. These are just the 5,000 most recent sociology articles, last time I checked. 80% of them are from these five publishers. In sociology, Sage is the behemoth. Our association, the American Sociological Association, is pretty well captured by Sage and organizationally not interested in changing this beyond what might be necessary for tinkering with daily functions and PR. So we have a problem of trying to get structural change through under monopoly conditions, which some of you probably know much more about than I do.

So with that context, we have the intervention of preprints, or, as you'll see when we talk about SocArXiv, papers. But "preprints" is the category in the scholarly communication system, the reference term that people use. This is a way to address a lot of the problems that we have without solving them, that is, to work around the problems in a way that improves our work without having to wait for the system to be restructured.

So here are some of the major preprint servers. arXiv is where it started. It was mostly math and physics. Now it has a lot of other fields, especially computer science. bioRxiv came along after that. The X here is really a chi. It's Greek, and that's why we say "archive." bioRxiv, and then SocArXiv and PsyArXiv, came around 2016. So that's how long we've been in this business.

Preprints are finished drafts, but they're not peer reviewed. The papers on SocArXiv are not all preprints. Some of them are free versions or accepted versions of peer-reviewed papers. So some of them don't fit the definition of preprint used by everybody.
So I'll explain some of that technical bit as we go. But a preprint is like what we would historically call in social science a working paper. It's a formal scholarly output. It's permanent and citable. And it's a research event when it's published. It happens like publication in a journal and becomes part of the record.

Why are we doing this? This will be a recurring theme, but I'll outline three reasons: efficiency, engagement, and inclusivity. You've all probably experienced the frustration of doing work that you're excited by, that you think is excellent. Maybe you've presented it at a conference and gotten good feedback, and then it enters into this doldrums period where you're just waiting. And the frustrating thing about that is you're waiting for people who are having their arms twisted to read it, while you suspect, or you hope, that there are people out there who actually want to read it, and they have to wait. Okay. So time is a key element.

For access, it goes against our principles to have our work not accessible to so many people that we think it might matter to. So it's a matter of efficiency, but it's also a matter of principle that it should be available to more people. Partly that may be because the public supports our work in terms of grants or our salaries, and partly because we don't aspire to have our work hidden from the people it most affects.

And then a key element is also the connection among scholars and social scientists. Those things that slow down our work, that put up barriers to access, are also a drag on our ability to connect with other scholars. The engagement process that we seek is not just about broadcasting our work to the public, but also about building collaborative spaces with other scholars. And so reducing those barriers, that friction, is key to that.

So preprints have exploded in the last 20-some years. These are the major preprint servers.
I added SocArXiv and PsyArXiv, and you can also see how small they are in the grand scheme. But we're approaching 400,000 preprints per year. You can see arXiv is still the biggest of them. EPMC, Europe PMC, is the European agency that is collecting this data, so those are other things in their database. This is just to show you the wave that we're riding.

In terms of putting that wave in the context of the overall scholarly output, one way to do it is just to compare that number to the total number of articles appearing in Web of Science in all fields each year. What this is showing is that the number of preprints each year is about 14% of the number of articles in Web of Science each year. The denominator is a little bit messed up, so it doesn't mean that every one of these papers is counted there, but it's just to show you the relative size. So: increasing rapidly and becoming a large part of the research landscape.

I'll give some examples from the pandemic, because during the pandemic we've seen a rise in both productivity and visibility of preprints, especially in medical, health, and life sciences fields. So I'll give a few examples from that, even though that's not what most of us work in. This is from a paper by Fraser et al. that showed the preprint and journal article production on COVID-19-related topics in the first 10 months of the pandemic, just in 2020. The point of this was that a large fraction of the rapidly produced research came out in the form of preprints, mostly on medRxiv and bioRxiv. But you can also see the other ones that are included there in the chart. One of the things that this paper did, which was interesting, was that they followed some of these papers to see what they looked like when they were published in journals, and very often they were published with hardly any substantive change or no change in the main findings.
And so that was in a way reassuring, in that it implied the preprint system was accelerating the distribution of research without producing tons of junk that was wrong. However, of course, some of it is wrong, and so we can talk about that too. Some peer-reviewed work is wrong. Some preprints are wrong. Probably more preprints than peer-reviewed papers are wrong. So it's something that we can discuss and think about. But our system, that is, the whole ecosystem, adapted more rapidly than the journal publishing system did.

medRxiv really exploded starting in 2020. They have this caution on there that says these are preliminary reports that have not been certified, so you shouldn't report them in the news media as established information. However, they were reported in the news a lot. So was that bad? Well, yes and no. I think mostly not, because there was also great journalism happening. The good journalists out there were using other indicators of reliability for the work. Who are the researchers? They would interview other experts. You would see an article in the New York Times or the Washington Post about some preprint with three top experts interviewed. And that's essentially what we get out of peer review, but in the space of a few hours. Science reporters and many journalists also have skills and training themselves. They do some of their own assessment, and they assess whether things are newsworthy as well as whether they're reliable. So the system is not perfect, but the system worked around the slowness of the journals.

Some of the big stories of the pandemic, you may remember, came from preprints. As deaths soared, the excess deaths were far more than we were counting from the official count.
That was an early paper that appeared on medRxiv and was then reported in the Washington Post, and it introduced a lot of people to the concept of excess mortality and the problems of attributing cause of death during the pandemic. Another example of an early preprint that made a lot of news was the result that delaying lockdowns cost lives, comparing places that locked down early versus those that didn't, with echoes of the 1918 influenza pandemic. This was also a preprint. It was later, as in like six months later, published in a journal, updated somewhat, but the main finding held up even when it was published later in a prestigious journal. But it was in the news immediately, and that was because the reporters did the work. Another example, from closer to my work: housework in the pandemic, and so on. So these things were reported widely. What is this doing to our process? Do we like it or not like it, and should we do this ourselves?

So that's a little preamble. I'll talk a little bit about the theory of publishing and where preprints come into that. Okay, this slide is from ASAPbio, the nonprofit that was involved in the creation of bioRxiv and is very active in the preprint area. So, what we normally do, in what we would call the legacy publishing process, journals: you write your manuscript and you submit it to a journal, and then you wait. Then it's rejected, let's say. Maybe rejected, you send it to another journal. R&R, revise, wait, revise. Journal number three. Okay, check, check, check, those are the peer reviews. Finally, accepted. So months or years later you have a peer-reviewed paper. Okay. This process may be great for improving the work, making sure that it's reliable, and so on. In theory there's nothing wrong with having lots of time and energy directed to peer review. Peer review is great.
The point at which we publish it now, in this process, is when we establish things in the scholarly record: the record that this work was done by these people at this time, establishing precedence, whose idea was it, what was the innovation, what's the contribution, and time-stamping it. That's one of the functions of journals, and it happens after this whole process goes on. We also have the promotion or public function that journals play, which is announcing it, promoting it, sharing it with people, sending it to libraries, getting it in front of a lot of readers, with the information attached to it that it has been peer reviewed, that some experts think it's good. Okay. But the issue here is the months and years that it took to get to that point. We can break this up a little, but finally a journal has selected it, so this journal wants to share it. All of that happens after the months and years.

In the preprint workflow, the difference is that the manuscript, at the point it's ready to go to the journal, or whenever you have a complete draft that you're ready to share, instead goes into a very basic screening process. Not a peer review assessment of quality. I can talk about the moderation process at SocArXiv; it's very light. Basically, is it research, is it written by who the authors say they are, and so on. And then it's public. Okay, so what does that do? We still have the same functions that we need to do in the scholarly record. It's time-stamping, it's establishing precedence for the idea and the findings: these people did this at this time. It's also promoting the idea, bringing it before a wider audience and saying, here, we did this, please read this. And it's delaying the later functions: some experts have read it and think it's okay, that's peer review, and the
journal selecting it, saying we think it's important. So it's separating the functions of what journals now do, and taking part of what journals now do and moving it earlier in the process. Okay. Now, in theory, when this happens, that later work can be of higher quality, if all that feedback and deliberation somehow gets back into the journal workflow. But even if it doesn't, it's still moving these aspects of the process earlier in time.

All right. So what about peer review? One concern that a lot of people have about preprints is that they might be bad and they might be wrong. Is it problematic to do this? What happens to our trust? Can we trust it, or are we sacrificing our public trust by distributing work that's not yet peer reviewed? Well, I think it's worth pausing to look at the whole research life cycle when we ask this question. This is a diagram that the Center for Open Science uses, which is the nonprofit that runs the platform that SocArXiv is hosted on, and you can see the familiar outlines of the research process: you read the literature, you develop an idea, you design, collect data, analyze, publish, and then that publication goes back into the process. Okay. But this process includes a lot of opportunities for evaluation, and the journal peer review is just one moment in that. The work can be evaluated at different stages, and it is, in a couple of key ways. Funding agencies and conferences, in addition to journals, are other people or institutions that are evaluating work as it goes. And then also the individuals themselves: if you have a job, if you have tenure, if you're a graduate student, your status has already been adjudicated to some degree. You don't want to rest everything on that, you don't want to assume that someone from one school's work is better than
another school's or anything like that, but we do carry a reputation, which contains within it some assessment of our veracity as a scholar. So there are other ways besides just the article being reviewed. Okay, so that's just a comment on the principle of peer review, which is vitally important but doesn't just happen at that one moment.

When we look at this research life cycle, I'm giving this talk about preprints and SocArXiv and where to put papers, but this is part of an overall movement toward open science, which includes many moments or opportunities for the principles of open science to work. So I just wanted to highlight where we, in this talk, are in that wider process. Study pre-registration is when the research design is peer reviewed, or potentially reviewed, or at least time-stamped, before the data is collected. The materials and methods, the code and data, are shared in many ways. There's a new report out yesterday that NIH is going to start requiring a data management, data dissemination plan in their grants starting next year. So there are a number of places where this happens, and the publication process is just one. I want to highlight that.

Okay. So, SocArXiv. Well, it's hard to make a graph that looks like this very exciting; it's just a line going up, so I put that big explosion on the end. We just hit 10,000 papers since 2016. These counts are since 2017. It's harder than it looks to count these papers. But anyway, this is where we are: very small in the grand scheme of preprints, with 400,000 per year, and we're up to 10,000 altogether. But for any one of you who's considering submitting your work, this doesn't matter. What matters is that you're disseminating your work according to the kind of principles
that I'm talking about. So let's talk about what those are. Okay. Here's a preprint on SocArXiv. A paper, sorry, on SocArXiv; it was a preprint, it's not anymore. This is one of mine. It's called "The Coming Divorce Decline." You can see that what you're looking at is actually the Socius version. It was published in the open access journal Socius, so I'm free to distribute this copy of it however I like.

So what do we get here? SocArXiv is giving us a download count, the number of times the paper has been downloaded from this site. It has this annotation function, where if you click on those buttons you can add comments: in-text comments or general comments, reactions to the paper. We have this Plaudit endorsement tool, which is sort of like mobile, or individual, peer review. Anybody with an ORCID iD can endorse any work they want to by clicking on this button. And you can organize that; you can mobilize people to review things that are just papers sitting on SocArXiv. SocArXiv is on the Open Science Framework, which is the platform of the Center for Open Science, and so with every paper that you put up you have the opportunity to put up an endless supply of supplemental materials: data and code and other figures and all your appendices and all that. So that's an option that you have.

The digital object identifier, the DOI: there are two in this case. One is the one that SocArXiv gave this paper when I posted it, and then when the paper was subsequently published in Socius I added this peer-reviewed publication DOI. These two DOIs now being together, although there was a lag between them, means that on Google Scholar, for example, if somebody cited the working paper version, those citations will be counted together with the journal version. So putting these together, you know,
behind the scenes, that's linking these two papers.

We have a variety of license options. You can do Creative Commons, non-commercial, no-derivatives, etc., depending on how you want to license the paper that you post. Or if you put no license, in all these cases you still have the copyright. If you put no license, it's just whatever it says on the paper that determines who can share it and when. We can talk more about that if there are questions.

We do some simple taxonomy. We leave this up to the authors. When you upload your paper, the moderators just look at it to see if it seems plausible. We don't do a lot of enforcement on these, but we have categories from drop-down menus. In the sociology section we use the names of the ASA sections. In the other areas, we support all of the social sciences, and also education and law and arts and humanities. We're very broad.

And then there's the versioning, and this is very important for continuity in the scholarly record, and so that you're not serving up broken links to different versions in different places on the internet. You can see this is actually version five, and you can look at the old versions there. Anybody who comes to this page is always going to see the latest version first. The vast majority won't care about earlier versions, but they're under there. If you clicked on that and looked, you would see the very first version I submitted, in 2018, and then subsequently the later versions, all the way up to this version published in Socius. I could upload a later version still; the data on divorce in the American Community Survey comes out every year.

Okay, so that's the architecture, the anatomy, of a paper on SocArXiv. I want to give one other example, and this is from CUNY, of a way that people use SocArXiv. We have long had in social science a working paper series, where an
institution like the Stone Center on Socio-Economic Inequality, or lots of others; you probably are familiar with NBER working papers. The working paper series is a cherished institution in the social sciences, and it often has involved papers that are distributed in working paper format, from before the internet, and then subsequently revised, or not revised, and published in journals later. Now that we have this technology, we can do a better job of making this process work and provide a consistent contribution to the scholarly record.

Let me just give you this example. The Stone Center has this working paper series. Here's one of their papers, by Milanovic, "After the Financial Crisis." When you click on that link, it will take you to this page that is still on their site. So they put this up in July 2020. This paper was posted as a working paper. You can see they've now added a link to the published version, which was only 11, 10 months later, which is impressive, in the Review of Income and Wealth. So what happened was, in July 2020 they posted this paper on SocArXiv, but they maintain a list of them. So they're using us as a platform to distribute their papers.

This is what it looks like today on SocArXiv. You can see they put it up in July 2020. It's been downloaded about 800 times on SocArXiv. This version is under this Creative Commons non-commercial, no-derivatives license, so anybody can read and share this version. But the journal version is now the latest version. They didn't put the journal version up here; they probably don't have the permission to share the journal PDF. But this shows you that during that time between July and May, they got it out, hundreds of people downloaded the paper, and it went on to be published, and this still serves as a
free version that people can read.

Okay. Why not? It's a new thing for people, and people are concerned. Not everybody wants to do this for every paper at all times, so I'll talk a little bit about the issues. Some people are worried that a paper they have written is not good enough to share publicly. And it may not be, so that's an important question. So a question is, how do we handle being wrong, and what are the consequences of that? There's also the matter of managing your own dignity and respect and work, and trying to decide whether or not you think it's good enough. A lot of people think it's not good enough, or they think they need to have peer review before they feel permitted to share their work. If you look at the difference between peer-reviewed work and initial drafts, usually this concern is not borne out, but that's for you to consider. I think more people err on the side of being too cautious. That's my personal view.

Okay. A lot of people are afraid that if they post a preprint, they won't be able to publish it in a journal later. This is a vanishing problem. Most journals, all the ASA journals, almost all journals in social science, allow you to submit papers to their journal after they have already been shared online. If they won't do that, then their interest is really not the public interest, and you should publish in another journal. But, you know, people feel compelled to publish in certain journals. You can look up the policies for any journal conveniently on this database, or click around on the website of the journal. All the journals run by the giant publishers have policies on this now, default policies. So it's very rarely a problem that you won't be able to publish it in a journal.

This one is interesting. People are
afraid their ideas will be stolen if they publish them as preprints. I think this is mostly backwards. Mostly, you're protecting yourself by posting it as a preprint. You're posting it publicly, you're getting a time stamp, you can share it widely, you get a DOI for the paper. And if somebody steals your idea, then it's just as if they stole it from a paper published in a journal. They're just stealing an idea, and you have the receipts in the form of your preprint. Here's where I should add: I think the problem of stealing ideas probably happens more in people's inner circles, especially their mentors and advisors and committee members and journal peer reviewers. The people who are stealing ideas are usually not getting them from the public square. They have some private access to your ideas, and they're abusing that position of trust. So in a case like that, too, posting it publicly can really help you.

Okay. So when should you share it? If you're going to post a preprint, what is the time to do it? Well, SocArXiv is very flexible. The other preprint servers are less so, but we're very agnostic, very open. You can post a paper at any stage. So if you're feeling confident about it, or you want to get it going, you can post a paper as soon as you want some feedback, or if you want to find some collaborators and get some attention and bring it to the attention of people in your network. If you're a little bit more cautious, you can wait until you're submitting it to a conference. Then you're already preparing to show it to some strangers, some professionals and experts, but not waiting for it to be already peer reviewed. If you're more confident still, I mean less
confident still, you could wait until you're really ready to show it: the time at which you're sending it to a journal. You're ready to have it be formally evaluated, so then you can just widen the circle and have it be evaluated by more people. If you're most cautious, you can wait until it's already accepted, and then you can say, "I'm sharing this preprint." You see this all the time on social media: "I'm so happy I got my paper accepted. In a few months I'll share it with you," or, "Send me an email and I'll share it with you." People think that's a friendly, nice thing to say, but you actually get a lot fewer readers if you ask people to send you an email than if you just put up a link. So at the point at which it's peer reviewed and accepted, there's really no reason to wait anymore. It's been validated, it's already ready for the public, now you're just waiting for the journal, so you may as well just do it. Or at the very least, the very, very least, you can wait until it's literally published and then just use SocArXiv to share a free version at the very end.

Okay. I want to situate this question of preprints and posting papers on SocArXiv in the context of open research, open science in general. I'll make this general pitch that science, as you know, is an iterative process. It involves a lot of people. They all benefit from a collaborative environment where everybody has access to the published record. That's how science works. There's an openness implied in the way that science works in the publication process; it's just that it hasn't kept up with the technology. I do not mean to imply here that everything must be shared with everyone at all times. Sometimes you do need to keep things private. Sometimes you want to communicate just with your specific collaborators, etc. Sometimes you have, you know,
preliminary results that you're really not sure about and you don't want to share outside of a circle. That's totally fine. It's just a question of being purposive in how you do this. Sharing papers is part of this whole research life cycle approach that I mentioned. The question is how we can build our workflows around tools and practices that will maximize the kind of sharing that we want without creating huge burdens on us, either administrative burdens or issues of privacy and confidentiality and so on. There are just a lot of questions. We need to build tools to reduce friction, that's my word of the day: to make it easier and faster to get better results. A general principle in doing this is to try, without changing your whole workflow, to find a way to, essentially, you know, you have folders where all your work is, if you can sort of check a box when you're ready to share a particular element. That's what I mean by a modular approach with a sharing layer. So you're working, you're working, you're working, and when you get to a certain point you can say, okay, this piece is ready to share at this stage. I do this just with folders, where some folders are shared and some aren't. I work out of the unshared folders and I drop things in the shared folders whenever I think they're ready. So the way I have that set up, it's not a whole long process. It's literally just copying a file or dropping a file in a folder. This establishes, again, that timestamp, the precedence: I did this at this time. It enables people to get on and look at the work without going through a whole process of discussing, is it okay if I look, is this the version, is that the version, I thought you fixed it, all that. And then it's just sort of
increasing everybody's efficiency. Okay, that's the idea. In making our choices about these workflow questions, infrastructure is a huge problem. It goes back to that problem of the monopolies that I discussed earlier. We don't build the tools that we're using, by and large. People love Google: Google Drive, Google Scholar. Google is not really your friend. It's okay to use Google tools, because they're awesome, but, you know, they won't love you back in the end. Excuse me. So you can find the right tools for each part of your workflow. The Open Science Framework is a great platform for sharing research materials, and so that's what I use. If you use Zotero instead of one of the other citation management tools, it has a public sharing function. You put your papers on SocArXiv. We want to support the development of the tools and infrastructure that are consistent with our values whenever possible. Okay. I think I have a couple more points to make about our overall approach. This is great, I love being home to do this. Okay. Who is our audience? In the old days, people used to mail papers to other people. I'm old enough to remember this. And you would write on it, you know, this is a draft, please don't quote or cite this without permission. You really can't do this anymore. People still do it, but you can't. The fact is we're communicating with a lot of people indirectly all the time. So we have to just try to get on top of this process. Instead of restricting it to something that we're more comfortable with, we have to learn how to work with it, knowing that it will not be problem free. There will be problems, but they're usually not as bad as we're afraid they will be. The audience is less the people that you're speaking to and now much more a network. It's the people
that you are reaching right now and the people that they're reaching right now, and you don't know who they're going to be. So you want to control your work and its dissemination within limits, but you also want it to be ready to be viewed by people that are outside of your immediate circle. The people that we deal with are not the people that we used to deal with. There are journalists who really know about data, who do their own data analysis, sometimes with the same data. There are journalists who, like I said before, marshal peer review. You may have gotten a call from a journalist: I have a new paper I'm looking at, will you help me decide if it's okay? And then, if you have an interview with that journalist, you may end up in their story, you may not. That is not violating the principle of peer review. That's actually honoring the principle of peer review, if they do it well. We are increasingly subject to what I would just call a chaotic disciplinary mashup. If an economist shares one of your papers and you're a sociologist, you may all of a sudden have a bunch of people reading it who you did not expect to read it. That can be great. You want to make sure it's not terrible by taking whatever steps are necessary to anticipate that, but it's usually not as bad as you think, even if it's terrible. We do not get to choose whether we want to share things just with our friends and not with our enemies, for example. Enemies is a dramatic word, but you understand what I mean. It's very hard to share something and have it be completely insulated from going outside of your immediate friends. So we want to get on top of it instead of trying to restrict it, because that's a losing battle, and if we succeed, we don't really want to live in that world where so few people read our work. I have proposed this
sort of way of thinking about all your stuff together. I think this may be appropriate for something like the Scholars Strategy Network audience. The different ways that you disseminate your work, I say disseminate, but I want to come back to this: we don't want to just broadcast, we want to open communication. Peer review is vital for our work in general. It is a source of validation and legitimacy for us. However, it is not an efficient means of communicating. It just is too slow. So we want to work around the slowness of that while still having the advantage of it, and the principle of it is very important. The open science, open scholarship aspect of your work is very important. It's hard to empirically prove this, but I really, really believe it, so I'm interested in your thoughts on this: even if people do not read your data and code and validate everything you do, the fact that it's available says a lot. It says that you are open to being held accountable, that you are open to collaboration, that you're not trying to slow other people's work down, that you're not trying to own the work more than necessary just to do the work. So the openness, I think, is very important in this environment where there's a real challenge to trust in social science and science in general, and I think openness is part of the accountability that we need to communicate in order to earn that trust. And so I think it's important even if nobody ever replicates your work. You combine this with an overall communication strategy. You've got your home base, your website, we used to have a blog, where that's completely under your control. It's the message and the image that you want to portray. You communicate about your own work and other people's work, to widen the circle, on social media
that brings you into connection with other friends and colleagues, people in other disciplines, journalists, et cetera. And then, when you have this whole package together, I think it helps you communicate with news media in a way that is more respectful of their time and energy. It's less simply trying to get them to report on a piece of scholarship you did right now and more about building a relationship, more about helping them in the long run, in ways where you can help each other. It's about signaling your own accountability, et cetera. So I call this pentagulation, I have no idea why, but it just means that you have an overall strategy that has a lot of different access points. I do want to highlight the point I made about broadcasting. I think it does us a disservice, because it's not as good as it could be, when we say we want to use social media, we want to use these various platforms to get our work out there. We also want to get other people's work in here, and we want to hear from other people. So social media, if used well, is just as much about listening as it is about speaking, and the same goes for open science. We want to get our work out there, but we also want to open up the possibilities of hearing from other people in constructive ways. I saw a thread on Twitter yesterday, or the day before yesterday, where somebody said that their work was anonymously peer reviewed and then they saw it plagiarized in a paper later. Well, that's terrible. But what if the incentives were different and the communication structure was different? That person who stole your idea and put it in their paper, if you had connected with them earlier in a more transparent and collegial environment, you might have become collaborators. I mean, maybe that person is a terrible person, then it doesn't matter, but the point
is, it's a shame when the only exchange you have, the only opportunity to communicate you have with people, is an anonymous peer review, and you don't get the opportunity to become collaborators, or if you do, it might be years later. So we want to find ways to hear from other people as much as we want to find ways to make other people hear from us. Okay. I haven't seen your faces while I've been talking, so I don't know what questions you might have. So I'm going to stop and take questions that go in any direction that you have. Thank you. Great, thank you, Philip. Do you want to stop sharing your screen? I do, and now I'm going to. Good, thanks. Wow, that was really useful. I feel very inspired. But there are still some people here. Oh yeah, we have a nice group, and we have some hands up, so I'm happy to moderate. Let's see. I see Heath, go ahead. Hi, that was terrific. I learned so much. I'm not a sociologist, but I think everything that you shared about the discipline stretches to other social sciences, as you allude to frequently, so I take all of these points as really significant. The question I have is about audience. What I know about how journalists learn to do their job, and how editors learn to do their job, is a pretty strict adherence to: if it's peer reviewed, we can reference it in the story, and if it's not, the journalist is going to get dinged by the editor. The editor's going to say, but we've got this practice that we only reference peer-reviewed work. It's kind of a clear line, and students learn this in journalism school. I wonder if you could talk about how you have gone about trying to educate journalists and journalism schools about this changing landscape of publishing, which will cause them to either adjust or change, or make this kind of futile. So I want to even talk a little bit
about that. I don't think it's true. I mean, I think what you're describing is not reality anymore. I'm sure there are still some editors who have that view, and I know they recognize the difference between peer reviewed and not. But if you just look at the news about research, a lot of it is about work that's not peer reviewed. In all the major publications, they report on research that's not peer reviewed all the time. The question is what gives them the confidence in its veracity. We might or might not be satisfied with the way that they do that, but they go by things like: does the person work at Harvard? Is it somebody that we have trusted in the past? Do we have other experts that think it's reasonable? If you look at the other research that the news media reports on, it's think tank reports, which are not peer reviewed. It's their own data analysis. If you look at the explosion of journalistic data science at FiveThirtyEight and Vox and the New York Times, they collect their own surveys, they analyze public surveys. It's Pew. There's tons of research reported which is not peer reviewed. I think that distinction is just being left behind, for better or worse, but I just don't think it's reality right now. I think you're probably right about a small number of news outlets that have the capacity to do that. But my suspicion is that outside of Vox and the New York Times and data journalism, the old traditions hold, especially because your average journalist is under the age of 25, and has an editor under the age of 30, and is incredibly risk averse. That's outside of the people doing the data journalism, the people at a small number of outlets, the New York Times, the Washington Post, Vox, of course. But I think there would be a huge role to play in educating outside of the elite outlets that have the capacity and
training, and what you're talking about is people who are working right out of college, who have editors who don't have expertise and are risk averse, and there's, I think, a massive opportunity for educating those people in the exact ways that you're describing. So I think maybe both are true, and it's a huge chance, I think, for you to spread. Thank you. And it reminds me, I should share this: this is a journalist's resource about reporting on preprints through the coronavirus that I found. Maybe you've seen it, or maybe it's useful. I had looked at that a few months ago. I do think that the pandemic has changed this a lot, because people couldn't wait. But no, it's absolutely right, it's definitely an issue. And there's a business interest in maintaining this distinction also, because the publishers want to reinforce this distinction all the time. And so if you want their PR help, your university's PR operation might insist that things are peer reviewed, and so on. I really, really have nothing against peer review, and I don't want to diminish the importance of it. I hear what you're saying, and I agree it is something we really need to work on, but I think we're being helped in the environment right now by those leading journalism operations that you mentioned. Thank you. Okay. Some people are ducking out at noon, okay. Yeah, I just want to emphasize that's a great point that Heath is making, because so many of our chapters do work with local media, right, much smaller outlets, who could probably benefit from learning how all of this works. Alicia, you're next on my list. Thank you. Really wonderful talk, really interesting. My question has to do with, you sort of set up a typology, if you're more nervous or less nervous, about when to put things out there into the world, and you sort
of framed the most cautious option as once something's been accepted. But it seems to me that, from a journal's perspective, that might be the most violating of what they consider their right to publish the work. I collaborate with people who work in the European context, and there's just such buy-in to making everything open access, and it's such a de rigueur part of so much of their work. Here it's an economic privilege. I don't have the funding to pay for open access with most of the journals that require it. So I want everything to be open access, but if I'm publishing in journals that think they have to put something behind a paywall, won't they be unhappy if I'm putting something up as a preprint as it's being accepted? It's interesting you say unhappy. They don't love you, so we don't care if they're happy. The question is, I mean, is it considered a violation of the agreement? Right. So the question is, what is your author agreement? It's quite rare that your author agreement does not allow you to share a preprint, though there are some bad agreements. One way to look at this is, say you post the preprint before you ever submit it to a journal. So now the question is, are they willing to consider it even though it's already been posted? You don't get their permission at the beginning, because they don't have anything to do with it. So will they still consider it if there's already been a preprint? Almost universally, yes. It's hard to find a social science journal that will not. But that's the easy part. What about if you send it, you get revisions, you revise it, and now you want to share the improved version? Now they're starting to get antsy, because they're starting to feel like they have made a contribution to the value. So that's when you start to see the author agreements change a
little, and some will say, and this is a common one, like Sage and Wiley journals, their default policy is you can share the author accepted manuscript, that is, the final approved version before copy editing, on a nonprofit repository, a disciplinary repository. You'll see that is very common language. Some of them will say, but only 12 months later, or something like that, and that would be a worse agreement, because in the worst case, the revisions that you did for the journal corrected some important errors, and you don't want to keep circulating the wrong one. That's pretty uncommon, but it might happen, and so you really want to get that revised one out there. So you do want to know, going into your relationship with the journal, what you're going to end up with. I would check that database that I listed, it's called Sherpa Romeo. You can see the policy before you submit and see, am I going to get into a situation here? But I can tell you, thank you, Sophia, we have not received a request from a journal yet to take down a preprint. Under the Digital Millennium Copyright, whatever that thing is called, we have to take it down if we get an appropriate request, and we've never received one. And we don't really enforce. You have to click a box that says you have permission to share this, but we're not checking your author agreement for you. We're just letting people do it. If they post a journal PDF and it says, like, copyright Elsevier right on the front page, we reject it, or at least we say to them, it looks like you're sharing something you might not have permission to. They might have a license to share it; we just ask them to confirm. So the smart journals understand that preprint versions circulating help them more than they hurt them, so they're not really trying to get into this fight with you. Okay. They may want to, you
know, increasingly they are. So for example, in life sciences, if you go to bioRxiv, there's like 150 journals now that you can submit to right from there. You upload your paper to bioRxiv and you check a box and it submits it to PLOS ONE. So the journals want the preprint out there, generally, because the impact factor, which is how they measure how good they are as a journal, is based on citations in the first two years. So if they've got nine months of people reading it and getting ready, with their publications citing your paper in the pipeline, they get a much bigger citation bang right when it publishes. So it's becoming much more normal to submit like that. It's part of the workflow of journals that they understand there's a preprint, and they're having to build a business model around the value they can add on top of that. Anyway, I don't want to totally dismiss the concern, but if you look, I think you'll mostly find you're fine. Thank you so much. Okay, good to know. I'll call you if I get in trouble. Definitely, or before. I'm happy to talk about it. So yeah, no, that's great, thanks. Wonderful. I have to hop off, but thanks so much for your presentation. Thanks for coming. That's a great question, Alicia. Galleen? Thank you. That was part of my question that you just answered. Dr. Cohen, I'm a big fan on Twitter, so it's nice to hear from you in person. I'm a demographer at Columbia University, and I teach a course to master's and doctoral level students on demography, but really making it sort of applied and social justice oriented, and part of that is research dissemination. So my question is, what should my advice for students be, basically, at this time? They're trying to get publications, they're early in their careers, especially those who want academic-type careers. So should I be telling them to be sure to post preprints,
or, basically, what advice should I give them in that context? It's a good question. One thing I didn't say, sorry about my focus here, is that when you post a paper on SocArXiv, you can't take it down later. Unless that turns out to be a legal problem, we don't let you say, I decided this was bad and I want to take it down, because we have a commitment to the scholarly record. It happened, and you can't unring the bell. So there is an issue that I think students have to consider, which is, what if I post a paper and it's bad because I'm just learning how to do this? Of course you would never say it's bad, but you know what I mean: in 10 years you might not be that proud of it, let's just say, or that proud of that version. So you do want to be careful about it. You don't want to throw up every little thing that you work on just because you can, and maybe no one really wants to read it right now anyway. Okay. But I do think it's important to make it an explicit part of the workflow. So I would say, if you're going to present it at a conference, at that point you are prepared to make it public. Barring some unforeseen situation, you should make that paper public. I think if you're standing up and giving a research presentation for 15 minutes at PAA, there should be a paper behind you that you're willing to share. I think that's just important accountability. I also think that paper should have data and code behind it, but even if it doesn't have the data and code behind it, you should be prepared to say, you know, that graph you put up with the 100 coefficients that no one can read, they should be able to read that on their own time. Anyway, I do think that's an important part of training and socialization, to say, for accountability's sake,
at the point at which you're presenting it, you should do it. And I think the same goes for submitting to a journal. If you think it might be bad or embarrassing, or you're not sure if it serves your interest to share it, you're probably not ready to share it with a journal's peer reviewers either, and the reviewers might be the most important people in your field. So I guess I would say to identify those cut points for when to do it. But I definitely think it's an important part of the training process. I mean, everything I've said, I would love to have everybody be trained on. My second question, which was related to the response that you gave earlier: I've seen in a lot of author guidelines for journals now they're saying you're not allowed to post a preprint at the submission stage. So do I ignore that, or do you think that's real? Because on the author agreement side, I understand, whatever is there, but sometimes they're sneaking it into the author guidelines for submission, and so I'm wondering how accountable you are to that. I'd have to look at that. I mean, they shouldn't have two different documents that they're putting out that disagree with each other. Sometimes there is a default: the publishers, like Elsevier, Wiley, and Sage, have a default policy, and individual journals may have a different policy, so that could be what you're noticing. Or sometimes the journal guidelines may be out of date, but the author agreement actually is what governs. So I'm not sure. I would not recommend just blowing off, just ignoring, something like that. I would like to get it reconciled. I don't want anybody to end up getting boxed into something. Oh, this sometimes happens: because of blind peer review, some people are afraid that if there's a preprint version, the blinding will be messed up.
So an editor or an associate editor or an editorial assistant may Google the title, find the preprint, and say, you have to take this down because it'll mess up the blinding. Now you're in that awkward position where you don't want to annoy them and make them your enemy, but they're wrong. Not that it won't mess up the blindness, it may mess up the blindness, but they probably don't have a policy that you actually have to take it down at that point. It just makes them uncomfortable, or they're old, and so they think it's wrong. And it's delicate, because you don't want to piss them off, because at this point your career is in their hands. So that is awkward, and it does happen. But I think the best way, at that point, is you just try gently yet forcefully to explain that it's okay. Blinding is an issue. Blind peer review is not a sacred principle. It's a good idea, it has some benefits, it has some downsides. But you could have the title on a conference presentation, and that messes up your blindness also, or on your CV for work in progress. There's all kinds of ways that these things happen. Blindness, in some ways, would be ideal all the time, except the anonymity also protects evildoers among the reviewers. If they were being held accountable, it might be a fair trade-off, that's a different issue, to say we're going to identify authors, and then if people persecute them because they're from marginalized groups, or they're less famous or important, then we can hold those people accountable who did that. Anyway, good question, really good question. I'm also happy to discuss, if you want to show things like that to me. I love those kinds of investigations, obviously. Good question. Great, thanks. Yeah, thanks for that.
I was also thinking about what Philip said earlier about the junctures at which you're most likely to have your work stolen, and if you're a graduate student, if you're an advisor. I'm sorry, that's the most common thing. I'm sorry to interrupt. No, that's where I was going with that. Diana, my fellow co-chair of the chapter, had to go to another meeting, but she left me a question to ask. She's in public health, and she says it would be great if you could provide some sense of the percent of papers, of participation, from the public health and medical science disciplines. I don't know if you have a sense. It's interesting. There's a little bit of a distinction between public health and medical; anyway, they're different. I can't give a percentage. I know in the case of COVID-19 papers, we can say it was something like 30 percent were preprints. In 2020, you know, we have that one paper. But it's hard to say. If you look at what they call life sciences, I think that does not really include public health. But medRxiv does take public health. Like, I wrote this paper about high-heel shoe injuries, and I put that on medRxiv. That was fine, even though it wasn't medical, but it was public health. So I guess I don't know exactly what the difference is between medical and public health. It's something about the independent variable and the dependent variable: if it's a chemical, if it's a disease. No, but we do diseases. It's a good question. Great, thanks for that. Any other questions? I see Sarah just turned on her camera. Oh yeah, sorry, Sarah, go ahead, please. Oh, hi there, thank you. This is super informative. I have two logistical questions. One is the Hypothesis annotation that you mentioned, where you can write notes. Does that notify the author of people's notes, or is that just your own private
notes? It's funny that you say that. It's a little bit of a thorny point with Hypothesis. The author is not notified when you put a comment on their paper. I really wish they were, and actually I don't promote this feature that heavily, because I'm not sure people want to post their papers and have comments appear that they're not notified of. If somebody posts a whole thing about why your paper is wrong, I'd like to know right away. People don't use it very much for that. But Hypothesis is a great tool. It's layered on top of our platform, and it's a nonprofit operation. The thing about Hypothesis is they don't have a field in their database for author, basically, so they don't know who the author of a document is. In their database, it's associated with the site that it's on. You can use Hypothesis to comment on New York Times articles, or anything on the internet. As a user, you have the option of making the comments visible to the public, or visible to nobody but yourself, or you can create groups. It's a really nice thing. They use it for teaching a lot; Hypothesis has been really promoting this. You can create a group of eight people and all comment all over a paper, and the public can't see the comments, but you all can. I mean, you could run a journal on SocArXiv by asking people to write reviews using Hypothesis, and then when you get three positive reviews, you post a link on a list of papers and you say, this is my journal. That would be what we call an overlay journal, overlaying on top of the preprint server. There are these in math and physics on arXiv. There are some journals like that, where the papers are just sitting on arXiv and somebody puts up a list. It's sort
of like, here's my list of my favorite papers, but instead of just my favorite papers, it's my favorite papers that at least a certain number of people have said were good, and I've reviewed the reviews and I've decided it's good, and here's my really good list of papers as a journal. So it can be done with Hypothesis. Thanks. And the other logistical question is, I'm not familiar with licensing, so I don't know, is there like a standard one, or the most commonly used? Yeah, I actually have a slide about that, which I can share. You sometimes start losing people when you talk about this, so I'll be brief. CC0 is anybody can use it for anything. It puts no restrictions, essentially. Now, the key thing is, in academia, of course, that doesn't mean that you don't have to cite it or give credit. Those are ethical, not legal, obligations, so the ethics still apply. Just because it's CC0 doesn't mean I can put my name on it and submit it to a journal. That would be unethical, and that would be plagiarism. Okay, so that's CC0. CC BY is just attribution: anybody can share it, but they must link back. And then you get into the more restrictive ones, CC BY but with non-commercial or non-derivative. Non-derivative, that's ND, so you can't cut it up and modify it. Non-commercial is you can only use it for non-profit purposes. And share alike means if you're going to redistribute it, you have to redistribute it under the same conditions, with the same license, essentially. So we offer all of these, and there are some variations on them. CC BY is basically what I use for just about everything. So my blog, my talks, I do CC BY. You could say non-commercial. There's a little bit of a problem, they're hard to enforce, obviously. There are publishers who scrape up open access articles and publish books and say, like, here's an edited volume. There's
a publisher called Apple Academic. A lot of the books they publish are just collections of open access articles, and they sell them for money. The trick is they don't really sell a lot of them for a lot of money, and when they do, it doesn't really hurt the authors. So it's offensive, it violates our norms, but it doesn't really hurt you, so I don't mind. So I just use CC BY. If you said NC, then that publisher could not sell your paper, but I don't really care if they do, because the open access version is still open. So it's a shame if they're fooling people into paying for something that they shouldn't be paying for. They're bad people and they shouldn't do that, but it's not that bad, you know. It's not your job to stop that, in a way. So I recommend CC BY, long story short. Great, thank you. Any others? Do you want to stop sharing so we can see everyone? Yeah. It's okay? Good. All right, last call for questions. Galleen asks where population health work should be submitted. You mean... has she gone? Yeah, I think she means which of the archives. Yeah. SocArXiv is good for that. The other popular one, I guess, would be SSRN, the Elsevier preprint server. But I don't know, there's nothing else specifically population health oriented besides us. Great. Anyway, good. No questions from the audience? I have a question. Yes. So I'm a primarily qualitative researcher, and some of the advice that you gave us is about sharing your code, sharing a data set. There are a lot of conversations in qualitative methodologies about ethics around sharing data, particularly when working with vulnerable populations. So I wonder, can you share anything from qualitative folks who are really into using SocArXiv? How do they think about
it? What questions come up that we should think about if we're qualitative? Well, if you're just sharing papers, it's not that big an issue, right? It's more about research materials. There are whatever ethical issues there are with confidentiality and so on with papers. I mean, there is the issue that you might want to be more cautious about when to share, if you're not sure about how you're blinding things or whatever. But as far as the data and code, the research materials, there are a couple of different approaches. One sort of radical approach is to name people, right? And you may know that some people think you should do that, like journalists do: just ask people if they're willing to be named and name them. Okay, not just, but do that. That still might be different from putting up the full transcripts, right? You might name them but still edit, still limit what you reveal about them, or something like that. So that's one approach. The other is that the research materials are also your interview guide, your coding guide, your recruitment materials, the other stuff that does not include the confidential data. So all that can go into the open science category. And then, it's much more work, but you can also blind your transcripts and share them. And there's that archive at Syracuse that hosts tons of this stuff, and they can work with people on that. I forget what they're called. What I actually think we should maybe do, and it's not my area, but it occurs to me that for accountability and transparency, there should be a way for reviewers to have access to everything, but not the public, right? I think there should be an intermediate category where 10 people can read all your notes and transcripts, but under an agreement that they won't share it beyond that
or something like that, instead of just the book, where everything's already been cut down. I'd like to see something like that set up. I think that would be neat. We do that with confidential quantitative data to some degree, right? You get permission, you sign an agreement that you'll keep it in a cold room or whatever. So I think we could do that with transcripts and qualitative materials also, just for purposes of accountability, or also so people can find other things. I know there's a whole... there it is, the Qualitative Data Repository, Syracuse. Thank you. QDR. Thanks, Peter. There's the other issue with ethnographers, which is that just because you read my transcripts doesn't mean you can analyze my data, right? The data is me. I was there. It's the experience. I'm probably not representing this well, but you know what I mean: there's no such thing as reproducibility, because you couldn't have the interaction and the experience that I had as me at that moment. So reproducibility is not really the issue, but maybe something like accountability is, where you still might have lied in what you reported, or something like that. Or been wrong. Great, thanks. That helps me think about it. But yeah, I think that's pretty much all we have. Thank you so much for visiting us. Oh, my pleasure. Really fun. Yeah, and thanks everyone in the audience for coming and sticking to the end. We will share the recording so you can share with colleagues. And yeah, I'll just stand here in front of this truck. We're gonna talk all day, so