Just a quick walk through our table of contents. First, we're talking about what reproducibility is, followed by the crisis in reproducibility, some of the causes and drivers of that crisis, the consequences of it, and why it matters for you. I'm hoping that's becoming obvious; given the title of this session, it should not come as a surprise. And finally, time at the end for questions and answers. So there are plenty of options for interaction throughout, but there will be time at the end dedicated to questions and answers. If you get right to the end and think, wait, I don't understand this key concept, go ahead and share. And because you can do it through Mentimeter, it will be anonymous.

So what are we talking about? Let's dive right in. Broadly, reproducibility is about getting the same results as published work, or at least previous work. And exactly what counts as "the same" is tricky, and everybody will define it differently. Some people want exactly the absolute same, down to five digits past the decimal point. Other people are asking, do you show the same general trend or pattern in the data? Do someone else's results, with a completely new set of observations, concur with previous results? That kind of thing. But broadly the same. And it's an essential part of the scientific method. In fact, it's what puts the "peer" into empirical research. Joe, I didn't hear you laugh. I did laugh. Okay. Right. But really, the whole point of the scientific method is that you document your whole process and you put it out there so that other people can go, huh, I didn't know it would do that. And they can do it on their own, get the same results, and go, they weren't lying.

Now, that might broadly mean repeating an entire scientific work, from data collection right through to creating the graphs that go in the published piece. Alternatively, it might be narrowly recreating an analysis. So you might accept their data and do your own analysis with it, maybe changing a couple of things, and seeing whether that really matters to the end result or whether those changes are inconsequential. Reproducibility does come with a couple of terminological distinctions that some people make. Repeatability, or replicability, is broadly associated with the data collection and experimental steps, so the early stuff: how you got the data, what experiment you set up. Whereas reproducibility is often more closely tied to validation and the data analysis or interpretation of the results. But I'm going to use reproducibility to mean any or all of these parts, because they're not entirely separable and the distinction doesn't really matter for the concepts we're talking about today. You can shake your fist and tell me why I'm wrong later in the Q&A session.

Importantly, reproducible research is like a good recipe. It is not like the technical challenge in the Great British Bake Off. You can see in this image from the Great British Bake Off, they were all supposed to produce the same thing, but with a not very detailed recipe, and clearly they've all come up with slightly different things. They're all different sizes, some are darker, some are lighter, some are round, some are oval. That variability is fundamentally not what reproducible research is looking for.
So a good reproducible research recipe should include things like: the data, the details of how you got it, how representative it was, things like that. The tools you used, including the materials, the software, the packages. All of the decisions you made after you got the data: how did you clean it, how did you process it, recode it, all of that kind of thing. Your results, in as clear and objective a way as possible. So not the interpretation, that will be very hard to reproduce and people just have to believe you on that, but your results: the descriptive statistics, the statistical models you got, things like that. And access, if it is at all possible, to the data itself in both final and raw versions, and any code or notebooks or recordings, transcripts, all of these kinds of things. It won't always be possible, because of confidentiality reasons and disclosure control, but if it is possible, please share.

So before we go anywhere else, I want to ask people: have you tried to reproduce scientific research? And these are sliders, so you can put them anywhere from "it was a complete failure" to "it was a complete success", for reproducing the work of others, reproducing your own work, or never even tried. And I don't know where on that slider you're going to put "never even tried"; that is, I guess, a complete success. But yeah, lots of people have not even tried. This is encouraging. It's also interesting that some people, in efforts to reproduce their own work, have not had resounding success. That is a useful thing to point out, and it is in fact one of the reasons that reproducibility is so important. So yeah, we've got some good answers here, people participating. Some people definitely have tried to reproduce the scientific research of others. And I find that interesting. I don't know that I've ever really tried to reproduce the scientific research of others. I've definitely taken their methods and applied them to my own research, but I didn't test them on their own research question, their own data, to check the method, I guess. Just kind of trusting that they know what they're on about. And maybe that isn't very useful.

All right, so moving ahead, let's take a moment for Q&A. You can do this through Mentimeter, or Joe can enter the questions you've asked on other platforms into Mentimeter, and they will come up. Everyone will be able to see them, but they will be anonymous questions, and I will try to answer them. So maybe you're unclear on the terminology, or maybe you have a question like: reproducing your own research, what does that even mean? It's my research, of course I can reproduce it. I don't know. Go ahead and ask questions.

No one has any questions yet. Ah, here's one: how useful do you think conceptual replications are? That's a tricky one, conceptual replications. I assume this means where, rather than trying to reproduce the exact results (with observational studies, you're never going to get those same people to walk past you so you can observe them again), you try to observe another set of people and produce broadly the same results. I think that is very important. And it is also very important to be as clear as possible in the original about how representative your data collection was, or if nothing else, what decisions you made about your data collection. No one will be able to conceptually replicate your findings if they don't understand what it is or what it's about.
Without any hope of replication, you're just hoping people trust you, and it would be nice if we all trusted each other, but I think we have reason not to, all the time. Oh, hang on, I don't know how to see these other questions. Oh, there we go, Mark has asked: why do you think reproducing the work of others is more successful than reproducing my own? I suspect that's because when we reproduce our own research, we kind of trust our memory, whereas when we're doing it from something else, we're really explicitly following an instruction that has been published. I could be wrong. It could also be that things that matter, you never thought to write down, and so you don't include them in your reproduction. I don't know, I'd be interested if anyone else has thoughts on that, and you can enter them into this Q&A as well.

When we're speaking about scientific research, are you also referring to social science? Absolutely, all scientific research should be replicable to some degree. Social science is more likely to do things like surveys or questionnaires or semi-structured interviews or observational studies, things that cannot be replicated in the same way as, say, chemical studies where you add a certain amount of chemicals together in certain lab conditions and observe the result. They are different kinds of reproducibility, but they're both very important.

Actually, this is a very prescient question that comes up later in the session: where would you recommend we publish the recipe? Is it something you would add as supplementary material to a publication, et cetera? In theory, the recipe should be the methods section of your paper, though not all journals require a good methods section. And supplementary material, absolutely I recommend. That's where I would put your raw data if you're allowed to share that, your finished data if you're allowed to share that, any code you used to move from raw to finished data, any software details, absolutely. I am a big fan of putting Jupyter notebooks and data files and things like that on GitHub repos. Other repos are available, and Google Drive in theory, Dropbox, things like that. You can make these publicly accessible so that people can just have a nosey and see what you did. You can be as explicit as possible with no word limits, so that can really encourage you to be clear.

That looks like the end of the questions. Oh, here comes another one. If it applies to social research, are you talking about reproducing the methods with different populations? That, I think, is what I meant with the first question about conceptual replication. If you're reproducing the methods with a different population, that is arguably more of a conceptual reproduction of the work, whereas a very strict, exactly-the-same reproduction would be the same methods on the same populations. That's not always possible. We get into how social science research differs from the hard sciences, and how reproducibility differs, a little later on. But yeah, reproducing the methods with different populations is still very important, and getting the same basic answers using the same methods on similar, if not the same, populations is also important. They come out with different things: one is more like validating the concepts and one is validating the methods. Do you see what I mean? Those are different things, and they're both important for reproducibility. And I'll just give this one more minute to see if anyone else has a question.
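While we wait, let me make that "recipe in a repository" idea a bit more concrete. Here's a minimal sketch of the kind of script I have in mind, going from raw data to a finished figure. The file names, column names, and cleaning decisions are all made-up placeholders, not from any real project.

```python
# reproduce.py: a minimal sketch of a shareable "recipe" script (all names are placeholders)
import pandas as pd
import matplotlib.pyplot as plt

RAW = "data/raw_survey.csv"      # raw data, shared where disclosure rules allow
CLEAN = "data/clean_survey.csv"  # the processed version the paper actually uses

def clean(raw_path: str, clean_path: str) -> pd.DataFrame:
    """Every cleaning decision lives here, in order, where a reader can check it."""
    df = pd.read_csv(raw_path)
    df = df.dropna(subset=["age", "score"])              # decision: drop incomplete responses
    df = df[df["age"].between(18, 99)]                   # decision: keep plausible adult ages only
    df["age_band"] = pd.cut(df["age"], bins=[17, 30, 50, 99],
                            labels=["18-30", "31-50", "51+"])  # decision: recode age into bands
    df.to_csv(clean_path, index=False)
    return df

def figure_1(df: pd.DataFrame) -> None:
    """Rebuild the exact figure that appears in the write-up."""
    means = df.groupby("age_band", observed=True)["score"].mean()
    fig, ax = plt.subplots()
    means.plot.bar(ax=ax)
    ax.set_ylabel("Mean score")
    fig.savefig("outputs/figure_1.png", dpi=300)  # assumes an outputs/ folder exists

if __name__ == "__main__":
    figure_1(clean(RAW, CLEAN))
```

The point is simply that every decision a reader would need to know about is written down somewhere runnable, rather than buried in a methods paragraph.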
If not, I will move on. Yeah, I think you're not wrong to point out the difference between social science research and some of the more classic hard sciences research, and how reproducibility is a bit of a contentious issue there.

So I'll go ahead and dive into social psychology specifically. There was a journal that set out to reproduce a bunch of classic published articles: all of these psychology and sociology and behavioural studies that had been published in the 50s, 60s, 70s, 80s. In the early 2000s, this journal thought it would be really great if they deliberately set out to reproduce all this stuff. Unfortunately, they couldn't reproduce most of it. It was really embarrassing, and it became part of what we now call the replication crisis. More recently than that, people noticed this replication problem, and they reached out and surveyed a bunch of researchers to really try to find out what's going on, and they found that 70% of researchers had tried and failed to reproduce another scientist's experiments. And that's "reproduce" broadly: maybe the method, maybe the population, maybe the findings; in some way they tried to reproduce. But 50% had also tried and failed to reproduce their own experiments. So these are all failures. It's not just that they tried to reproduce something and it worked great. 70% had tried and failed to do someone else's work again; 50% had tried and failed to do their own work again. This is why that poll I ran early on becomes important.

Furthermore, of the people who did try to reproduce someone else's research, or their own, whether they failed or succeeded, very few published their efforts. Almost no one published failed efforts; a few published successful reproductions. It's very odd. And moreover, very few contacted the original researchers with questions about why their reproducibility efforts had failed. So this all adds to the crisis: something's going on here, even with ourselves, where we know what we were doing, hopefully. We're not really telling anyone about our efforts to reproduce, and we're certainly not asking the people who should know about our efforts to reproduce.

In response, lots of other disciplines, not just social psychology, other journals, other fields, sought to find out how reproducible they are as a discipline. A lot of them found it's not great. And so there's a lot of more recent activism, or movements, around open science. There's a discipline called metascience, which is basically the science of how we do science, including reproducing it. It is quite meta. And webinars like the one you're attending right now.

Now, this crisis of reproducibility has consequences. A big one that I think you might feel strongly about is when research that cannot be reproduced is taken up and accepted as true. For example, Alesina's 2010 paper was published as a sort of proof that austerity would help struggling economies grow by lowering state expenses. And it was accepted by governments that thought, oh yeah, that's what I believe, so I won't try to reproduce it, I'll just accept it, because published scientific papers are pretty much true, right? They accepted it and applied it. And since then, people have not been able to reproduce it, and real-world outcomes have not agreed with what the paper suggested should happen.
Beyond that, of course, there's wasted time, money, and research effort: lots of cases where someone says, oh, I've got this brilliant new idea, and someone else says, that does sound brilliant, let me just find out about that, and it turns out not to be a very brilliant idea after all. And there's a loss of trust. I won't be the only person who's noticed that there's a sort of anti-intellectual or anti-academic sentiment about, like, oh, you can prove anything with facts, you know? There's just some mistrust about what published work means, how much it can be trusted, what you can really believe or take away from it. Some real consequences for society.

There are also specific consequences for science and scientists, including, of course, the wasted time, money, and research effort. This all really matters when we're trying to cling on by the skin of our teeth to research budgets and get things done and get things published. There are consequences for a research culture that can become distorted by pressures to publish shiny new things, even if they can't be reproduced, and there's loss of trust in specific scientists, which also comes out as reputational damage. So if researchers publish something flashy that can't be reproduced, people either malign the original researchers or they malign the people who tried to do the reproducibility efforts, and there becomes fighting and tension over who you agree with and whose camp you're in. It's not very helpful for collaborative efforts to achieve things in society. That's my opinion and I'm sticking with it.

So let's talk a little bit about what's driving this. Blatant self-citation here: I wrote an article about alchemy. And a big part of alchemy was that you would write down interesting recipes for how you achieved gold from lead, or a new kind of blue dye, or whatever your alchemical goal was, but you tried to do it in code. So you used a lot of metaphor and symbolism, like an internal aide-memoire, rather than really clear instructions for someone else to follow. And that still holds true, I think, for a lot of researchers, especially in the sciences, maybe the social sciences, I don't know, maybe I can't back that up. It's still true that a lot of researchers kind of want to be the only one with the magical knowledge: we have the solution and you'll just have to believe us because we are who we are. And that's not sharing the knowledge, it's protecting knowledge, protecting your position as the person with the knowledge.

And here's Archimedes leaping out of a bath. Who gets the credit? A lot of science has a problem with wanting to give credit to the first mover and not to anyone else, even if subsequent research validates it or extends it or applies it in an important new way. So the issue of credit becomes really important, especially with funding and grants and position and status, and these are longstanding problems of research culture.

Now, there are of course modern problems within research culture. So ignoring alchemy and Archimedes and all of that, there are problems with how we educate people: how much emphasis do we put on the value of reproduction within the scientific method? There are issues with hiring and promoting: do we hire people with the flashiest publications, or do we hire people with the most reproducible publications?
Do we promote people who train their PhD students to produce reproducible work, or to reproduce the work of others? Or do we promote people who are out there grabbing the most grant money? It can be a distortion in modern research culture. Then publishing, obviously; someone earlier in the Q&A brought up where we're going to publish this stuff. A lot of journals, you'll find, will not be very interested in your paper if you are successfully reproducing something that is maybe a little bit old, even if people have accepted it as true and you're reproducing it and saying, yes, it is true. They're like, ah, we already know; they don't want to know. It's a problem. Similarly, funding and post-publication engagement: you know, the question about how many people have reached out to a researcher when they were unable to reproduce their results. Some researchers are really pleased with that. They're like, oh, you thought my idea was interesting and you're writing to me and I feel really noticed, this is great. Other people are like, why are you bothering me? My time is so expensive, and I have better things to do, and I've got three grant applications to write before lunch. Get out of here.

It's an issue, but a big issue is questionable research practices. Now, some people consider questionable research practices to be just how it works if you look under the hood. You know, you don't want to see how a sausage is made; you don't want to see how a research publication is made. Other people say this is soft fraud and it needs to be rightfully exposed as problematic. Questionable research practices could include anything from not sharing your data properly, not making your raw data available when it is safe to do so, right through to cherry-picking results and massaging statistics and really problematic stuff.

So here, please tell me if you are aware of, have felt pressured into, or have heard of anyone else doing questionable research practices or soft fraud. Let's just get some examples. You don't have to admit to having done it yourself; this is all anonymous. I mean, right up to falsifying data, although that's pretty hard fraud. Let's see. Hmm, I thought this was supposed to be a Q&A panel, but it seems that something has happened with my Mentimeter and it is not allowing me to enter them. So you may want to enter them into the Zoom chat. We can still enter them; it's a short question, so it'll still pop up for you. Okay.

Oh, I do have a question here that Joe's shared. Let me just check and see if I haven't missed any more. No, just the one question: were those non-reproducible studies ones where an experiment cannot be repeated, or where the data cannot be used to repeat the analytical results? All of this, basically. Of the non-reproducible studies from that social psychology effort, some were fundamentally flawed, as in the experimental methods were highly unethical and we just are not allowed to do that to children. We're not allowed to put them in a room and shout at them until they cry and see how long it takes. That kind of thing, not allowed. Those are unreproducible for a clear reason. Others were unreproducible because, yeah, the data was just inaccessible.
Some couldn't be repeated because the categories and subjects are no longer accessible: something that might have been representative of the majority of people living in a place at a time, but populations have changed and those people are no longer representative. There are all kinds of reasons why they may or may not be reproducible. But yeah, some of the reasons you're suggesting here could also be reasons those studies were not reproducible: they were not very honest about the data sources they used, or they were not reporting all of their findings. And this is one of the distorted research culture issues. There's a real problem that if you do research and you get a non-result, you find there are no interesting interactions between this condition and this outcome, nobody wants to publish "nothing to see here". That's not a very flashy journal article. But by not publishing that, someone else might come across the same question and reproduce your efforts, also finding a non-result, and you'll have no idea that they concur. Both of you will have wasted time. It's not helpful.

Hiding data processing steps which led to a result, this is a very good one. Do you collapse two categories together? Do you throw out outliers? Things like that. Retrospective control group, that is a very good one. I like that, because yeah, we don't always have control groups. Certainly we go exploring data sometimes without a good sense of what the control group would be, and if we define it later in the data, by finding who meets this criterion and counting everybody else as the control, that can be very questionable. It might be the only possible way to do things, but you do need to be clear if that's the only possible way. Not including caveats, yeah. Selection bias, selection bias is a real problem. I know everyone knows it's a problem, but if people aren't saying it's a problem, it can't be counteracted in the future. P-values, tweaked p-values. That was a huge part of the replication crisis: people were throwing out individual data points and so on so that their p-values came out significant, and that's not allowed. You're not allowed to fudge the input to get the right output. Inappropriate stats tests, yeah. You should in theory justify all the choices and decisions you make: why this test and not that one, why these factors and not those. Not including copies of data collection tools, e.g. surveys. Yes, that's very clear. If you have no idea what they actually asked on a survey, it's very hard to feel confident that their conclusions match your intuitions. That's a tricky one. Not fully informing participants, that's a good one. Yeah, informed consent is definitely something that is much more recent, and therefore we may not be able to reproduce old research, because under modern ethics standards of informed consent people would not be as honest, or they just wouldn't answer in the same way. Again, implied consent, that's good. Shared method variance, that's absolutely true. Even within really hard sciences like chemistry, not every bottle of reagent is exactly the same as every other bottle; there is variance in there. Sometimes you get a bad one, it happens. Okay, super. Well, it doesn't look like any more are coming in. These are all really good examples, and they show that you are aware that questionable research practices are out there.
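Since tweaked p-values came up, here's a tiny simulation sketch of why that particular practice is so corrosive. It isn't from any study mentioned today, just an illustration I've made up: two groups are drawn from the same distribution, so there is genuinely nothing to find, and then we "help" the p-value along by repeatedly discarding each group's most extreme point and re-testing.

```python
# Toy simulation: "throwing out data points until p < 0.05" (made-up data, illustration only)
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def fudged_significance(n=30, max_drops=5, alpha=0.05) -> bool:
    """Can we reach 'significance' between two identical groups by dropping outliers and re-testing?"""
    a, b = rng.normal(size=n), rng.normal(size=n)  # same distribution, so any "effect" is pure noise
    for _ in range(max_drops + 1):
        if stats.ttest_ind(a, b).pvalue < alpha:
            return True
        # the questionable step: remove each group's most extreme point and try again
        a = np.delete(a, np.argmax(np.abs(a - a.mean())))
        b = np.delete(b, np.argmax(np.abs(b - b.mean())))
    return False

trials = 1000
honest = sum(stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue < 0.05
             for _ in range(trials))
fudged = sum(fudged_significance() for _ in range(trials))
print(f"false positive rate, honest single test:    {honest / trials:.1%}")  # about 5%, by design
print(f"false positive rate, with outlier-dropping: {fudged / trials:.1%}")  # noticeably inflated
```

The data are pure noise in both cases; the only difference is how hard we let ourselves poke at them. Anyway, back to the examples you raised.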
Certainly some of these are much more fraud-y than others, in the sense that they're easier to see as fraud. But even with the ones that are well-intentioned, you would do best to be clear about how you got things, what decisions you made, and why you made those decisions, and people are not always clear. Okay, so let's move on.

So, to round up a few reactions to the replication crisis and some of the things that followed in other fields, not just social psychology: some people responded by saying, really, it's just a misunderstanding, this whole crisis is overblown, and the changes you want to make are too broad-brush, they don't account for the specific things that matter to my field or my discipline. Some people have even suggested that if someone is going to try to reproduce published research, they should not be allowed to do so without the permission or collaboration of the original author. I shudder to think that someone thought that was a good idea, but their theory was that it's so hard to be specific in the methods sections of journals that there is no reasonable way you can really describe your method properly, and people who want to reproduce your method should come to you and really hash it out in detail. Bonkers to me, but somebody said it. On the other hand, a lot of people have said, let's call for a revolution, let's make soft sciences more like hard sciences. And some people, of course, have suggested market-based solutions like incentives, or apps, or journals that only publish reproduction research, things like that.

So let me just take a quick reaction from you all. How big a deal do you think this crisis really is? And on the other axis, who is responsible for sorting it out? So at the top you might say it's a minor deal, but the original researcher should make the changes to resolve this minor issue. Or you might think it's a major deal and really the reproducing researcher has the bulk of the work to do here. Or you might say it depends on the field in question and how that plays in.

Yeah, yeah, this does not surprise me. Oh, very few people believe it's a minor deal. That is reassuring. I'm glad to hear, especially since you are attending this session, that you do not think it's trivial nonsense. That is encouraging. Okay, so I've got some good answers. There are a couple of clusters here. I can see that, even among those who think it's a major deal, some people think the responsibility for making improvements lies with the original researcher and others think it lies with the reproducing researcher. The same kind of mixed response applies to "it depends on the field in question", presumably because in different fields, sometimes the original researcher has to do more of the work and sometimes the reproducing researcher does. Good, glad to see that we agree that it matters.

Okay, so here's time for some more discussion. And I want you to ask questions, maybe about that "it depends", you know? Do you work in a field where someone has told you, you have to publish every single piece of raw data you get, and you're like, that's not appropriate, I deal with personal data and vulnerable adults, it is not appropriate to publish all of that? Maybe you want to tell us about that; that's an example of why it really depends on the field. Alternatively, maybe you agree that the social sciences are just too soft, too hand-wavy.
We need to make them more like hard sciences. Maybe you think the responsibility for making changes lies not with the original researcher or the reproducing researcher but with the journals: they should do a better job of enforcing standards for how methods sections go. Go ahead and ask any questions here. In fact, you can ask questions about anything in the workshop so far, not just about the last slide. I do realize I'm speaking quite quickly, and maybe everyone's a little bit overwhelmed and therefore has no questions. That's all right. I'll leave this open for just another minute. And we've got one.

As a journal peer reviewer, I've never been asked if I think a study is reproducible, but I have been asked if research is novel. Yeah, this is a really good point. This exactly points to the possibly distorted culture. Novelty is important, but so is reproducibility, and the fact that people are specifically asked about novelty but not reproducibility does suggest a bit of a distortion in how journals approach new research to publish. That's a great point, thanks.

I will be working with interviews, so I would expect a reproduced piece of research to draw a lot of the same conclusions, but different experiences will come up. Is that worth noting in the methods? That is worth noting, absolutely. I mean, we all kind of know that if we know anything about how interviews work, but in the discussion and conclusions you can absolutely say: it's important to note that these are the lived experiences of these specific people, and hopefully they're representative of wider groups within the culture, and their conclusions, sentiments, and experiences are generalizable. It's worth noting that other researchers may not find this, that maybe they're quite unique people. You can present that as both a pro and a con: other people won't be able to reproduce the exact interviews, but on the other hand, you've captured something very important and novel, and we can all look at it critically and discuss what it means and how we go about using this information to better society or to learn more about society. Thanks for that.

Do you think journals with small word limits need to make authors publish additional notes on methods and results somewhere else? The details are the first to go with small word limits. I absolutely agree. I think journals should really require supplementary materials in the way of data and code, the Stata do-file or the scripts or whatever people used to produce their graphics. They should be asked to include that, and if they do not include it, they should be asked to make a short statement about why not. It's entirely appropriate that someone would not be able to include original data, but they could say: I've included a short synthetic data set so that you can run my R scripts and get similar graphics. I would like to see more of that. And I do feel that with small word limits, the nitty-gritty, really detailed stuff in method sections is not what most of us want to spend our time reading, so I can see why it gets the boot first.

Some data is not very accessible even with reproducible methods. How could we solve this? Well, I guess it depends on the data. But yes, it's absolutely the kind of thing where you could publish good supplementary resources and say, here's a data set, a cleaned or safe-to-publish data set.
And here are some of my R code files, and here's how you can reference the data and play around with it, and see how I got these results, how different changes give you similar results, and why I chose the approach I did instead of those. It does involve more work, and it does involve more interest on the part of the other person, but it's going to resolve some of those issues about trust and reproducibility and generalizability. It is certainly one solution. Absolutely talk to me in the Q&A after if you want to talk specifically about what kind of data you have and whether you're concerned about how to make it more reproducible; happy to talk.

Do you think... oh, right, this one feels like a reproduction of the question we had before. Yes, with small word limits, the methods being the really dry portion are going to get the boot. I believe journals should strongly encourage, if not require, people to publish additional resources.

When academic work does not get published, one of the factors involved is the perceived robustness of the data. This is a huge problem in publishing, and I think it's one of the reasons there's a distorted research culture: it's very hard to know the real reasons why academic work does not get published. I've had loads of things returned to me saying, this falls outside the scope of the journal, and I find that confusing, because how are people ever going to do novel research? I consider my own research to be novel, but also reproducible, because I make the files accessible. How are people ever going to get novel research published until something is well enough established that there's a specific journal for that field? It's a bit backwards. But I don't know; cynically, it could be that they didn't want to publish my research because one of the reviewers had a very different opinion to me and just ideologically didn't like something I said in my paper, or, cynically, you could think: not established enough, or not male enough, or not wealthy enough, or whatever factors you think give someone prestige and status and therefore make them more likely to be published in a given research culture. It's very hard to know, and transparency about who gets published would be great. I like it, I think it would be good. I don't think it'll happen, though; turkeys aren't going to vote for Christmas.

Is it required to link to full transcripts of focus groups? I work in a sensitive area, and removing words and phrases to enable this would be a huge task. Do you think linking to codings, et cetera, is enough? Generally, probably linking to codings would be enough. That said, there are computational methods to make full transcripts safer. If you're going by hand and changing things, removing words and phrases, that is absolutely a huge task and I do not recommend you do that, but you might look at some computational methods to make it much easier. And if you think that would really make people trust your research, or engage with your research more, or make it more applicable, or if you think people might find new and interesting things in your data, then maybe that's worth doing. But in the short term, linking to codings is probably enough; it's certainly better than linking to nothing. Thank you very much for that, that's a good one. That's a really interesting approach to a unique kind of data that I think most of us would just assume can't be reproduced, nothing to be done, but you found two ways to do it.
One of which seems very manual and heavy, and not recommended to do by hand. And yeah, the next question is: if sensitive data is used for a study, could a synthetic version of that data be used to reproduce the study, or vice versa? Or does that entirely depend on the fidelity of the synthetic data? I would say, if you're trying to exactly reproduce the specific graphs that you produced from your data, that does come down to the fidelity of the synthetic data. Not a bad thing. A one-for-one, like-for-like but completely synthetic data set could be used to reproduce your entire piece of research, and that might be useful if you want to demonstrate not only what the raw data looked like, but all the different steps you took to recode that data or combine subcategories and things like that. That might be quite useful just as a clear demonstration of your work process. But even a simple example of a similar-ish synthetic data set, not one-for-one, not a pure replica, would still be useful to demonstrate your analysis process. So I think if you're using very sensitive data, a synthetic version would still be useful just as an example case, so that people can really see: here's what she got as an output from the data collection, here's what she did next, here's what happened after that, here's what happened after that, here's the final result. That's still very useful.

Joe has shared: I can't stop thinking about that example of transcripts from interviews. How is public social media data different? What would it take to make that data open? Yeah, Joe is on quite a kick recently about social media data: what are the ethics of using social media data, is it personal, is it public, how useful is it? He says to ignore his spiral, but no, I won't, I will join your spiral. And it's tricky, because you could argue that what people say after they've signed an informed consent document, they ought to know is being used, much like people know that what they say on Twitter could be used, because they had to sign something saying, I'm over 18 and I agree to these terms and conditions. It's kind of an informed consent. It's not great. It's also very easy to forget that informed consent when you're furious at somebody on Twitter. But certainly, if you were looking at lots of individual tweets and you didn't want to share any real tweets, because you were concerned about ruining the lives of the people who would be identifiable from those tweets, you could create a synthetic version and use that in all of your examples. That way you can protect individuals while still being able to talk about the process and the kind of results you found.

Summer shares that she's seen some analyses of tweets. Yeah, we could look those up and see how they treated their tweets: did they hide the names and all identifiable information from the tweets, did they just talk about big descriptive qualities of a body of tweets, or did they talk about individual things that individual people said? It's quite complicated; there are a lot of ethical implications. That's one of the reasons that being reproducible is important, because if we think, well, they only got that result because they did something very unethical, we ought to know that moving forward. Oh, no, the cat is in the toilet.
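Just to make the redaction side of that concrete before we move on, here's a very rough sketch of what a computational first pass over something like tweets or transcript excerpts might look like. The patterns and names are examples I've invented, and for genuinely sensitive data you would still want a proper disclosure-control check on top of anything automatic.

```python
# Rough first-pass redaction for short texts (example patterns only, not a complete solution)
import re

REDACTIONS = [
    (re.compile(r"https?://\S+"), "URL"),                    # links are often identifying
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "EMAIL"),   # email addresses
    (re.compile(r"@\w+"), "@USER"),                          # social media handles
    (re.compile(r"\b\+?\d[\d \-]{8,}\d\b"), "PHONE"),        # crude phone-number pattern
]

def redact(text: str, names: list[str]) -> str:
    """Replace links, emails, handles, phone numbers and known participant names with placeholders."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    for i, name in enumerate(names, start=1):
        text = re.sub(rf"\b{re.escape(name)}\b", f"PERSON_{i}", text, flags=re.IGNORECASE)
    return text

example = "Thanks @jo_bloggs! Email me at jo.bloggs@example.org or ring 0117 946 0000 - Jo"
print(redact(example, names=["Jo Bloggs", "Jo"]))
# -> Thanks @USER! Email me at EMAIL or ring PHONE - PERSON_2
```

The same idea scales up to swapping in entirely synthetic stand-in texts, so you can walk readers through your coding and analysis without exposing anyone identifiable.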
But metaphorically speaking, the point here is that the hard sciences are not exempt from reproducibility problems. In fact, those results about the 70% of people who tried and failed to reproduce someone else's research and the 50% who tried and failed to reproduce their own research included physics, chemistry, biology, medicine, all kinds of things that we think ideally should be hard and reproducible and well documented. After all, biology labs have lab notebooks for a reason: to improve reproducibility. Well, it turns out they're not working, or at least not working as well as people assume they are. There was a study in which very high-impact journals, taking this crisis of reproducibility on board, made really clear efforts to ensure that work published in their journals was reproducible, and analysis of that showed there was some improvement, but the work was still not perfectly reproducible. So even people who have all the respect and all the power and, potentially, all the money (I don't really know how journals get their money), even when they were well motivated to make work reproducible, could not make it fully reproducible. And what's more, lots of people, in response to the crisis-of-reproducibility debates, are really resisting. They're saying, you don't know what you're talking about; maybe your solutions work in chemistry but they won't work in my field, or maybe your solutions will work for students but not for me, I have these kinds of demands on my time, whatever. Nobody likes being told that you're doing it wrong and you should do it this other way instead. Sometimes for very good reasons, because that other way is irrelevant to them, but sometimes just because people are defensive and grumpy. I'll admit to that myself.

So here are some solutions that dig a bit deeper. All of the citations that I got these solutions from are in the lower right quadrant of the slide, the one with Mac 2014 and all of these articles. A lot of these solutions were repeated across these articles; a lot of people have come up with the same solutions. A big one is better training. We can encourage students, rather than trying to do novel, interesting, brand-new research from the jump, to get started by reproducing research. That gives them practice with established methods, established research questions, established analysis tools, things like that. It also potentially produces something quite interesting: maybe this has never been replicated before, or maybe it has been replicated several times but yours failed, and it's interesting to look at why. Better mentorship: if principal investigators demonstrate good reproducible practices, if we encourage our students or early career researchers, if we give them experience doing these things, they're more likely to do those things. Bonkers concept, I know. There's the idea that people should budget, in their research time, a certain amount of time to replicate their own work, or that research teams ought to pair people up so that someone else within the team makes an effort to reproduce your stuff. Likewise, there are some journals out there, and more proposed, that publish nothing but replication studies; so basically making it mandatory, or at least highly encouraged and rewarded, to do the replication work. Yeah, someone's recommended one; they have a replication journal. Consolidate or standardize methods, materials, and protocols. Now, this one's tricky.
This is one that maybe works better for some fields than for others. We're not all going to agree on what the right software is, and moreover, software changes quite quickly, so something that used to work in this version doesn't work in that one, you have to do it slightly differently, and maybe that matters. It's not like reagents in chemistry. We can't demand that our interviewers be interchangeable; that's not how it works. But still, protocols: making sure that your interview questions are checked by several people, that's a good protocol. Making sure that your sample selection process has been reviewed by someone outside of your immediate research team, that's a protocol you could follow. Similarly, more time, reward, support, and incentives. If you're going to require that people replicate their own work or a buddy's work, someone has to pay them for their time; they're not going to replicate their buddy's work for free. Publishing practice, so again journals, but also changes to how methods sections are done, or requiring articles to include supplementary materials, things like that. Shared data, code, et cetera: even if the journal doesn't require it, you can make it accessible. You can put up a GitHub repository. You can create a DOI for your repository so that people have exactly the version of the data and code that you used. They can get it, it's reusable, it's out there. It's a marker, a flag saying: this is what I used, you can use it too. Attitude change. This one's a hard one. We've got to stop demanding to be Archimedes leaping out of the bath, the sole discoverer of something that probably everyone already knew. We've got to collaborate. We've got to value sharing information rather than keeping it to ourselves like it's precious. Field-generated or field-specific solutions: we have to accept that what might improve reproducibility in one field will not improve reproducibility in another field. There are concerns that genuinely make some research different from others.

But I'd like to hear your solutions. This is always interesting. Feel free to expand on some of the solutions that I've already suggested here, or you could come up with completely new solutions, or you could share solutions that you know don't work. Please mark those as known not to work; it's just like publishing negative results. If you have a known failed solution, you can share that with us. So yeah, tell me a bit about whether any of your research teams or your universities or your institutes have imposed anything. Have they demanded that you publish your results? Have they demanded that you follow a checklist when writing up the methods section? Have they given you time to replicate someone else's research within your group?

Update informed consent collected to enable access requests from certain journals and groups. That's a good one. Yeah, because absolutely, when you interview people you want to make sure that they have consented to be interviewed, but you ought to clarify with those people whether you're publishing their interviews in full, or in some kind of redacted, anonymized form, or in something else, like recoded numerical values and forms like that. It could absolutely be part of the original informed consent, or it could be a secondary stage of informed consent. It kind of depends on your field, but that's a really good one. I like that.
Do we have any other suggestions for how we might go about starting to fix the crisis of reproducibility? Because we're not going to fix it completely; if I knew how to do that, this would be a paid workshop. Could finders, I assume that means funders, play a role in imposing reproducibility, e.g. funding smaller reproduction studies after the original? Yes, they absolutely could. They could make it part of the funding that someone from outside the team has to be paid to reproduce the studies, or that a certain amount of the researchers' time has to go on reproducibility. They could also just fund reproduction studies. As far as I know, not many people are funding reproduction studies. There were a couple; there was one at the University of Toronto, something like that, which would give $5,000 to people who wanted to publish reproducibility studies. And that's not nothing. Maybe for early career researchers that's a very valuable fund to win, and they can put their effort there. It's great.

My college at the University of Exeter is going hard in regard to the reproducibility and replicability crisis. Pardon me, let me just take a sip. They have workshops to talk about it, and we're encouraged to put everything in OSF, that's the Open Science Framework, I think. My first study in my PhD project will be a replication of very recent work in my field. This is great. This hits a couple of the points: the workshops are about training and mentorship and setting an example, and allowing part of a PhD project to be a replication is sort of changing the culture of what it means to do novel research. Replicating things can be considered novel research. If you're replicating something absolutely one-for-one, it's hard to see the novelty there, but it accumulates confidence, and the more confidence we get, that's still valuable. And if you reproduce almost all of it but apply it to a new population, that's also novel. Very good, thanks for that.

Maybe we should teach more coding to ensure people are analyzing data in R, Python, et cetera. This means researchers would have data scripts which can be checked, rather than hoping people selected the right buttons in SPSS. I like this very much. I would give you a sticker if I could, because this is absolutely, I think, one of the best ways in the near term to make clear gains in reproducibility: share your R scripts or your Stata do-files, things like that, to show people the exact steps you followed, in order, so they can do it too. Also, it's much easier than trying to write that out in a methods section, so it's a win for everyone.

Encourage journals to formally withdraw articles found to have used fraudulent data. Until this happens consistently, there's no penalty or perceived disadvantage, so people just submit and publish with little consequence. Ooh, that is contentious, isn't it? How are they not required to withdraw articles? I mean, I've heard of articles being withdrawn after fraudulent data was discovered; I didn't know that was just a choice some journals could make. That is crushing. But again, yes, this points to the research culture and the publication culture. There's the real publish-or-perish pressure. If people think it's fraud or perish, probably a lot of people are going to choose fraud.
So we need to make that not an option, both by not tolerating the fraud, and by making sure that shiny, new, totally original, flashy research is not the only way to get published. So a very good point, thank you for this.

Yeah, Joe also agrees on the point about teaching coding as a solution. He specifically says that SPSS has a playback-style report generator, but a lot of people are still using Excel in academia, and in Excel it is extremely hard to document what you've done, what choices you've made. Have you sorted this field? Have you moved these columns around? Have you recoded the data? There's kind of no good way to capture that in Excel, and only not-very-good ways in SPSS. There are much better ways in R, Python, and things like that. And of course there are other tools; I just learned about Atlas earlier today, and I don't know if it's any good, but I assume it's probably better than Excel.

All right. Well, I'll give you another quick minute here. If there are no other suggestions, or failed suggestions, or further points you want to elaborate, I'll move on to my big-hitter suggestions, which are not unlike some of the ones you've put up here. My big perspective is that I want to redefine the problem and the goal. 100% reproducibility of absolutely every single thing is not the goal, because that will alienate a lot of social science researchers who think, well, I cannot get to 100%, therefore I'm going to completely sidestep the whole thing and ignore reproducibility as a goal at all. Instead, what we want to do is be as clear as possible about what we did and why, how our main findings and conclusions might be replicated or validated or supported or applied to new areas, and what it means if they are or are not found to apply elsewhere.

Go cautiously. This is another one where 100% reproducibility is not the goal. If someone finds an amazing outcome and publishes it, and someone else tries to get it and fails, it doesn't mean the first people were fraudulent or using questionable research practices. It might just be a fluke; sometimes that happens. And sometimes you misinterpret the data and it turns out to be a null result. It's still published, it still prompts someone else to do work, and the investigation continues, and maybe you end up with something very valuable at the end of it, even though it's not 100% reproducibility of the original study. Let's go cautiously, and be curious and critical about things.

Embrace complexity. This ties in with that: not everything will have a simple answer, and we're not going to get the same answer every single time. And also embrace collaboration. Don't try to be the first person who does the thing, because you're going to struggle. That's not how science works today, and arguably not how science ever worked; it's just a historical quirk that we think so many things were discovered by a single person.

Strive for open science. So strive to share knowledge instead of holding on to knowledge, for the betterment of society, arguably. That's a bit altruistic, and maybe we're not all equally altruistic, but if we make open science rewarding, then even non-altruistic people can get behind it.

And automate the boring stuff. This is about using R or Python or Stata do-files, things like that, to make it easy to do as much reproducibility as possible, to be as open as possible, to share as much as possible. It does not have to be a tedious extra chore.
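Here's a tiny sketch of what I mean by automating it, with invented file and column names: if the chart comes out of a function like this, switching to a colorblind-friendly palette is a one-line change and a re-run, not an archaeology project.

```python
# Sketch: keep figure-building in a re-runnable function (file and column names are invented)
import pandas as pd
import matplotlib.pyplot as plt

def make_figure(data_path: str, out_path: str, colors: list[str]) -> None:
    """Rebuild the results chart from the cleaned data with whatever palette is passed in."""
    df = pd.read_csv(data_path)
    counts = df["group"].value_counts()
    fig, ax = plt.subplots()
    ax.bar(counts.index, counts.values, color=colors[: len(counts)])
    ax.set_ylabel("Number of responses")
    fig.savefig(out_path, dpi=300)

# The version in the original submission:
make_figure("data/clean_survey.csv", "outputs/figure_2.png",
            colors=["tab:blue", "tab:orange", "tab:green"])
# Asked for a colorblind-friendly version? Re-run with a different palette:
make_figure("data/clean_survey.csv", "outputs/figure_2_cb.png",
            colors=["#0072B2", "#E69F00", "#009E73"])  # Okabe-Ito style hues
```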
It could be a bit of extra work up front, but it actually makes your process much easier and faster, because when you go to reproduce your own stuff, it will be so much easier. If you think, oh, I've produced this wonderful chart, but it's not in a colorblind-friendly color scheme, I'll do it again: well, if you don't have those scripts and files, you may not be able to reproduce your original chart, and that's going to be a bit embarrassing. So it's for your benefit, it's for everyone's benefit, and it looks really cool when you can say, oh yeah, I've published all my code in a repository with a DOI. I mean, I think that's cool, but then I'm that kind of person.

Here are my references. These slides will be available after the talk, because I doubt you're going to want to take a picture of the screen, that's weird, and you're not going to be able to read all of this anyway. Here are my contact details. You can talk to me afterwards.