So we'll just go ahead and get started in just a second. So on the line here, my name is David Mellor. I work at the Center for Open Science, on journal initiatives and on our competition to reward researchers for pre-registrations, which is obviously the content of this webinar today. Also on the line is Courtney Soderberg. Courtney is our statistical and methodological consultant. She's basically our stats wizard here and leads our training efforts at the Center for Open Science. She'll be available throughout the webinar to help answer questions, and also at the end as questions arise. Courtney, feel free to stay on the line or mute yourself or step away whenever you see fit. Sounds good. Also available here, if you want to peek in, is Alex DeHaven, who also works with the journal and funder initiatives here at the Center for Open Science. So here at COS, our mission is to increase the rigor and reproducibility of the published literature. And the main tactic we use to achieve that mission is to make science more open, more reproducible, to make every part of the research workflow transparently clear to the entire research community. So our main strategy for achieving that is to align ideal scientific practices, scientific values, with the actual day-to-day rewards and incentives that individual researchers face in the publication, grant, and promotion structures that we all face. So the ideal scientific practice is to openly share all the available evidence, whereas the day-to-day reality is that individual researchers are more rewarded for secrecy, for keeping something hidden until it's ready to be revealed. Likewise, the ideal scientific practice is to be motivated by knowledge and open discovery, as opposed to being motivated by self-interest and competition between individuals. 
An ideal scientific practice is to consider all new evidence, even when it contradicts something that we worked on before, as opposed to the more day-to-day rewards that we all face, having a vested interest in one's prior work or prior claims. In an ideal scientific world, you'd publish a relatively small number of large studies, and the evidence presented in those studies would be complicated. There'd be null findings and messy evidence, just like reality is messy. On the other hand, work is more publishable, and you get more publications, with many small studies with fairly surprising and clean results. And the take-home message for all this is that incentives for individual researchers are focused on getting something published, getting something out there. That's what we're rewarded for, not for getting it right. And the way to make something more publishable is to find those clean findings, those surprising, statistically significant, clean findings. And the best analogy I know of to explain how that's really feasible with any sort of data question for a study is Gelman and Loken's garden of forking paths. So what this represents is basically any research question, followed by its analysis, followed by its testing. So you start with a very straightforward question. Are these two variables related to each other? And that sounds like a very specific hypothesis. Does x affect y? Does, you know, the gendered name of a hurricane affect its perceived danger? Whatever your question is, it starts out fairly specific sounding. But in reality, as you start down the analysis path, each individual decision you make bifurcates the number of analyses you have. So, when you're constructing variables, do you use the mean, the median, or maybe some more complex index? Do you exclude outliers, and if so, what's your rule for excluding them? Do you control for time or money or height or socioeconomic status, et cetera, et cetera? 
You've got many, many different decisions to make in going from that fairly specific sounding hypothesis or prediction to the actual analysis you run. Each one of those decisions creates a new hypothesis, a new analysis plan. And after a fairly short number of those decisions, the number of possible analyses gets exponentially larger: dozens, hundreds, or maybe even thousands of different possible analyses. And each of those decisions, you know, in hindsight is perfectly justifiable. But in reality, many analyses were conducted, and by chance a couple of them are going to be statistically significant, clean sounding, and therefore publishable, even though the entire body of evidence might be messier than the one path that was chosen by all those decisions that are justifiable in hindsight. So what is a pre-registration, and how can pre-registration address these issues? A pre-registration is a time-stamped, read-only version of your research plan created before the study. And we talk about two different types of pre-registrations. If you just include the study plan, that will include the hypothesis, the data collection procedures, and the variables you collect. The one I'm going to be talking about for the remainder of this webinar is a pre-registration plus an analysis plan. That's when you include the specific statistical model and the inference criteria that you're going to use to make a decision about your hypotheses and predictions. The pre-registration and the pre-registration with an analysis plan solve different problems that I'll get to in a few minutes. But again, everything from here on out, just assume I'm talking about this one here on the right. So what problems does a pre-registration address? One is the file drawer issue: only a small percentage of the work that is ever conducted ever sees the light of day in the published literature. The second is the garden of forking paths issue: there's a lot of flexibility in data analysis. 
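That forking-paths arithmetic is easy to sketch in a few lines. The analysis decisions below are hypothetical examples, not from any real study; the point is just that a handful of defensible choices already multiplies into dozens of distinct analyses.

```python
# A hypothetical sketch of the garden of forking paths: four small,
# individually justifiable analysis decisions multiply into many
# distinct analyses. The choice lists are made up for illustration.
from itertools import product

choices = {
    "variable_construction": ["mean", "median", "composite index"],
    "outlier_rule": ["keep all", "drop beyond 2 SD", "drop beyond 3 SD"],
    "covariates": ["none", "time", "money", "time + money"],
    "dv_transform": ["raw", "log"],
}

# Every combination of decisions is one path through the garden.
paths = list(product(*choices.values()))
print(len(paths))  # 3 * 3 * 4 * 2 = 72 possible analyses
```

With even a few more decisions, the count climbs into the hundreds or thousands, which is exactly the exponential growth described above.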
Most of that flexibility goes unreported, even though a lot of analyses were conducted. That flexibility, p-hacking, can be addressed with a fully specified analysis plan in your pre-registration. The last major problem is HARKing: hypothesizing after the results are known. If you create a hypothesis after you see the data, what this results in is basically circular reasoning. You've got a small, limited data set. You start looking through it. You create a hypothesis as you're looking through it. And then you test that same hypothesis in that same data set, and you've created a violation of the assumptions of the test. So this bit of circular reasoning is bad. Ultimately, what a pre-registration does is make much more clear the difference between exploratory analyses and confirmatory analyses. Let me describe what those are. When you're confirming your results, this is what we normally think of as traditional hypothesis testing. The results of this should be held to the highest standards of rigor. The goal when you're conducting confirmatory hypothesis testing is to minimize false positives, those Type I errors. And when you are conducting confirmatory analyses, the p-values are interpretable. You create your hypothesis before seeing the data, and a p-value of .05 or less makes sense in that context. Unfortunately, a lot of the work that's done is actually in the context of discovery, or data exploration. And this is totally reasonable, but presenting it as confirmatory violates the assumptions of statistical inference. The purpose of discovery research is to push knowledge into new areas, to find something that's unexpected, unexpected differences or relationships between variables. And the goal here is really to minimize false negatives. You don't want to miss some unexpected discovery. You don't want to miss the next penicillin. 
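Why exploratory p-values can't be read at face value is easy to demonstrate by simulation. This sketch, not from the webinar and using only made-up null data, tests twenty unrelated outcome variables on the same data set: even though every effect is truly zero, at least one comes up "significant" far more often than five percent of the time.

```python
# Simulate a researcher digging through one data set with 20 null
# outcome variables. All numbers here are invented for illustration.
import math
import random

random.seed(1)

def two_group_p(n=50):
    """Two-sample z-test p-value when both groups come from N(0, 1),
    i.e. the null hypothesis is true by construction."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(a) / n - sum(b) / n) / math.sqrt(2 / n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

hits = 0
for _ in range(2000):                        # 2000 simulated studies
    ps = [two_group_p() for _ in range(20)]  # 20 outcomes per study
    if min(ps) < 0.05:                       # any "discovery" at all?
        hits += 1

print(round(hits / 2000, 2))  # ≈ 0.64, far above the nominal 0.05
```

That roughly matches the theoretical rate of 1 - 0.95^20 ≈ 0.64, which is why a p-value found by exploration needs a fresh, pre-registered confirmation.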
But this is a different context, the context of discovery, and when you get a p-value using this type of data exploration, digging through a data set, any p-value that comes from that is pretty much meaningless. Unfortunately, presenting those exploratory results as confirmatory results increases the publishability of the work, but it comes at the expense of the credibility of the results. And normally individuals are, as I said, rewarded for presenting exploratory research as more confirmatory work, and a pre-registration keeps the line between the two very distinct. Anything that was in your pre-registered analysis plan is your confirmatory analysis. Anything else you do, any other decision you make after starting to go through the data set after a pre-registration, is exploratory analysis and should be treated as such. So here's a typical workflow that we recommend when you're going between these confirmation and exploration stages. You collect a little bit of new data, you use that data in the discovery phase, you dig through the data set, you look for unexpected relationships. And once you have something that you think is solid, you find a relationship, you find an analysis that comes with that magical p-value, 0.05 or less. You then pre-register that analysis you just found, that hypothesis you just created in that data exploration step. Then you collect some new data, and you use that pre-registered plan for your confirmatory hypothesis testing. And then, you know, the cycle can continue. You can create a new pre-registration, collect some more data. And if you don't have a strong hypothesis going into it, you can be explicit about that in the pre-registration you just created. Whatever was specified ahead of time can then be used for confirmation; everything else is exploration. A slight variation on this is what we call splitting a data set, or taking a holdout sample. 
So this starts again with a pre-registration of whatever strong ideas you have going into it. You collect the data set, and then you randomly split the data set into two parts. And half the data you keep in an Al Gore lockbox. You keep it secret, put it on a different hard drive, put it in different folders, don't allow anybody to look at it, because that's what you're going to use later on. But use the half of the data reserved for exploration to start digging through, looking for unexpected trends, figuring out the best ways to construct your variables, figuring out which IVs or DVs look most interesting. Do whatever you want during that stage with the data set you're playing with. And then once you have something you think is worth sharing, that's exciting, that's unexpected, that's significant, create a pre-registration at that point. That's the real key pre-registration, right here on the right. Once you create that pre-registration, you are free to uncover the half of the data that you had covered earlier and use those analyses that you created in the discovery phase in the next round, the confirmation phase. Maybe the results will replicate as you found in the discovery phase. Maybe they won't. Then go ahead and start another round of discovery, looking for things that were unexpectedly significant, to be used for the next round of data collection. So why pre-register? Well, the main point, the biggest reason to pre-register, is that you add rigor and credibility to any claims that you make. That's the number one reason to conduct these pre-registrations. If you want to add certainty, if you want to add rigor to the hypotheses that you are showing are significant, a pre-registration is the way to do it. There are also other reasons: to establish a time-stamped record of your ideas, so you can establish priority, so you have laid claim to this idea early on. 
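The split-sample workflow described above can be sketched in a few lines. The data and field names here are hypothetical; the only real requirement is that the split is random and the holdout half stays untouched until the pre-registration is filed.

```python
# A minimal sketch of the lockbox workflow: randomly split the data,
# explore freely on one half, and reserve the other half for the
# pre-registered confirmatory test. Records are made up for illustration.
import random

random.seed(42)

records = [{"id": i, "score": random.gauss(100, 15)} for i in range(200)]
random.shuffle(records)

half = len(records) // 2
explore = records[:half]   # dig through this half however you like
holdout = records[half:]   # don't touch until the prereg is registered

print(len(explore), len(holdout))  # 100 100
```

Any hypothesis generated on `explore` then gets exactly one pre-registered test against `holdout`, which keeps the discovery and confirmation p-values cleanly separated.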
It helps you remember exactly what your a priori hypotheses were. That sounds a little bit silly, but, you know, honestly, as you're going through a data set, as you are starting to analyze it, you start out with a pretty good idea of what your hypotheses are, but really, your hypothesis is that very specific analysis plan that you created. And that's the exact hypothesis that you should consider your a priori hypothesis. Anything else is easy to shift slightly, and to fool yourself into thinking it was an a priori hypothesis when it wasn't established beforehand specifically. And it helps you create your next experimental design. So as you're going through a data set, you conduct the registered analysis. Anything else is exploratory, and you find that, oh, well, the effect is significant if I control for X or Y, or remove outliers based on this rule. I wonder if that's really important. At that point, you've got a very strong case for what the next round of data collection should be for, and what the next round of your experimental manipulation could be. And this one's a little bit tentative, but it could save time. That's a testable hypothesis; hopefully we'll get some more empirical data on it as time goes on. But you're more likely to spot errors when you do some of that legwork upfront. You need to write down exactly how you're going to analyze the data, and that can catch parts of the experimental design that are easy to overlook if you leave those questions until the end. Every question that goes into a pre-registration, every decision that goes into making a pre-registration, is a decision that has to be made sometime. So by putting it ahead of data collection, you're not adding any time. You're just rearranging the workflow a little bit. And ultimately it might save time if you catch errors by creating this pre-registration. So, pre-registration is a great thing. We think so, and we want other people to give it a try. 
And so we have this competition, funded by the Laura and John Arnold Foundation, to give $1,000 prizes to 1,000 researchers for publishing the results of their pre-registered research. And the point of this is really an education campaign to encourage people to try it out, to see if they like pre-registration, to see the benefits of the process for themselves, and to use that pre-registration all throughout the research workflow to see its benefits. So those $1,000 rewards are carrots to just try it out and see if it works for you. We also know that most researchers have never created a pre-registration before. And so we have a workflow on the Open Science Framework to create a pre-registration, to walk you through what needs to be included to create a pre-registration for your study and its complete analysis plan. So if you go to cos.io/prereg, you'll get to this page: information about what pre-registration is, information about the contest, and a link to begin your pre-registration. That will bring you into the Open Science Framework, the OSF. And this is a screenshot of my account. It gives me several options. I have draft pre-registrations, so it's giving me the option to continue working on one of my drafts. I have several projects on the OSF, so I can create a pre-registration for something else I'm working on in the OSF. But if you're a first-time user on the OSF and this doesn't make any sense to you, you'll only be shown the option to start a new pre-registration, and it'll jump you right into the workflow to create a prereg. We have this notice as you go through to describe some of the rules about the contest. One thing that we do is check each pre-registration before it's registered, before it becomes a permanent copy. And we just do a superficial check, just to make sure that it is a complete analysis plan, that it's a fully specified research plan. And we say on here that you'll hear back from us in 10 days. 
We're about to change that to one or two days. We almost always get back to somebody in the same business day. So we're going to update this language, because our checks are getting very efficient and we want to make sure that this isn't a barrier to anybody creating a pre-registration. So you should know that we get back to you almost always within the same day, maybe two business days, and that's going to be updated soon. If you have any other questions about the rules, I'd be happy to answer them. But mostly they were created to make sure that it's a fair and legal way to run a competition for this education campaign. And then it dumps you right into the form. So the pre-registration form walks you through exactly what's required: title, research questions, hypotheses. It lets you describe how you're going to collect your data. One of the big questions for pre-registrations is, can you use existing data? And the answer is a resounding perhaps. Pre-registration really was designed for before data collection starts. But as long as you, or the researchers who are analyzing the data, haven't seen the data, it's perfectly acceptable to create a pre-registration any time before you start analyzing the data. And the purpose of this is to give transparency, and to allow researchers to conduct work that would be impossible to conduct if they had to collect new data. You know, with data sets collected by the government, et cetera, et cetera, you don't have control over when the data are collected, but you could still benefit from pre-registration. An example of this is the American National Election Studies, which collects survey data before and after national elections in the United States. And actually there's a similar one going on in Italy also, and there are competitions for both of those. 
The election research pre-acceptance competition has a very similar form and a very similar framework for the data coming from that survey collection effort. As you're describing your data collection, we do ask that you specify what your sample size is expected to be, how you came to that number, and what's going to make you stop data collection. What we're looking for here, you know, doesn't require that you use a power analysis to justify your sample size, but it does require you to be transparent about how you are collecting your data. Perhaps you only have money or time for a certain sample size, so just be transparent about how that came to be. And if you're collecting data on an ongoing basis, be transparent about when you're going to stop data collection. It could be after you get to 100 participants. It could be after you run out of money. It could be after you run out of time, or some combination thereof, but simply be transparent about that. What we don't want to see here is basically that you collect a little bit of data, see if it's significant, collect a little bit more, see if it's significant. As you start doing that, that's really data exploration that, again, breaks the rules of statistical inference. Courtney later can give some more detailed answers about how you can do that validly if you have specified your criteria ahead of time, but I won't go into that right now. Next, it gives you the option to explain exactly what variables you're going to be collecting in your study: if it's an experimental design, what conditions you are manipulating and what variables you are measuring. And those measured variables could be independent or dependent variables. And if you're constructing any variables or making an index, so perhaps your measure of happiness is an average of 10 different responses to 10 different survey questions. 
Or if you're an ecologist, you might be using a specific biodiversity index, Shannon's diversity index or something; just specify how those complex variables are going to be created. The analysis plan is the most detailed part of the registration. And this is the part that typically is not done until after data collection, but it's the part where we really want researchers to spend the most time, specifying in advance how the research questions are going to be analyzed. You do have the option to upload an analysis script, and that answers basically all the questions in the registration. If you've got that analysis script, you've got all of the decisions about how the data are going to be analyzed ahead of time, and any changes from what was registered to what happens later are very transparent, in the sense that if you make a change to your script, it's a pretty obvious change. So it's very distinct when you go from the confirmatory stage to the exploratory stage. At the end of your registration form, you've got the ability to preview what you've got listed, to go back and edit it, or to submit it and have it registered. You've got the option at that point to create a registration without entering the competition, or to submit for review. And again, that takes about a day. What we do is just check quickly to make sure it's a fully specified analysis plan; we'll look at your stopping rule and your inference criteria, just to make sure you filled out what was required for eligibility in this competition, and we get back to you right away. About half the time there are a few comments, and about half the time it's registered right away. The last thing you'll see is the ability to make your registration public immediately, or you'll have the ability to enter it into an embargo period, so you can make it private for up to four years, after which it will become public. 
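Pre-specifying a constructed variable can be as simple as writing down the formula before data collection. Here's a sketch using the two examples just mentioned, a happiness score averaged over ten survey items and Shannon's diversity index; the sample numbers are made up for illustration.

```python
# Two hypothetical constructed variables, defined up front so the
# pre-registration pins down exactly how they are computed.
import math

def happiness_score(item_responses):
    """Mean of the ten 1-7 survey items that make up the index."""
    return sum(item_responses) / len(item_responses)

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over species counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

print(happiness_score([5, 6, 4, 7, 5, 6, 5, 4, 6, 5]))  # 5.3
print(round(shannon_index([10, 10, 10, 10]), 4))         # ln(4) ≈ 1.3863
```

Registering the exact formula (and any exclusions, like zero counts) removes one whole branch of the garden of forking paths.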
When you have your registration on the OSF, there are a couple of things to note about it. One is that you'll have a new URL; your project will have this short osf.io slash a couple of numbers and letters, and that's a persistent link that you can use to cite it. And that will stay forever. I'll get to withdrawing registrations in a minute, but even if you end up withdrawing your registration for some reason, that link will always resolve. So it is a persistent link once you have a registration. And then the form that you filled out, with all the nitty gritty details about your research plan, is right here. This is listed under the Prereg Challenge, and here's just an example of what it looks like. And you might be able to see in the background a slight read-only watermark, as a reminder that this is a registration that cannot be edited. All right, so when it does come time to write up your article, there are a couple of main points to remember. One, make sure you include a link to your registration, because that link, as I said, is a persistent, citable link; you can create a DOI that will include that information, and that's one of the capabilities on the OSF. You do have to report the results of all registered analyses, so if you said you're going to run 100 t-tests, you have to report the results of all 100 t-tests. Whatever you registered has to be written up, regardless of whether or not the results are significant. And then anything that was unregistered has to be marked as such; you have to transparently state that something else was added to the study for this particular data collection effort. There are different ways to say that: you could say unregistered, you could say exploratory, you could say hypothesis-generating analysis, but it just has to be somehow distinct that these other analyses were not part of the registration. So here's an example of a recently published 
study that was pre-registered on the OSF; this is one of the first that are going to be eligible for the $1,000 prizes. And this was published in Royal Society Open Science. It was a pre-registered direct replication of a study published in 2013, and you can see in the results that the authors describe their registered analyses and the results of those registered analyses. They go through all of those different analyses and the results of all of them, and they have a section of unregistered, exploratory analyses, so this makes very clear what types of analyses they thought of after looking through the data. And then the reader, the editors, the peer reviewers are left to weigh the relative evidence of registered and unregistered analyses as they see fit. A common question that we get when people submit a pre-registration is, what happens if I need to change my research plan? Well, there are a couple of different ways to answer that. If you want to change your plans as you see the incoming results, that kind of breaks the rules of registration. So basically what you need to do there is report what you said you were going to do, and anything else that looks more interesting or more tantalizing, feel free to report on that, but just make sure it's distinct and noted that it's not registered. If you need to make changes before seeing the data, there is a possibility, in the Open Science Framework, to withdraw a registration and create a new plan, but that's only valid, again, if you haven't seen the data yet or haven't started data collection. So that is a possibility, but only under certain circumstances. If you have questions as you're going through the research plan and you want to change something, or something looks wonky that needs addressing, feel free to email us. All the correspondence, once you enter the competition, goes to prereg@cos.io. 
I'll mention what a registered report is in a few minutes, but if you have been working with a journal and they have provisionally accepted your research plan for publication, that's called a registered report, and there you basically need to contact the editor and ask whether the change would affect the results or their decision. There are times when it might affect the decision: if you said you were going to collect 100 participants and only collected 50, they might come back and say no, you need to go back and get the full sample size. If you said you were going to sample fish at 100 meters and there were no fish at 100 meters, so you had to go to 50 meters, that might be a justifiable change that doesn't affect the inferential value of your statistical tests. So those are examples where editors have come back and said yes or no, you may or may not change your research plan. But the editor, or perhaps the peer reviewers, will be the ones to answer that question. One common fear that comes up frequently is, will this make my work less publishable? What a pre-registration does do is make presenting any spurious findings as rigorous confirmatory tests harder. So if you've got spurious findings, it is hard to present them as rigorous confirmatory tests with a registration, and in that sense, perhaps it can make the work less publishable. However, there are strong counterarguments: a registration is an extremely strong indicator of rigor for any confirmatory tests. And a major benefit is that it does help highlight those unexpected, new, testable hypotheses, to you for the follow-up study, or to future researchers who are working in the field. Furthermore, we expect that work that's not pre-registered will soon be viewed, rightly so, much more skeptically. 
So the risk is a little bit hard to quantify: is it riskier to register or not to register? And why not let the tiebreaker be erring on the side of rigor and better science? So those are some of the issues that arise when you're wondering about whether or not this is a risky proposition. Does pre-registration work? Well, yes, there are a couple of key indicators that pre-registration is effective. This one particular study looked at the Time-sharing Experiments for the Social Sciences, the TESS registry. This organization accepts questionnaires, surveys, from social science researchers, and it fields them using its large pool of respondents to measure the responses on those questionnaires. And so this created a registry, basically, of all the conducted work, and what these researchers did was compare what was submitted to TESS in the questionnaires, that's on the horizontal axis, to what was actually reported in the resulting research articles. If every article reported the results of every outcome variable in its questionnaire, they would all fall along this line. If any articles happened to have added outcome variables, those would be dots represented above this line. Fortunately, that didn't happen in this case. Any articles that didn't report the results of everything that was in the questionnaire show up below this line. So right away you can see that not everything that was submitted to TESS ended up in the published literature. That could be fine; it could be an unbiased sample of what's getting out there in the literature. But fortunately the researchers looked for that: what was reported had a median p-value of 0.02, statistically significant, with a median effect size of 0.29, and 62% of the reported results were statistically significant. Compare that to what was unreported. 
The majority of tests were unreported, and there the median p-value was much higher, 0.35, the median effect size was about half as large, and only about 25% were statistically significant. So you can see that there is a strong bias in what was presented in the published literature, and therefore the published literature does not represent the full scale of reality compared to what was actually conducted. Another example of registration working: in 2000, registration became required by law for large clinical studies. These are basically pharmaceutical companies running clinical trials, and in this case it was cardiac studies looking for effects of different treatments or different drugs. The vertical axis is the relative risk. That gray line at one shows that there's no difference: the relative risk with or without treatment is the same. Anything above that line means the relative risk is higher with the treatment, so it's a little bit more dangerous. Anything below that line means less danger, less risk, some sort of beneficial effect from the treatment. And you can see that before registration was required by law for these types of studies, 57% of the outcomes were statistically significant positive results. And after registration became required by law, only 8%, only two of the studies conducted in the later time period, were statistically significant and showed a positive effect. So it shows a marked difference in what was reported before and after this went into effect. Again, registration is not required by law for most research, but it is in the clinical sciences. A common fear about registration is that somebody will take your ideas and scoop you, take your research plans, use them themselves, and beat you to the finish line. Perhaps that could happen, but a pre-registration does protect you in several ways. 
It is a date-stamped documentation of when your claim was made and when your research plan was created. By the time you've created that pre-registration, you're already basically ready to start data collection, so you're ahead of anybody else who might be looking through the registries for a good idea to use in their own research. And then finally, we do offer up to four years of embargo. So, you know, if you have a four-year head start on anybody else, that's pretty solid assurance that your ideas won't get scooped in a reasonable time frame. And roughly 40% or so of people who are entering the Pre-registration Challenge choose to make their pre-registrations public immediately. So that's a little bit higher than I expected initially. I don't know what my expectation was; I should have pre-registered my expectation. But that's pretty high, I thought. Something that comes up frequently when we discuss pre-registration is the fear that it's easy to cheat. So it is, theoretically, easy to cheat and make your work seem more rigorous than it is. You could make a "pre-registration," in quotes, after you conduct the study. So conduct the study, do a whole bunch of p-hacking, all those data exploration steps, and once you find something significant, make that a pre-registration and then present it as evidence that you found something rigorous. Another way to cheat would be to make dozens of different registrations, all with slight variations, and only point to the one that worked. Well, the answer to both of those is that yes, you could do that. However, those steps make committing fraud much harder to do. And even more importantly, they make committing fraud much more intentional. Most of what I've been talking about, most of the rationale for these open science practices, isn't to prevent fraud, because there's always going to be a way to lie, to commit fraud. 
Most of the rationale for open science practices is to help you keep yourself honest, to make sure you don't fool yourself with those subtle biases as you're analyzing a data set. An unexpected benefit is that it makes fraud more explicit. P-hacking by itself isn't really deliberate deception, because even honest researchers are affected by it; but if you really want to cheat the system and fake a pre-registration, that clearly crosses the line. It makes it pretty obvious that you have mal-intent. So there is that side benefit, but again, the primary benefit is that it helps you keep your own biases in check, because we all face them. The final point I'd like to make is an extension of the pre-registration process called registered reports. Everything we've talked about so far can be done in a very typical research cycle: you conduct your pre-registered study as you see fit, you submit it to a journal, you make clear what was registered and what was not, and then you hope for the best. With a registered report, you submit the pre-registration to a journal. You basically have an introduction and a method section, justifying the importance of the research question, why this question has to be answered, and justifying your proposed methods and their ability to address that question. The peer review process evaluates those, and it will be very critical in making sure that any null results will be interpretable. It's more likely to include manipulation checks, some sort of verification that your experiment worked, even if you didn't get significant findings.
Those types of checks are much more likely to happen in that first stage of peer review than would normally happen, because if a proposal passes stage one of peer review, the journal is obligated to publish it regardless of outcome. So they want to make sure the research will be conducted as rigorously as possible and that any null results are true negatives, not just some failure of the experiment or manipulation. Again, that first stage of peer review evaluates the necessity of answering the question and the ability of the proposed methods to address it, and accepted proposals are guaranteed publication regardless of outcome. Right now there are 43 journals accepting registered reports; you can learn more at cos.io/rr. They span the sciences fairly broadly. I showed an example a few minutes ago: Royal Society Open Science conducts registered reports across a very wide spectrum of scientific disciplines. A lot of the other journals are in psychology and social psychology, a couple in neuroscience, a couple of infancy journals; there's a pretty wide range. The largest recent cohort, as I mentioned, is in political science, with the upcoming American National Election Survey data set that will be released this coming April. With that, I would like to simply say thank you, and I'd be happy to answer any questions. You can use the Q&A feature at the top of your window, I think, or the chat room, and Courtney and I will be happy to answer whatever comes through. There's a whole bunch of questions coming in now. Oh good, awesome questions. Okay, I'll answer them as I see fit; Courtney, if you see any that are right up your alley, feel free to tag them. I'll just start from the top here. Any tips for working with students on pre-registering their projects?
So this entire process was basically created as an education campaign, and it really lends itself well to students. The most similar analogy to this type of pre-registration that most people are familiar with is masters and PhD proposals, because those tend to have more detail than grant proposals or other research plans. The process lends itself very well to registration as long as there are enough of those analysis details; submitting any sort of thesis proposal as a registration is a very good idea. It helps instill best practices early on, and it helps surface those design issues that are very likely to come up as you start thinking more concretely about the data analysis. So I don't have any specific tips beyond saying yes, you should do it, but it does lend itself well to that. Can an embargoed registration be un-embargoed, for example if you submit it as a preprint? Yes. If you create an embargo for four years and you're ready to unveil it after one, you can make it public sooner than you expected: when you go back to the OSF project, there will be a button to make it public. So yes. How do you see the process of proposing and creating research changing in the next five years? One of the things we're working hard on is expanding that list of registered reports journals; there are currently 43. We are also working with some funding agencies on a joint process: not only submitting to a journal, but submitting a funding request for the proposed study at the same time as you submit it as a registered report for consideration. I think those types of trials will grow over the next five years, and there will be more examples of journals that accept registered reports, so in-principle acceptance before results are known.
I know for a fact that at least one funding agency, and there's interest from several others, is partnering with these journals to grant funding to an accepted registered report. There is a complication in that workflow, because both the funding agency and the journal have to agree that the study is worth funding and publishing, but the rationale is the same as for a registered report, so we do expect that to become an option for more researchers in the coming years. As for the Prereg Challenge, we have about 500 registrations that have been accepted for the competition, and about another 500 that have registered their research plans on the OSF but not entered the competition. There are other ways to register on the OSF too. You can just write down your answers in a Word document, upload it to the OSF, and create what we call an open-ended registration that doesn't have that long form associated with it. We also have a form similar to the AsPredicted.org registration system, and a couple of other research forms on the OSF to guide people through different types of registrations. The Prereg Challenge has only been live for a year, and these other ways of registering have been around for varying amounts of time, so there are several thousand other registrations on the OSF that may or may not qualify as a pre-registration, that is, one created before results are known. In terms of uploading, is there a content or file size limit? I think five gigabytes per file is the upload limit. There's no limit to how much content you can store on the OSF, but I believe there is another five-gigabyte limit on the amount of content that can be included in a registration.
So roughly five gigabytes per registration, but there's no preset limit to how much storage you can use on the OSF overall. That five-gigabyte limit is basically a technical limitation; it becomes much harder to deal with files larger than that. How specific do the analysis plans need to be? For example, do you need to specify the survey items, manipulations, transformations, etc.? The answer generally is: quite specific. You don't have to give the entire survey, but you do have to describe how the variables will be created from any survey you use. Let's say you have 100 questions on your survey, and one of your IVs is going to be the average of a subset of those, measured on a Likert scale (or "Lickert," depending on who's pronouncing it). You have to describe exactly how you would go from the survey to the variables used in your analysis. You don't have to include the full survey, although a lot of people do end up attaching the survey document to their registration; if you describe the exact process you'll use, that's enough. As for manipulations, you should include manipulation checks, and if you forget to, you can describe them as unregistered when you write up your final report. I think Courtney will have more to say on the important things to remember in the analysis plan. Yeah, so to follow up on what David said: the Prereg Challenge does require quite a specific pre-registration. However, pre-registration, like many things, is a continuum. The more you pre-specify, the more you cut down on those researcher degrees of freedom, and the more confirmatory your analyses are. That said, if you're pre-registering for the first time, this is something that is new to a lot of researchers.
And so, just like any new habit or skill, trying to do a perfect pre-registration on your first attempt may be a pretty high hurdle. I know many researchers who are just trying pre-registration for the first time. Some of them go in and try the Prereg Challenge, even though it requires more specifics, because there is support around it. Other researchers try a simple pre-registration to start with: specifying just their intended sample size, maybe the broad statistical tests they're going to run, their IVs and their DVs, but not the more specific things like transformations or follow-up tests. Even though they haven't done an incredibly detailed pre-registration, making those few choices up front does decrease the researcher degrees of freedom they have for that particular study. So I would say think of it as a continuum, with the more you specify leading to fewer potential degrees of freedom. Yeah, and the more you specify, the more rigor you add, so there's a relationship there. Is it typical to submit a registered report at the same time as you pre-register on the OSF? If you would like to submit a registered report to a journal for in-principle acceptance, you can do a couple of different things, but one decent workflow is to submit to the journal first, because the peer reviewers will have very substantial comments; they give expert content advice on what you're going to be doing and the validity of the research questions, and the changes that come out of that are likely to be substantial. Once you have that in-principle acceptance, then you can create a pre-registration. If you want to enter the competition, just attach the document that was accepted in principle; for all the other fields, you can simply say "see attached document." That's okay.
And then pre-register at that point. If you create a pre-registration first and then go to a journal, you're likely to need an update to your plan. That's fine: you just create a new pre-registration based on the feedback from peer review, and what that does is create a trail of the evolution of your research plan as it goes through the different stages of review. There's no ethical problem with doing that; it's just a couple of extra steps to take into account. But if you're thinking of submitting to a journal that conducts registered reports, I suggest you go through them first. All right. Somebody asked: if p values are meaningless for exploratory research, what statistical evidence can be used for exploratory research? I'll answer this in two ways. For some types of exploratory research, there actually is a way to adjust the p value to take the exploration into account. For example, some of you may be familiar with post hoc comparisons: you've run an ANOVA, and after seeing the data you decide what comparisons you want to make. We have ways to adjust the p value for the fact that those comparisons were chosen post hoc; usually you have to adjust for all possible comparisons, whereas if you had specified the comparisons a priori, you would only have to adjust for the number you had planned to do. The reason that's a special case is that you know how many possible things you could have looked at. With most exploratory analyses, though, it becomes difficult to quantify how many possible comparisons or different models could have been run, when you don't know how many variables somebody has or how many transformations they could have done.
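The a priori versus post hoc contrast above can be made concrete with a Bonferroni correction, one of several standard adjustments (Tukey's HSD and Scheffé's method are others). This is a minimal sketch, not anyone's official procedure; the p value and group count are invented for illustration:

```python
from math import comb  # number of possible pairwise comparisons

def bonferroni(pvals, n_comparisons=None):
    """Bonferroni adjustment: multiply each p value by the number of
    comparisons being corrected for, capping the result at 1."""
    m = n_comparisons if n_comparisons is not None else len(pvals)
    return [min(1.0, p * m) for p in pvals]

observed = [0.004]  # raw p for one pairwise contrast in a 5-group design

# Planned a priori: correct only for the 3 comparisons you committed to.
planned = bonferroni(observed, n_comparisons=3)

# Post hoc: correct for all C(5, 2) = 10 possible pairwise comparisons.
post_hoc = bonferroni(observed, n_comparisons=comb(5, 2))
```

The same raw p value survives a threshold of .05 either way here, but the post hoc penalty is more than three times larger, which is exactly the cost of deciding the comparisons after seeing the data.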
And that's why it becomes difficult to interpret the p values: you don't know how much the false positive rate was potentially inflated, so you don't know how much to correct for it. In those cases, what can be useful to look at are things like the direction of the effect and potentially the effect size. Just get an idea of: is there potentially something there? What direction is it going in? Does that match up with theories that are out there, or if it's atheoretical work, does it make sense on its face? There are also things you can do, for example resampling analyses like k-fold approaches, where you test how many different ways you can slice the data and whether the effect keeps showing up in those different slices. It's not quite like the holdout sample David was talking about, because you are using the same pieces of the data in your training and your test sets; you're just holding out some of them on each go-around. But it can be a good option, if you don't have the option of a true holdout sample, for seeing how robust your results are across different facets of your data. Hopefully that made sense. Next one: would grant funders fund pre-registered replications of non-registered exploratory studies? Two answers to that. Right now, most granting agencies are not funding pre-registered replications of any work. There are ongoing discussions within funding agencies about the necessity and value of that and how best to go about it. I think the National Science Foundation equivalent in the Netherlands is funding some replication studies, but most agencies are not. And a little cynical point: you mentioned non-registered exploratory studies.
I would cynically say that a lot of the work that's out there, even when it's presented as confirmatory evidence, is essentially non-registered exploratory findings. The Reproducibility Project in psychology demonstrated that a lot of published work is very difficult to reproduce, and there are ongoing reproducibility projects in cancer biology and the social sciences; I won't spoil the news, but it's also hard to replicate work in many other fields, because there is a lot of exploratory work being presented as confirmatory work. So right now the short answer is no, but it's an advocacy and education campaign that the community is working on. Is pre-registration open for researchers from developing countries, especially from Africa? Yes, it definitely is. We want this to be a worldwide competition. We do have some limitations on the number of countries we can send awards to, due to the complications of running an international competition; there are 15 or so countries where you can't participate in the competition, but the tool is available to researchers worldwide. If you're in one of those countries we can't send money to, I'd be happy to work with you in other ways to get involved with open science. The short answer is yes, it's open to everybody; everything on the OSF is available to anybody across the world. What's the most appropriate time span to keep the project in the embargo period? Well, the maximum is four years, and the minimum is obviously zero, public immediately. In between, it's really up to you. If you want to keep it private while you're doing the study, I'd recommend estimating how long it will take to collect the data and write it up, then tacking on 20% and making that your embargo period. Otherwise it's completely up to you.
So far we have uploaded Word documents with all the necessary prereg info to the OSF and made them public; is there a downside to this approach compared to going through the successive steps on the OSF? No, essentially they reach the same goals. The purpose of the steps I showed was to demonstrate creating a pre-registration for somebody who doesn't know what needs to be included, and to guide them through the workflow. There's no substantive difference between that and uploading a Word document to the OSF and then registering your project; either way you create that frozen, read-only version of your project. There's no scientific or statistical rationale for going one way or the other. We do want to make the competition available to everybody, so even if you're uploading a Word document, you can still fill out the form and just write "see Word doc" for the different answers. That gets it into our administrative back-end portal so we can make sure you're eligible for the competition, because we do want to give out those $1,000 prizes. So I encourage you to use the form even if you have a workflow that already works for you. When you create a registration on the OSF, you'll have different options: using the form, or an open-ended registration. The only way to get one of those prizes is to use the Prereg Challenge form, but otherwise you can create a pre-registration in any way you see fit; there are many different ways, which I didn't go into today. Oh yes, the Italian constitutional referendum pre-acceptance competition. Let me show what this is; I can't copy and paste, so just Google "Italian research pre-acceptance competition." There it is. Let me see if I can get this link out to everybody in the chat.
This uses the Prereg Challenge form, so you'll be eligible for the prizes based on that, but you'll also be using a forthcoming data set based on survey data taken immediately before and after the recent Italian referendum. Let me put the link into the chat window; hopefully everyone saw that. So if you are interested in Italian political science, use that. The date for that registration competition is February 8th; that's when the data from that particular data set will be released. The American National Election Survey data are expected in early or mid-April. Let me give that link too; I'm having trouble with a typo in the chat box here, but it's erpc2016.com. What can I do if someone scoops my project? If you see something you know was your analysis and your specific research plan showing up later, and you have strong evidence it was legitimately scooped, not just somebody working on a similar question in a similar field, then the editor who published that work should be made aware of what ideas might have been stolen. You have strong evidence that your research plans were created on a specific date, via the link you were provided on the OSF. If it looks like there was plagiarism, or a very specific analysis was taken and used without your knowledge, that becomes an ethical matter in which the editor, the peer reviewers, and the institution's office of sponsored research might get involved. Those are the people who have a say on the ethical compliance of, basically, stealing research plans. Are there any best practices for registration, any guides for people getting started? Yeah, I think going through the form at cos.io/prereg is the most practical step-by-step process.
The biggest questions in creating a registration are what needs to be included in the registration and what needs to be included in the final article, and that just takes experience. The step-by-step walkthrough on the OSF, which you can get to from cos.io/prereg, is the most practical guide I have for somebody getting started. Are there accepted techniques for pre-registering studies without needing to trust the registry? Courtney may have an opinion on that. Yeah, so if I interpret this question right, I think it's asking whether you can pre-register by publicly stating this information a priori without putting it into a registry run by a third party. If that's not the question, please follow up. The point of a pre-registration is to make the researcher accountable to what they planned to do beforehand, and to make it so that reviewers and readers can go and check this information. So you want that information to be someplace that is always going to be stably accessible, and where, if it's taken down, there's a notice that something was there and was removed. For example, you could put a pre-registration on your personal website, but that wouldn't typically be considered a true, stable pre-registration, because you might take your website down all of a sudden, or move schools and forget to transfer things. Best practice really is to put it in some sort of registry, whether that's the OSF, or something like clinicaltrials.gov if you're in the U.S. doing clinical work, or one of the registries that other specific groups run. That's why you typically want to put pre-registrations in these trusted repositories.
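The "without needing to trust the registry" idea above is essentially a commitment scheme, and one way to see the property a registry provides is with a cryptographic hash: publish a digest of your analysis plan anywhere date-stamped and public, and later anyone can verify that the plan you reveal matches it. This is only an illustration of the tamper-evidence property, not how OSF registrations actually work (the OSF hosts a frozen, read-only copy of the plan itself); the plan text is invented:

```python
import hashlib

def commit(plan_text):
    """SHA-256 digest of a pre-analysis plan. Publishing this digest
    commits you to the plan's exact wording without revealing it."""
    return hashlib.sha256(plan_text.encode("utf-8")).hexdigest()

def verify(plan_text, published_digest):
    """Check that a later-revealed plan matches the earlier commitment."""
    return commit(plan_text) == published_digest

plan = "H1: X correlates with Y. N = 200. Two-tailed alpha = .05."
digest = commit(plan)               # publish this string with a date stamp
print(verify(plan, digest))         # the original plan verifies
print(verify(plan + " (revised)", digest))  # any edit breaks the match
```

Even one changed character produces a completely different digest, which is why a stable, third-party-hosted record of either the plan or its hash makes after-the-fact edits detectable.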
If I pre-register my study now, having just finished writing my protocol, and plan to begin testing within a couple of months, do I qualify for the competition? Yes. How do you determine who wins? We've got a preset set of dates for reviewing who has published their research. Basically, when your study has been accepted for publication, send us an email, and if you qualify, you'll be sent a thousand-dollar check at the next award date; every six months we'll be sending out prizes. The competition runs until the end of 2018, so there are a couple more years left, and we wanted to make it fair for everybody, even those conducting fairly long studies. We reserve most of the awards for the end of the competition, so that people conducting longer studies, or facing longer peer review processes, which isn't up to them, obviously, are treated fairly. So please do enter; we actively want people to enter this. Is it worth pre-registering exploratory research? Yes. I think Courtney has a strong opinion on this too. So when you think about a pre-registration, it has two parts and serves two purposes. One is the explanation of what the study is, what variables are going to be collected, and what the study design is. The other is the analysis plan, which is really there to decrease those researcher degrees of freedom. By pre-registering the design itself and putting it in a repository that will eventually become public, you help decrease the file drawer effect, the effect where studies that aren't published are really difficult to discover and find.
So even if work is completely exploratory, it can be useful to pre-register the design of the study, so that down the line other researchers can figure out the work was done, even if it never gets published. If I see in a registry a couple of years from now that 20 different research teams did exploratory work looking at some particular group of variables and none of them got published, that might be useful to know if I'm thinking about starting research in that area. Additionally, if you say, here's my study, here are the variables I'm going to collect, this is completely exploratory, then when you go to publish you'll have that pre-registration that says: remember, everything I planned to do was exploratory. That serves as a memory check for you when you're analyzing your own data. Have there been issues publishing pre-registered work in terms of journals' copyright or prior-publication rules, since something in the registration is publicly available already? I know that's a common fear with preprints, for example, where the work has already been put out there and a journal might refuse to publish it because it has been in the public domain. I've never heard of that happening with pre-registration. In my opinion it would be a gross misuse of those rules, because the point of a registration is not to disseminate information widely in a community, which is the point of preprints and publications, but rather to make your plans for the study explicit in advance. That's coming from somebody who is not a lawyer, so I can't promise it would never happen, but I've never heard of it happening. What are your recommendations for dealing with advisors who won't involve themselves with pre-registration? Courtney?
So there are a couple of ways you can handle this, depending on how uninvolved your advisor wants to be. If it's the sort of thing where your advisor says, I don't want to deal with the process of pre-registering, but I don't care if you do, then you can put together a pre-registration yourself for the challenge and just not force them to review it. Some advisors are actively against the process of pre-registration, in which case they might be very against you submitting something to a public, or eventually public, registration repository. That becomes more difficult, because as a grad student or postdoc it puts you in an awkward position: you want to engage in these behaviors, but doing so could make your relationship with your advisor a little bit difficult, a little bit testy. In that case, one thing you can do is a private pre-registration: write out your pre-analysis plan on your own computer or hard drive. If you're not uploading it to a repository, that won't help other people verify that what you claim is confirmatory or exploratory really is, but it will at least serve as a memory check for yourself. A lot of these researcher-degrees-of-freedom or p-hacking behaviors happen because it's really easy to trick ourselves; we have memory issues, we have confirmation biases. If you write these things down and keep them for yourself, that can be a very small step. It won't go as far as a public repository, because nobody else can verify it, but it is a middle ground if you want to engage in this behavior and your advisor is actively against putting it in a public repository.
I should mention we are over time. I'll stay until the cows come home answering questions, so feel free to leave when you need to, but I'll keep going through these. On the side of the review process: besides journals, is there any interest on the part of the OSF in working with conference organizers and peer review committees toward codifying different types of peer review practices, such as double-blind or single-blind review, which could address biases against incoming members or biases such as sexism in research? We don't have a strong policy statement on the costs and benefits of double-blind versus single-blind review. We think it's a good thing to explore and test, but it isn't something we have yet come to consensus on as the best way to address these biases in the peer review process. Our strongest policy document is the TOP Guidelines; if you want to take a look, they're at cos.io/top, and they lay out our policy positions on data sharing, registration, replications, and registered reports. There are ongoing discussions about ways to make the peer review process more fair and transparent, and to make preprint policy more fair and transparent, but those are ongoing, and we're waiting to see what the most effective methods of adding rigor to science are. That's basically all I know about that right now, so I'm going to skip ahead. I think Courtney already answered the two questions about registering a pilot study as exploratory research and then registering the main confirmatory study afterward; yes, there is benefit to that, but we've covered most of it. What changes in the peer review process when you have Bayesian statistical analyses? Courtney may have more to add, but basically the registration process is the same.
You specify in advance how you're going to collect and analyze the data. Bayesian inferences are more robust to some of these biases, but not to all of them; it is possible to selectively report or selectively analyze data in a Bayesian framework too. Courtney, do you have anything to add? Yeah, so much of the information you would include in the pre-registration is going to be the same: the general design of the study, what variables are going to be manipulated and collected. But one thing you might see in a pre-registration that uses Bayesian analyses or Bayes factor analyses is information about what prior distributions are going to be set. If you're computing Bayes factors, what cutoff criteria are you going to use for determining no, weak, moderate, or strong evidence for or against the null? Or, if you plan to run a couple of different prior distributions to see how robust the effects are to the choice of prior, you might pre-specify that, so it's clear there was full reporting of all the priors that were chosen. So much of the information is the same, but there are some differences in what you'd include because of the different information needed to specify Bayesian analyses. The competition will be live until the end of 2018, so over the next couple of years, submit your pre-registrations if you want a reasonable chance at one of the $1,000 prizes. All the infrastructure on the OSF that we're using now will remain there, and we're expanding the registration system, so over the next couple of months look out for new ways to register. That infrastructure will be there permanently, but the prize competition period ends at the end of 2018.
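Courtney's suggestion above, pre-specifying a range of priors and reporting how the estimate moves across them, can be sketched with the simplest conjugate case: a normal mean with known sigma and a normal prior. All the numbers here (sample mean 0.4, n = 50, prior SDs 0.5, 1.0, 5.0) are invented for illustration:

```python
def posterior_mean(data_mean, n, sigma, prior_mean, prior_sd):
    """Posterior mean of a normal mean with known sigma and a
    Normal(prior_mean, prior_sd) prior: a precision-weighted average
    of the prior mean and the sample mean (conjugate update)."""
    prec_prior = 1.0 / prior_sd ** 2   # prior precision
    prec_data = n / sigma ** 2         # data precision
    return (prec_prior * prior_mean + prec_data * data_mean) / (prec_prior + prec_data)

# Pre-specified prior widths to report: if the data dominate, the
# estimate should not swing wildly across them.
for prior_sd in (0.5, 1.0, 5.0):
    est = posterior_mean(data_mean=0.4, n=50, sigma=1.0,
                         prior_mean=0.0, prior_sd=prior_sd)
    print(f"prior sd {prior_sd}: posterior mean {est:.3f}")
```

A tighter prior centered at zero shrinks the estimate toward zero and a wider one leaves it near the sample mean; committing to the set of priors in advance makes it clear that all of them, not just the most flattering one, will be reported.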
If you make your own pre-registration document and set it to private on the OSF, will it be subject to the embargo period? If you don't register and just have a private project on the OSF, that never becomes public unless you say it should become public. Everything you do on the OSF has a log of when things were added and changed, so you can share that when and if you want, but it's not a proper registration because it doesn't ever have to become public.

We will email everybody who registered for this webinar to make sure you get a link to the recorded version, so check your inbox for that. Additionally, all our webinars are recorded and uploaded to our YouTube channel, so if you ever lose the link, you can go to YouTube and look for the Center for Open Science's channel. Give us about a week or so to edit and upload the video, but if you don't see it after that, feel free to bug us about it.

Will the OSF remain free of charge permanently, or will there be fees down the road? We have no plans for any fees down the road. We're a nonprofit technology organization supported by private and public foundations. Our long-term plans involve some sort of shared governance model, where we hope organizations will support the infrastructure we're building and the policies we're promoting, but there are no plans for user fees. I should also note that if we do, unfortunately, run out of money next week and have to turn off the lights, there is an endowment to make sure everything that is persistent on the OSF stays there. Even if we do run out of money, there is money to support that persistent copy of the OSF.

If I don't belong to a university, can I participate? Yes; there are no credentials required. You just have to have an OSF account and be able to get the research published in a journal. That's the requirement.
I do not have information on the competition in Spanish.

Do you have a template to fill in, or a format for the initial work to begin the study? If you go to the Prereg Challenge page, cos.io/prereg, there are templates covering the steps to take to earn the prize. If you prefer to design your research offline, there are Google Doc and Word doc templates where you can answer all the questions, and then once you're ready to use the OSF, you can copy and paste as you wish. So yes, there are templates available at cos.io/prereg.

Do you offer a basic course on beginning the process, or on best practices for getting better at it? All of the training materials are found on the OSF, in a training resources project that Courtney maintains; I will share that link. There aren't currently any training materials specifically for pre-registration outside of the template itself, which walks you through the process. But if you have particular questions, if you're trying to do a pre-registration for a study, you can always email us at contact@cos.io and we'll be happy to answer them. And here's a page with links to a lot of our training resources, cos.io slash stats consulting; it's under Services on the COS page.

Do you have an FAQ about this? Yep, it's at cos.io/prereg, under FAQ.

Do you have staff who could make something like this available in Spanish? No. But if you know of anybody who would be interested in conducting a Spanish-language webinar, email us at contact@cos.io; I think that's a worthy endeavor, and we could work with you to figure out a way to make it a reality. So we do not have anybody, but that's something we could work with you on.
Could you send a link with information about the legal and copyright issues and the eligibility criteria? At cos.io/prereg, the link for eligibility criteria covers the big points, along with a link to the complete terms and conditions for the competition, with the rules and requirements for each step along the process. Again, that's accessible from cos.io/prereg.

Do you offer a step-by-step, bachelor's-level introduction to the $1,000,000 Preregistration Challenge? This webinar will be recorded and posted for you to use, and I think that, along with the registration form itself, are the two best bachelor's-level resources.

Is the OSF open source? Yes, everything on the OSF is open source. You can find it at, let's see, github.com slash, I think it is, CenterForOpenScience slash osf.io. So all the code that goes into the Open Science Framework is free and open source. If there are things in there that you think should be improved, if you're a developer and see anything, you are welcome and encouraged to contribute; we do occasionally have non-employees take a look and suggest improvements or new ways to add to different parts of the code base. So we do encourage and welcome that.

Alright, that was exciting. I think I've answered all the questions. If any more questions come up, feel free to email us; we're happy to answer more questions over Twitter or email. Everything for getting in contact with us is basically right here: cos.io/prereg.