Hi, thanks for joining us for ESMARConf 2023. My name is James Pustejovsky, and together with my colleague Megha Joshi, I'd like to share one piece of one of our ongoing projects: developing new tools for investigating selective reporting in meta-analyses of dependent effect sizes. The method we'll be demonstrating today is a cluster-level bootstrap for a Vevea-Hedges-type selection model.

By selective reporting, we mean the phenomenon where statistically significant, affirmative results are more likely to be reported, and therefore more likely to be available for meta-analysis, than results that are not statistically significant or not consistent with theoretical expectations. This happens as a result of biases in the publication process on the part of journals, editors, and reviewers, as well as strategic decisions on the part of authors. Selective reporting is a big deal. It's a major concern for research synthesis because it distorts the evidence base available for meta-analysis, rather like a funhouse mirror distorts your appearance. It leads to upward bias in estimates of average effect sizes and complex biases in estimates of heterogeneity, all of which makes it that much more difficult to draw conclusions from a synthesis.

Now, a meta-analyst might say: we've already got tons of tools available for investigating selective reporting. Why do we need more? We've got graphical diagnostics like funnel plots, tests and adjustment methods like PET-PEESE and selection models, and p-value diagnostics like p-curve and p-uniform. The problem is that very few of these methods have been extended to handle dependent effect sizes, which are a really common feature of meta-analytic data. Dependent effect sizes crop up all over the place: for instance, when primary studies report results on multiple measures of an outcome construct, measure effects at multiple time points, or involve multiple treatment groups compared to a common control group. They also come up in meta-analyses of correlational effect sizes, where more than one correlation coefficient may be drawn from the same sample. If you've done meta-analysis work in education, psychology, or other social science fields, you probably recognize that dependent effect sizes are really, really common.

Although we have good methods available for handling this sort of dependency when conducting summary meta-analyses or meta-regressions, there are very few methods for investigating selective reporting that can accommodate dependent effect sizes. What's more, if you use existing tools that don't account for dependency, you can get misleading results, like too-narrow confidence intervals and hypothesis tests with inflated Type I error rates. So we want to explore a rough-and-ready, pragmatic strategy for investigating selective reporting while also dealing with dependent effect sizes. Our thought is to fit a regular selection model, as implemented in the metafor package, and then use a cluster-level bootstrap to account for the dependency, using standard methods implemented in the boot package.

To demonstrate the method, we'll use data from a recent meta-analysis by Lehmann and colleagues that looked at the effect of the color red on attractiveness judgments. The data include 81 effect sizes from 41 studies, so we've got effect size dependency issues to deal with. Here's a funnel plot of the data.
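(As a minimal sketch of how this funnel plot, and the basic meta-analysis reported next, could be produced — not the presenters' exact code. It assumes the Lehmann data sit in a data frame `lehmann_dat` with effect size estimates `yi` and sampling variances `vi`; these column names are hypothetical.)

```r
# Minimal sketch; `lehmann_dat`, `yi`, and `vi` are assumed names
library(metafor)

# Basic random-effects meta-analysis, ignoring the dependence for now
RE_fit <- rma.uni(yi = yi, vi = vi, data = lehmann_dat)
summary(RE_fit)  # average effect size and heterogeneity estimates

funnel(RE_fit)   # funnel plot for eyeballing asymmetry
```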
A basic random-effects meta-analysis indicates an average effect of about 0.20 standard deviations and substantial heterogeneity of about 0.32. The funnel plot definitely has some asymmetry to it, so you might well be concerned about selective reporting bias with these data.

Hi, I'm Megha, and I'm going to go over how to cluster-bootstrap selection models. To implement a cluster bootstrap using the boot package, we need a function to fit the selection model which takes in a dataset with one row per cluster. The function also has to have an index argument, which is a vector of row indexes used to create the bootstrap sample. And then it can include any further arguments.

Our dataset has one row per effect size and potentially multiple rows per cluster. There are at least two ways to turn this into a dataset with one row per cluster. We can use the group_by() and summarize() functions to create a dataset with just the cluster-level IDs, and then merge it with the full data by study to get the effect-size-level data. Alternatively, we can use the nest_by() function to nest the data by cluster, and then use the unnest() function to recover the effect-size-level data.

Here's a function to run a selection model. Inside the function, we first fit an rma.uni() meta-regression model. We then use the selmodel() function from metafor to fit a selection model. We don't need standard errors to be calculated in this step, so we're skipping those calculations here, which speeds things up. We then compile the parameter estimates as a single vector. Further, we use the possibly() function from purrr to handle errors: by passing run_sel_model through possibly(), it will now spit out NA in case there are any convergence issues or any errors when running the selection model.

Here is the completed fitting function, called fit_sel_model. First, we take a subset of the data based on the index argument. This generates a subset of the data based on the re-sampled clusters. We use the unnest() function to get the effect-size-level data for those re-sampled clusters. We then run the run_sel_model function, and we include the run_sel_model function inside the bigger function to ease parallel processing. The '...' here refers to the contents of the run_sel_model function from the previous slide.

With our example Lehmann dataset, we create a nested dataset with one row per study and fit the selection model using fit_sel_model. These are the point estimates from the three-parameter selection model. Now we can bootstrap using the boot() function, which takes in the nested dataset and the function to fit the selection model, along with the steps argument, the number of bootstrap replications, and any further options for parallel processing. Parallel processing is really useful here, because we get 2000 bootstrap replications in under one minute.

To get bootstrap confidence intervals, we can use the boot.ci() function, specify the type of confidence interval, and specify the index of the parameter that we want. Here an index of 1 is for the overall average effect size, and we get the confidence interval for the overall average effect. Based on the selection model, the overall average effect has gone down a bit, from 0.20 to 0.13, and the CI indicates that there's quite a bit of uncertainty around it. In the same way, we get the confidence intervals for the between-study heterogeneity and for the selection parameter. So this has been a very quick demonstration of cluster bootstrapping a selection model; the sketches below pull the pieces of the workflow together.
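(A hedged sketch of the two data-reshaping options just described, again using the assumed names `lehmann_dat` and a cluster identifier `studyid`.)

```r
library(dplyr)
library(tidyr)

# Option 1: collapse to one row per cluster with group_by()/summarize(),
# then merge back with the full data by study to recover the
# effect-size-level rows
study_dat <- lehmann_dat %>%
  group_by(studyid) %>%
  summarize(n_ES = n(), .groups = "drop")
merged_dat <- left_join(study_dat, lehmann_dat, by = "studyid")

# Option 2: nest the effect-size rows, giving one row per cluster with a
# list-column `data` holding each study's effect sizes
nested_dat <- lehmann_dat %>%
  nest_by(studyid) %>%
  ungroup()

# unnest() recovers the effect-size-level data
unnest(nested_dat, cols = data)
```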
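(Next, a sketch of the model-fitting functions. The structure follows the talk, but the details — the `steps` default, the use of `skiphes` to skip standard-error calculations, and the exact set of estimates returned — are my guesses rather than the presenters' code.)

```r
library(metafor)
library(purrr)
library(tidyr)

# Fit a summary meta-analysis, then a three-parameter (step-function)
# selection model, and return the estimates as a single vector.
# skiphes = TRUE skips the Hessian (standard error) calculations,
# which we don't need inside the bootstrap; this speeds things up.
run_sel_model <- function(dat, steps = 0.025) {
  RE_mod <- rma.uni(yi = yi, vi = vi, data = dat)
  sel_mod <- selmodel(RE_mod, type = "stepfun", steps = steps, skiphes = TRUE)
  c(
    beta  = as.numeric(sel_mod$beta),  # overall average effect size
    tau2  = sel_mod$tau2,              # between-study heterogeneity
    delta = sel_mod$delta[2]           # selection parameter
  )
}

# possibly() returns NAs instead of an error when the selection model
# fails to converge on a given bootstrap sample
run_sel_model_safe <- possibly(run_sel_model, otherwise = rep(NA_real_, 3))

# Bootstrap statistic: takes the one-row-per-cluster (nested) data plus a
# vector of row indexes that defines the bootstrap sample
fit_sel_model <- function(nested_dat, index, ...) {
  resampled_dat <- unnest(nested_dat[index, ], cols = data)  # back to ES level
  run_sel_model_safe(resampled_dat, ...)
}
```

As noted in the talk, the body of run_sel_model is actually written inside fit_sel_model, so that parallel workers don't need the helper function exported to them; the sketch above keeps the two separate for readability.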
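(Finally, a sketch of the bootstrap and confidence interval steps; the replication count matches the talk, but the parallel settings are illustrative.)

```r
library(boot)

# Cluster bootstrap: boot() resamples rows of the nested data, i.e.,
# whole studies. Named arguments like `steps` are passed through to
# fit_sel_model on every replication.
boot_res <- boot(
  data = nested_dat,
  statistic = fit_sel_model,
  R = 2000,                # number of bootstrap replications
  steps = 0.025,           # p-value threshold for the step function
  parallel = "multicore",  # use "snow" instead on Windows
  ncpus = 4
)

# Percentile confidence intervals, one parameter at a time
boot.ci(boot_res, type = "perc", index = 1)  # overall average effect size
boot.ci(boot_res, type = "perc", index = 2)  # between-study heterogeneity
boot.ci(boot_res, type = "perc", index = 3)  # selection parameter
```

One practical note: because the possibly() wrapper returns NAs for non-converged replications, those replications may need to be dropped before computing the intervals.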
This cluster bootstrapping technique is interesting because, in principle, we could apply the cluster bootstrap to other models or methods for investigating selective reporting. We are currently studying the performance of bootstrapping the three-parameter selection model, and initial results suggest that the confidence intervals, like the ones we showed you, have reasonable coverage. Future directions include exploring other resampling methods, such as the fractionally weighted bootstrap, and turning the workflow we presented here into a more user-friendly function. Thank you so much for your attention, and please feel free to reach out with any questions. Thank you.