Hi and welcome back to ESMARConf 2023. This is the fourth presentation session, on quantitative synthesis part one. As always, you can ask questions via Twitter by following @ESHackathon, or using the Slack channel if you've registered for the conference. Presenters will be answering those questions after the session as well. Up first is Antonina Dolgorukova, who's going to introduce a meta-analysis of preclinical studies with a complex data structure: a practical example of using a multilevel model to account for dependencies. Hello everyone, my name is Antonina and I'm happy to present our study as a practical example of the use of a multilevel model to account for dependencies in meta-analysis. You have probably heard about the problem of low reliability and reproducibility of preclinical findings. Much progress has been made in addressing translational failures, but reports of irreproducibility in preclinical research continue to be published. Meta-analytical research is therefore very important for improving the quality of preclinical studies, and it needs to be done correctly. Although there are many useful tools to help you at every stage of the process, there are challenges that not everyone knows how to overcome. One of these is a complex data structure that implies dependent effect sizes, which violates a main assumption of meta-analysis. Non-independent effect sizes can be broadly categorized as multivariate and nested. When the same subjects are used to calculate multiple effect sizes, the data structure is multivariate. This is the case, for example, when there are multiple treatment groups and a common control group, when multiple outcomes are measured in the same subjects, or when an outcome is measured at different times. If the data are nested, you will usually get several effect sizes within a cluster, such as a study, laboratory or geographical region. 
Both cases are quite common in preclinical studies, and in both cases the effect size estimates are correlated. If ignored, this can result in misleading statistical conclusions. Multilevel modelling and robust variance estimation are among the most reliable approaches for dealing with dependence. Importantly, they make no assumptions about the specific shape of the sampling distribution of the effect sizes, and you do not need to know the exact dependence structure of the effect size estimates, that is, their variance-covariance matrix. I will now present a practical example of the application of these methods in a preclinical meta-analysis. I'll report on our recent work, which included seven controlled studies with 21 experiments. We were interested in experiments that tested the effects of anti-migraine drugs on two outcomes, ongoing and evoked neuronal activity, in the animal migraine model that we use in our laboratory. Our main aim was to estimate the overall effect size to inform sample size calculations in future studies. To refine the model, we also examined how methodological features affected the results and identified potential sources of bias. The data extracted for the meta-analysis had a multilevel structure, as five studies reported up to four experiments, while several experiments were conducted with the same control group. To account for the likely correlation of effect sizes, we built a three-level model with robust variance estimation and went through the standard meta-analysis steps: examination of heterogeneity, subgroup and sensitivity analyses, estimation of potential publication bias, and risk of bias assessment. I will now explain in detail the methods used for all of these except the last, the risk of bias assessment, which was performed at the study level and is therefore similar to that in a univariate meta-analysis. We calculated mean differences as the effect size measure, and to fit the three-level model we used the rma.mv function from the metafor package. 
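As a rough sketch of the effect size measure just mentioned (this is the underlying formula, not the metafor code), the raw mean difference and its sampling variance for two independent groups can be computed like this; the group summary numbers are invented for illustration.

```python
def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Raw mean difference between two independent groups and its
    sampling variance (allowing unequal variances)."""
    md = m1 - m2
    v = sd1**2 / n1 + sd2**2 / n2
    return md, v

# hypothetical treatment vs. control summaries (mean, SD, n)
md, v = mean_difference(12.0, 4.0, 10, 9.0, 5.0, 10)
print(md, v)  # MD = 3.0, sampling variance = 4.1
```

The inverse of this sampling variance is what gives each experiment its weight in the meta-analytic model.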
The pre-calculated effect sizes are specified via the yi argument, the corresponding sampling variances via the V argument, and the desired random effects structure via the random argument. Here we have a multilevel model: one random effect corresponds to a study and one corresponds to an experiment within a study. The resulting rma object was then wrapped in the robust function, which constructs a cluster-robust estimate of the variance-covariance matrix of the model coefficients based on a sandwich-type estimator and then computes tests and confidence intervals for the model coefficients. By default this includes a small-sample adjustment, the CR1 estimator, implemented to improve the performance of the method when the number of clusters is small. Note that the model includes two variance components, for between- and within-study heterogeneity. This makes it possible to examine the distribution of the variance at the level of an experiment and of a study. Guidance on calculating heterogeneity for multilevel and multivariate models can be found on the very useful metafor package website, using the link below. In our study, the overall I-squared, which is the sum of the between-study and within-study heterogeneity, was 90%. About 59% of the total variance was attributed to between-study heterogeneity and 31% to within-study heterogeneity; the remaining 10% was sampling variance. The extent to which methodological features explained the observed heterogeneity was assessed using meta-regression. To do this, we simply refit the same model, but now using the mods argument to specify the values of each moderator. In the study protocol we specified seven factors, one of which, the gas used for ventilation, accounted for significant variability in the results. The high heterogeneity in the dataset may also be caused by extreme effect sizes, while the pooled effect estimate may be highly dependent on a single experiment. 
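The variance split just described can be sketched as follows. This mirrors the multilevel I-squared approach described on the metafor website; the "typical" sampling variance is simply taken as a given number here, and the three inputs are chosen to reproduce the 59/31/10 split from the talk.

```python
def multilevel_i2(sigma2_between, sigma2_within, typical_sampling_var):
    """Share of the total variance attributable to each level of a
    three-level model (between-study, within-study, sampling)."""
    total = sigma2_between + sigma2_within + typical_sampling_var
    return {
        "I2_between": sigma2_between / total,
        "I2_within": sigma2_within / total,
        "sampling": typical_sampling_var / total,
    }

# variance components chosen to mirror the talk's 59% / 31% / 10% split
shares = multilevel_i2(0.59, 0.31, 0.10)
print(shares)
```

The overall I-squared is then the sum of the between- and within-study shares.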
For models fitted with the rma family of functions, outliers can be detected using different types of residuals, and influential cases based on Cook's distances. For multilevel models, there is an option to specify a clustering variable. Here we assumed that standardized residuals greater than 1.96 in absolute value indicate experiments that do not fit the assumed model, that is, represent outliers, and that Cook's distances greater than 4 divided by the total number of data points, a common cutoff, indicate influential cases. Lastly, publication bias. To the best of our knowledge, of the available tests for publication bias only Egger's regression can readily be combined with multivariate modelling methods. We therefore assessed potential publication bias by estimating the asymmetry of funnel plots of the experiment-level data, using visual inspection and Egger's regression test. To do this, we again fitted a multilevel meta-regression model, with a measure of effect size precision (the standard error) as a predictor, and used robust variance estimation to handle dependence. These approaches showed a clear trend of increasing effect size with decreasing precision. Overall, we hope that this practical example of using a multilevel model has demonstrated the simplicity and feasibility of performing standard meta-analysis steps using the metafor package. The data and our R Markdown scripts are available on the OSF website. Please let me know if you'd like to have a look. Thanks for your attention. Thanks very much, Antonina. Up next we have Rebecca Harris with preeclampsia in pregnancy and offspring blood pressure, a multilevel multivariate meta-analysis of observational studies. Over to you, Rebecca. Hi, everyone. My name is Rebecca Harris and I'm a PhD candidate at the University of Wollongong in Australia. 
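The outlier and influence rules just described can be sketched generically. The residuals and Cook's distances below are made-up numbers standing in for the values a fitted model would return; only the cutoffs come from the talk.

```python
def flag_cases(std_residuals, cooks_distances):
    """Flag outliers (|standardized residual| > 1.96) and influential
    cases (Cook's distance > 4 / number of data points)."""
    n = len(std_residuals)
    cutoff = 4 / n
    outliers = [i for i, r in enumerate(std_residuals) if abs(r) > 1.96]
    influential = [i for i, d in enumerate(cooks_distances) if d > cutoff]
    return outliers, influential

# hypothetical values for 8 experiments
res = [0.3, -2.1, 0.5, 1.2, -0.4, 2.5, 0.0, -1.0]
cd = [0.1, 0.6, 0.2, 0.05, 0.4, 0.9, 0.1, 0.3]
out, infl = flag_cases(res, cd)
print(out, infl)  # [1, 5] [1, 5]
```

Flagged experiments would then be examined in sensitivity analyses rather than automatically removed.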
Today I'm going to be talking about a multilevel multivariate meta-analysis that we conducted with the metafor package in R to evaluate the impact of preeclampsia on offspring blood pressure. In studies pertaining to cardiovascular health, systolic and diastolic blood pressure are key outcomes of interest, and both are usually reported in primary studies such as randomized trials and cohort studies. These outcomes are characterized as dependent. What we mean when we talk about dependence is that the outcomes are correlated, in this case highly correlated, which means that knowing information about one outcome reveals some information about the other. In a naive pairwise meta-analysis, dependent outcomes cannot be pooled together, because doing so would violate the assumption that the effect sizes are independent and come from different samples; the model doesn't distinguish between the different data structures. The actual structure of the data is as shown here, where systolic and diastolic blood pressure are measured in the same participants, but in a naive pairwise meta-analysis the statistical model assumes that the effect sizes come from different, unrelated samples. This is problematic mainly due to the artificial increase in the sample size, which leads to confidence intervals that are much too narrow. An appropriate approach to combining multiple dependent outcomes in a meta-analysis is multilevel modelling, which accounts for the dependence between the outcomes by specifying in the statistical model how each effect size is nested in the included study. So, as shown in the illustration here, if a study reports both systolic and diastolic blood pressure, we can add another level of variance at level two. This is basically an extension of a two-level random effects model with an additional tau-squared variance component for the multiple outcomes. 
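To see why the extra variance component handles the dependence: two effect sizes from the same study share the study-level random effect but have independent outcome-level effects, which induces a correlation between their true effects. A small sketch of that induced correlation, with purely illustrative variance components (not values from the talk):

```python
def implied_correlation(tau2_study, tau2_outcome):
    """Correlation between the true effects of two outcomes from the
    same study in a three-level model: the shared study-level variance
    divided by the total true-effect variance."""
    return tau2_study / (tau2_study + tau2_outcome)

# assumed study-level and outcome-level variance components
print(implied_correlation(0.09, 0.03))  # ~0.75
```

So the model implicitly accounts for the correlation through the variance components, which is why the raw correlation between systolic and diastolic readings is not needed as an input.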
So now we have three levels of variance. The correlation coefficient between systolic and diastolic blood pressure is not required, which is advantageous because correlations are often not reported in the original studies. We used a case study of preeclampsia and offspring blood pressure to illustrate an example of systolic and diastolic blood pressure in a multilevel meta-analysis. A bit of background about preeclampsia: it's a health condition during pregnancy characterized by new-onset hypertension, that is, hypertension that occurs after 20 weeks of gestation, alongside maternal organ dysfunction or fetal growth restriction. Although the cause is not well understood, observational cohort studies have provided some evidence that preeclampsia is associated with an increase in blood pressure through childhood and adolescence. There have been some previous meta-analyses on this topic which used standard pairwise univariate meta-analysis, meaning that they dealt with the dependence between systolic and diastolic blood pressure by conducting separate analyses for each outcome. Another important component of the structure of the data in this particular case study is the longitudinal nature of the effect measures. Because participants in the studies were followed for a long period of time, there were often multiple measures of both systolic and diastolic blood pressure in each sample. Most of the previous meta-analyses selected one time point to extract data from, but there was one previous meta-analysis which included multiple follow-ups as if they were independent, thereby violating the independence assumption. Longitudinal data can be included in a multilevel meta-analysis by treating the different time points as different outcomes, also nested within each sample. 
So this brings us to our current project. Our aim was to conduct a systematic review and meta-analysis to compare the blood pressure of offspring born to preeclamptic and normotensive pregnancies. We registered our protocol in PROSPERO, and to search for articles we searched the PubMed, CINAHL and Embase databases from their inception to January 31st 2022. We also searched the citations of included cohort studies and previous reviews, and conducted forward citation searching using Google Scholar. For our selection criteria, participants could be any age, from infancy to older age, and they were included if they reported systolic or diastolic blood pressure on a continuous scale; this could be millimeters of mercury, but could also be percentile scores or standard deviation scores. We conducted title and abstract screening using Abstrackr, done independently in duplicate by two review authors, and to assess the within-study risk of bias we used the ROBINS-E tool for observational studies, again conducted in duplicate by two review authors. When looking at observational evidence, it's very important to take into account that there could be confounders which affect the relationship between the exposure and the outcome. This is the graph showing the confounders and mediators of interest; the main confounders of interest were maternal smoking and drinking during pregnancy, education or socioeconomic status, maternal parity, age, ethnicity, and pre-pregnancy BMI. We also made the important distinction between confounders and mediators: because mediators lie on the causal pathway and occur after the development of preeclampsia, it is not appropriate to adjust for them. We therefore only pooled results from studies which had adjusted for all relevant confounding factors and had not adjusted for any mediators. 
For our analysis we chose to pool Hedges' g standardized mean differences, to enable us to pool effect sizes expressed as percentile scores and standard deviation scores with those from other cohorts expressed in millimeters of mercury. Because effect sizes were adjusted for confounders using multiple regression models, we also included a correction factor when calculating these, to account for the fact that the variability around the mean difference decreases as the number of variables in the regression model increases. This is the code we used to conduct the multilevel meta-analysis, and on the right you can see the dataset and how it was set up in long format. We first created dummy variables for systolic and diastolic blood pressure to be used as moderators in the model, which enabled us to get a pooled effect for each outcome separately. Then, to run the meta-analysis, we used the rma.mv command available in the metafor package. We specified the effect size and variance as shown here, used the mods argument for the regression to obtain different effects for systolic and diastolic blood pressure, and, importantly, used the random argument to specify how each effect size was nested inside each cohort; we used the restricted maximum likelihood estimator for the level-two and level-three variability. These are the results from our study selection: there were 2,423 unique reports identified from database searching and 12 unique reports from citation searching, and after full-text screening there were 55 reports of 42 cohorts. However, as I mentioned before, because we only pooled data from cohorts which adjusted for confounders, there were only seven cohorts included in the data analysis. So here are the results from our multilevel meta-analysis, as shown in the forest plot. 
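The Hedges' g calculation mentioned above can be sketched as follows (without the extra covariate-adjustment correction factor, which depends on the regression model used in each study); the blood pressure summary numbers are invented.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample
    correction factor J = 1 - 3 / (4*df - 1)."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # correction shrinks d slightly
    return j * d

# hypothetical exposed vs. comparison blood pressure summaries
g = hedges_g(102.0, 10.0, 50, 100.0, 10.0, 50)
print(round(g, 3))  # 0.198
```

Because g is unitless, effect sizes reported as percentile or standard deviation scores can be pooled alongside those reported in millimeters of mercury.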
So here we have the pooled standardized mean difference for systolic blood pressure and the pooled standardized mean difference for diastolic blood pressure. There were only small increases in both systolic and diastolic blood pressure, and when we re-expressed the standardized mean differences in millimeters of mercury we had a 1.69 millimeters of mercury difference in systolic blood pressure and a very small 0.65 millimeters of mercury for diastolic blood pressure. Here are the results from our risk of bias assessment: overall there was a low risk of bias in most domains, but there were some concerns due to missing data, mostly due to the long follow-up of the cohorts, and this led to all studies being rated as 'some concerns' overall. The main interpretation of our findings is that offspring born to preeclamptic pregnancies do have higher systolic and diastolic blood pressure when compared to normotensive pregnancies, but this difference was much smaller than previously reported, due to the fact that we correctly specified how the dependent outcomes were nested in the cohorts. It's not clear whether these small differences in blood pressure are large enough to infer a clinically meaningful difference in adverse cardiovascular outcomes in later life, such as heart attack and stroke. Here are our references, and thank you so much for watching. Thanks very much, Rebecca. Up next, Wolfgang Viechtbauer will be talking about location-scale models for meta-analysis using the metafor package. Hi there, my name is Wolfgang Viechtbauer from Maastricht University and I will be talking to you today about location-scale models for meta-analysis and how you can fit these types of models with the metafor package. 
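The re-expression step described above just multiplies the standardized mean difference by a representative standard deviation on the original scale; the SMD and reference SD below are assumed values for illustration, not the ones from the review.

```python
def smd_to_raw(smd, sd_reference):
    """Convert a standardized mean difference back to the raw scale
    (here, mmHg) using a representative population SD."""
    return smd * sd_reference

# assumed SMD of 0.169 and reference SD of 10 mmHg
print(smd_to_raw(0.169, 10.0))  # ~1.69 mmHg
```

The choice of reference SD matters here, so it is usually taken from a large, representative cohort.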
So to get started, let's first talk about a standard meta-analysis and one of its main goals, namely to estimate the size of the average effect. For this we typically use a random effects model, which says that for each of our k studies we have an observed effect, and what we want to estimate here is mu, the average effect. There are two sources of variance. First of all, there's epsilon sub i, which represents sampling error: these variances, the v sub i's, are known, and they represent the variance of the estimates around their true effects, so an observed effect is not going to be equal to its true effect because of sampling variance. These are heteroscedastic by construction: bigger studies tend to have smaller sampling variances, and smaller studies bigger sampling variances. And then we have another variance component, tau squared, which denotes heterogeneity: it represents the variance in the underlying true effects, and it is assumed to be homoscedastic, that is, it's a constant, it doesn't vary. You can easily fit such a model with the metafor package, which of course you first have to install if you don't already have it. Let's look at an example, one of my favorite examples: the meta-analysis on the effectiveness of the BCG vaccine. For each of the studies included in this meta-analysis we have the number of TB-positive and TB-negative cases for the vaccinated and the non-vaccinated groups, and we will use this information to compute log-transformed risk ratios and the corresponding sampling variances. So let's first load the metafor package, and then we can take a look at the BCG dataset, which includes 13 trials. Here we have the number of TB-positive and TB-negative cases in the treated, or vaccinated, group and the number of positive and negative cases in the control group. There are a couple of other variables included in this dataset, one of which, the alloc variable, we will take 
a look at in a little bit. Now we can compute the log risk ratios with the escalc function. We have to specify the type of measure, where "RR" stands for log risk ratios, and then we give the function the information it needs to compute them. Once you run this, the dataset includes two additional variables: yi, the log risk ratios, where a negative value represents a lower risk of infection in the vaccinated group, and vi, the sampling variances, which as we can see are sometimes larger and sometimes smaller, depending on the size of the studies and the amount of information they provide. Finally, we can fit the random effects model with the rma function: we just give it the effect sizes and the corresponding variances, and here are the results. We get our estimate of tau squared, our estimate of mu, the average effect, the corresponding test of whether this is significantly different from zero, and a corresponding confidence interval. On this slide you can see a fairly typical forest plot showing the results of the individual studies and, at the bottom, the results from the random effects model, namely the estimated average effect and its confidence interval. Since this excludes zero, we know that this is a statistically significant effect, showing that the vaccine is indeed effective on average at reducing the risk of tuberculosis. But there is a fair amount of heterogeneity in these results. We can see this here up top: this is the estimated distribution of the true effects based on mu hat, the average effect, and tau squared, representing heterogeneity. So there might be circumstances where the vaccine is even more effective, or where it is possibly not effective at all. To explore such heterogeneity we can use meta-regression, where we use one or multiple moderators, or study characteristics, to examine whether they are related to the size of the effects. To give a simple example, say the studies fall into two groups, the 
randomized versus the non-randomized studies, and we can represent this as a dummy variable, coded zero for the first group, the non-randomized studies, and one for the second group, the randomized studies. The alloc variable in the dataset provides information about the method that was used for assigning participants to either the vaccine or the control condition. Although it makes a distinction between three different forms of allocation, we will simplify a bit and create this dummy variable by coding it as one if the method of allocation was random and zero otherwise. Now we can include this dummy variable in our meta-regression model, and we are especially interested in the results down here: the intercept represents the estimated average effect for studies not using random allocation, and the coefficient for random represents the difference in the effect for studies that do use random allocation compared to those that do not. We can also test whether that difference is statistically significant, which is not the case here. Let's ignore this for the moment, though, and look at this forest plot, which shows the studies ordered by whether they did or did not use random allocation. Up top here we now see two distributions: the estimated distribution of true effects for studies that use random allocation versus those that do not. What we see here is that these two distributions have different means, but they have the same tau squared; the amount of heterogeneity within the two subgroups is assumed to be the same. So again, tau squared is assumed to be homoscedastic. But this assumption may simply not be true, which brings us to the location-scale model. The first part of the model looks pretty much like a standard meta-regression model, but the difference is that tau squared now has a subscript i, so tau squared is allowed to differ across studies. In fact, what we now have is a model for how tau squared, or log-transformed tau 
squared, differs across studies, namely as a function of one or multiple study characteristics. So here we can now make a distinction between predictors for the size of the effect, the so-called location variables, and predictors for the amount of heterogeneity, the so-called scale variables, and they may or may not be the same. And of course, you can again have multiple such location and/or scale variables. I recently extended the metafor package to also fit these types of models. To illustrate this, let's extend the previous meta-regression model by including random also as a scale variable. So, like before, we get information about the estimated average effect for the non-randomized studies and how different the effect is for the randomized studies, but in addition we now get information about the amount of heterogeneity for the non-randomized studies and how different the amount of heterogeneity is for the randomized studies. As noted earlier, heterogeneity is expressed on the log scale in this model, so to get the estimated tau squared for the non-randomized studies we just take the intercept and exponentiate it, which gives around 0.2, and to get the estimated tau squared for the randomized studies we take the intercept plus the coefficient that represents the difference between the randomized and the non-randomized studies and exponentiate this, which gives about 0.4. So we estimate that the amount of heterogeneity is about twice as large for the randomized studies as for the non-randomized studies. You can also see this now in the forest plot, where the distribution of true effects for the randomized studies has a different mean compared to the non-randomized studies, but also a different amount of variance. So this model allows the amount of heterogeneity to depend on one or multiple predictors, or scale variables, and tau squared is now allowed to be heteroscedastic. Now, it turns out that the model above yields identical results to 
fitting separate random effects models within the two subgroups, which is what I'm doing down here: I fit a standard random effects model to the subset of studies where random is zero and where random is one, and then collect some of the information from these two models in a table. So we get the estimated mu for the two subgroups and the two tau squared values, and these are identical to the two tau squared values that we got from our location-scale model. But the location-scale model is much more flexible. For example, you can include none, one, or multiple location and scale variables in the same model, and these variables can also be different. The variables can be categorical, representing subgroups as we have seen before, or they can be quantitative. You can also now test whether the amount of heterogeneity is related to a scale variable or differs across subgroups, and for this we have several different types of tests: so-called Wald-type tests, likelihood ratio tests, and permutation tests. For example, in the model above, alpha one represented the difference in heterogeneity between the randomized and the non-randomized studies, so testing whether this is equal to zero is the same as testing whether tau squared for the non-randomized studies is equal to tau squared for the randomized studies. On this slide you can see how these different types of tests can be conducted. If you fit a location-scale model and look at the scale part of the output, here's our estimate of alpha one, and you immediately get the p-value from the Wald-type test of whether this is significantly different from zero. If you want to conduct the likelihood ratio test, you fit another model where you drop the scale variable, so here I just have an intercept, a reduced model; then I compare these two models and get the likelihood ratio test and the corresponding p-value. And finally, if you want to do a permutation test, that reshuffles the data repeatedly in a certain way, so 
there's a certain element of randomness, which I can make reproducible by setting the seed of the random number generator here, and then I get the p-value from the permutation test here in the output. Now, in all three cases, all these tests suggest that there is no significant difference, or rather insufficient evidence for a difference, in the amount of heterogeneity between the randomized and the non-randomized studies. As a more elaborate example, let's consider the data from the meta-analysis by Bangert-Drowns and colleagues, who examined the effectiveness of a particular intervention for improving educational achievement. In this meta-analysis the effect sizes are standardized mean differences, and the studies differed in their sizes, so we have small studies and quite large studies, and also in the subject matter that was examined: 28 studies looked at mathematics performance, nine studies at a science subject, and 11 studies at a social science subject. We will now use both of these variables as location and scale variables in our model. Once we fit this model, we get these results. Here the intercept corresponds to mathematics, for the average effect size and for the amount of heterogeneity; the other coefficients represent the differences for science and social science compared to mathematics, both with respect to the effect size and with respect to the amount of heterogeneity; and the coefficient for sample size represents the slope of the relationship, again in terms of effect size and in terms of the amount of heterogeneity. Here we do find some significant relationships, in particular this one and this one. The results therefore suggest that larger studies tended to yield smaller effects, and that studies that examined the effectiveness of the intervention in science subjects tended to yield more heterogeneous effects, but there was no evidence to suggest that the average effect itself differed for science subjects. So this example shows that there can be 
different types of relationships in terms of the location, or the size of the effects, and in terms of the amount of heterogeneity. These types of models therefore open up the possibility to examine entirely new research questions, for example comparing different types of interventions not only in terms of the average effect size that they yield but also in terms of the consistency of the effects. It should be pointed out, though, that due to the increased complexity of these models you tend to require a larger k, or number of studies, to obtain meaningful answers. And given that there are different types of tests for testing scale coefficients, Wald-type tests, likelihood ratio tests, and permutation tests, the question arises which type of test you should pay most attention to; this is something we are currently examining in a simulation study. Finally, I am looking into the possibility of fitting these types of models in the context of the rma.mv function from the metafor package, which is for multilevel and multivariate models, although that type of extension is not entirely trivial. This brings me to the end of this talk. On this slide you have the references, and if you have any questions, comments or suggestions, on this slide you can see how to get in touch with me. Thank you for your attention. Thanks so much, Wolfgang. Shinichi Nakagawa is next with meta-analyses with missing standard deviations using log response ratios. Hello, my name is Shinichi Nakagawa. Today I'd like to tell you about our new method for the log response ratio when some standard deviations are missing. 
First I'd like to acknowledge my lab members at UNSW Sydney, Australia, and my co-authors: Bex, Losia, Dan, Alistair, and Wolfgang, the cat; he sometimes takes human form, which looks like this. Okay, I think some of you may be familiar with the log response ratio, but everybody probably knows Cohen's d and Hedges' g; Hedges' g is just a small-sample-corrected version of Cohen's d. More generally they are known as the standardized mean difference, and for a reason I'm going to show, d equals one mean minus the other mean, divided by the pooled standard deviation. The log response ratio is the ratio of two means, log-transformed, and it turns out to be the most used standardized, or unitless, effect size in ecology and evolution, so it's an important effect size for evolutionary biologists like myself. Okay, so you know missing data are everywhere, and meta-analytic data are no exception; it generally looks like this. When we try to collect descriptive statistics from papers, we get the mean, standard deviation and sample size for the control group, and the same for the treatment, or experimental, group, and quite often the standard deviation is missing; occasionally the n is missing too, but that's very rare. When that happens, what we usually do is just delete those rows, or cases, because you can't calculate the log response ratio's sampling variance, and we will see the reason why a bit later on. When we delete these cases, it's called the complete cases approach, because we just use the cases where the SDs are not missing, but this can lead to bias in the overall mean estimate and other meta-analytic parameters. This paper points this out and says we should use multiple imputation, otherwise the meta-analytic results are often biased. Another interesting thing is that a survey included in this study showed that 75 percent of meta-analytic datasets in ecology and evolution have missing SDs, so it's a very common problem. Multiple imputation is certainly a solution, but the problem is that when we run a complex 
The problem is that for complex meta-analyses, multiple imputation is very difficult to implement: for example, when you have multiple effect sizes from the same studies, or, as is common in ecology and evolution, multiple species on top of multiple effect sizes per study, in which case we also have to account for phylogenetic relatedness. That makes for a very complex model, and multiple imputation techniques are very hard to implement there. So I was looking for an easier solution than multiple imputation, and I couldn't come up with one until I came across this paper by Doncaster and Spake, Rebecca Spake being the Bex you saw on my acknowledgement slide. They conducted a simulation study and proposed this formula with the CV, which I'd like to explain. This is the log response ratio, which you've seen before. The important thing is that for a meta-analysis you need the sampling variance, whose inverse becomes the weight for each case, and the sampling variance involves the SD, the mean and the sample size. You can see that SD divided by mean can be written as the CV, the coefficient of variation. They propose that, rather than using the CV specific to each study, we can calculate the overall CV, the average CV across all available studies, and use that, which leads to a less biased, more accurate overall meta-analytic mean. You may wonder why, and the reason, which they show in their simulations, is that when the n's are small, and lots of studies in ecology and evolution have small sample sizes, the SDs are estimated inaccurately; they are estimates, not the true SDs, and this results in bias or inaccuracy in the overall mean. That is quite cool, but the real find came toward the end of the paper, where they note that the average CV can be calculated even when there are missing data, because it can be computed while excluding the missing cases, and you can then use the average CV for those missing cases. I thought that solved almost everything, and that is why we conducted this new study.
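The identity Shinichi describes, rewriting the lnRR sampling variance in terms of coefficients of variation, is easy to verify numerically. A sketch in Python with toy numbers; calling `v_lnrr_cv` with an average CV is the Doncaster and Spake plug-in idea:

```python
import math

def v_lnrr(m1, sd1, n1, m2, sd2, n2):
    """First-order (delta-method) sampling variance of the log response ratio."""
    return sd1**2 / (n1 * m1**2) + sd2**2 / (n2 * m2**2)

def v_lnrr_cv(cv1, n1, cv2, n2):
    """The same quantity written with coefficients of variation: CV1^2/n1 + CV2^2/n2."""
    return cv1**2 / n1 + cv2**2 / n2

# The two forms are algebraically identical (toy numbers):
m1, sd1, n1 = 10.0, 2.0, 12
m2, sd2, n2 = 8.0, 2.5, 12
assert math.isclose(v_lnrr(m1, sd1, n1, m2, sd2, n2),
                    v_lnrr_cv(sd1 / m1, n1, sd2 / m2, n2))

# Doncaster and Spake's plug-in: average the CVs over the studies that do
# report SDs, and use that average for the studies that do not (toy CVs)
cvs = [0.20, 0.31, 0.25, 0.28]
cv_bar = sum(cvs) / len(cvs)
v_missing = v_lnrr_cv(cv_bar, n1, cv_bar, n2)
```

Because the average CV only needs the non-missing studies, a row with a missing SD can still receive a sampling variance and stay in the analysis.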
In the new study we proposed four new methods. As you have already seen, the Doncaster and Spake estimator uses the average CV for the sampling variance of the log response ratio. We made two improvements to this. First, we use what looks scary but is just a weighted average CV, giving more weight to studies with larger sample sizes. Second, the two latter terms are a small-sample-size correction, analogous to what Hedges' g is to Cohen's d. So what did the study do? We had four methods, but I want to share just two of them with you: the missing-cases method and the all-cases method. You've seen this layout before; the greyed-out rows are the missing data. Let's call our estimator of the sampling variance v tilde. In the missing-cases method, for the studies with missing standard deviations we replace the sampling variance with our v tilde; in the all-cases method we use v tilde for all the studies, regardless of whether the SD is missing or not. We then conducted a simulation in which we varied the degree of missingness from five percent up to 30 percent. The surprising thing is the bias, here the bias in the overall meta-analytic mean: the all-cases method performs really well, with bias close to zero, followed by the missing-cases method. What is really surprising is the comparison with full data, the simulated datasets that had no missing data at all; that is counterintuitive, because full data should perform with the least bias. But you may remember what the Doncaster and Spake paper showed: sometimes individual studies have too small a sample size, and that affects the accuracy of the meta-analytic mean estimate. That is what is happening here, so it turns out that using our new estimators for the sampling variance performs best. And, although we don't show it here, in estimating heterogeneity, coverage, everything, the all-cases and missing-cases methods do better than full data, which is surprising.
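The flavor of the two improvements can be sketched as follows (Python, toy numbers). Note that `weighted_avg_cv` and `v_tilde` here are illustrative stand-ins: the exact weights and correction terms of the published estimator may differ from this sketch.

```python
def weighted_avg_cv(cvs, ns):
    """Sample-size-weighted average CV across studies reporting SDs.
    (Illustrative weights; the published estimator's weights may differ.)"""
    return sum(cv * n for cv, n in zip(cvs, ns)) / sum(ns)

def v_tilde(cv_bar, n1, n2):
    """Plug-in lnRR sampling variance: first-order CV terms plus
    second-order small-sample terms (illustrative, not the paper's exact formula)."""
    first_order = cv_bar**2 / n1 + cv_bar**2 / n2
    small_sample = cv_bar**4 / (2 * n1**2) + cv_bar**4 / (2 * n2**2)
    return first_order + small_sample

# All-cases method: v_tilde is used for EVERY study, with or without a reported SD
cvs, ns = [0.20, 0.31, 0.25], [8, 15, 12]       # toy studies that report SDs
cv_bar = weighted_avg_cv(cvs, ns)
group_sizes = [(8, 8), (15, 14), (12, 12), (6, 6)]
variances = [v_tilde(cv_bar, n1, n2) for n1, n2 in group_sizes]
```

As expected, the smallest study (n = 6 per group) receives the largest plug-in sampling variance and hence the smallest weight in the pooled mean.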
So, in conclusion, the all-cases method is the best one, but we should probably also run the missing-cases method as a sort of sensitivity analysis, to check whether you get consistent results. The biggest conclusion we can draw is that all future meta-analyses can use our method, because no matter how complex your meta-analysis models are, you can use it easily: all you need is the weighted average CV, which you can then plug in. One thing I didn't tell you about is the assumptions. The log response ratio was first proposed by Hedges, the same Hedges as in Hedges' g, and these formulas assume normality, which can fail; there is a lot of count data, which is non-normal, in ecology and evolution, so we need to be careful in such cases. We propose a couple of different solutions for this, all on this webpage, which is connected to the paper; you can find it via the paper, or here. Thank you. Thanks very much, Shinichi. Up next, Sabina Patsul will be talking to us about using multiverse and specification curve analyses as an assessment of the generality of effects for MASEMs. Over to you, Sabina. Hi, my name is Sabina Patsul, and I'm a PhD student at the Technical University of Munich. Today I would like to talk about how we use multiverse and specification curve analysis as an assessment of the generality of effects for our meta-analytic structural equation model, in which we look at the relationship between creative potential and creative self-assessment measures. I would like to start with a quick theoretical background. It is assumed that everyone has a certain amount of creative potential within them. Two indicators of creative potential, for example, are divergent thinking and intelligence, both of which correlate with creative achievement. However, creative potential does not necessarily lead to creative
achievements, as the relationship might be partially influenced by factors relating to the creative self, such as creative self-beliefs. This is why we chose to look at the relationship between creative self-assessment measures and two indicators of creative potential, namely intelligence and divergent thinking. We use structural equation modeling, specifically TSSEM with an extension for random-effects models, since multiple studies reported more than one effect size and we therefore assume dependencies among the effect sizes. To test the robustness of our structural equation model, we use subgroup analysis to test whether the parameter estimates are actually equal across CSA type, that is creative self-assessment type, and across age groups. Just a quick heads-up: these are all preliminary data, so please don't interpret them; we just want to give you an idea of what our models look like. For example, we look at the relationship between divergent thinking and creative self-assessments and test whether it is mediated, for example, by intelligence. As I already mentioned, we also do subgroup analyses, so we look at different models depending on whether, for example, children or adults were assessed. One downside of meta-analytic structural equation modeling is that it is rather limited in how many different moderators you can include at the same time, which is why we chose to also apply another approach to test the robustness of our model: multiverse and specification curve analysis. As I already said, we look at different age groups and different creative self-assessment types, but there are various other moderator variables as well, for example which divergent thinking outcome was used, the test method and modality of divergent thinking, whether there was some kind of time condition in the tests, and so on.
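In practice Sabina's subgroup comparisons run through TSSEM in R; as a much simpler stand-in for the idea, here is a Python sketch (made-up correlations) of pooling correlations per subgroup via Fisher's z and comparing the subgroups with a Wald-type statistic:

```python
import math

def pool_fisher_z(rs, ns):
    """Inverse-variance pooled correlation via Fisher's z transform."""
    zs = [math.atanh(r) for r in rs]          # Fisher's z = atanh(r)
    ws = [n - 3 for n in ns]                  # weight = 1 / var(z), with var(z) = 1/(n-3)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    se = math.sqrt(1 / sum(ws))               # standard error of the pooled z
    return math.tanh(z_bar), z_bar, se        # back-transformed r, pooled z, SE

# Hypothetical subgroup data: DT-CSA correlations for children vs adults
r_kids, z_kids, se_kids = pool_fisher_z([0.12, 0.20, 0.15], [40, 60, 55])
r_adults, z_adults, se_adults = pool_fisher_z([0.18, 0.22], [80, 120])

# Wald-type test of subgroup equality on the z scale (compare to N(0, 1))
wald = (z_kids - z_adults) / math.sqrt(se_kids**2 + se_adults**2)
```

With these toy numbers the statistic stays well inside plus or minus 1.96, i.e. no evidence the subgroups differ; the real analysis instead constrains TSSEM parameters across groups.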
All of these different moderators, you could say, could potentially influence our primary studies, and in that way they could also influence our meta-analytic structural equation model. When doing a multiverse analysis, the factors I called moderators before are called "which" factors, as you think about which data could possibly have been analyzed in the primary studies; you also look at "how" factors, which tell you how the data are meta-analyzed. What you do is combine all the different which factors and their subgroups, for example whether a female sample was used, whether children were investigated, whether fluency was reported as an outcome, and so on, with the different how factors. This leads to a large number of specifications, and for each of them we compute a meta-analytic summary effect. At the end we compare all these possible summary effects and take their mean, and this tells us how robust a correlation is depending on which data were analyzed and how they were analyzed. We run this multiverse and specification curve analysis for all our bivariate relationships, so CSA and divergent thinking, divergent thinking and intelligence, and intelligence and CSA, as a way of getting an idea of how robust our meta-analytic structural equation model actually is. I would like to show you our specification curve for the relationship between divergent thinking and creative self-assessment measures, to give you an idea of what it looks like at the moment. At the top of the plot you can see a dark line, which shows the summary effects from all our different specifications, and the colored areas are the respective confidence intervals. So, for example, this summary effect over here, with its quite large confidence interval,
as we see going down to the bottom part of the plot, was computed using the total sample with respect to sex, adults, and the verbal modality. Down here you have the how factors, so this gives us the information on which which factors and how factors were used, how they were combined, and what kind of summary effect they yielded. We also get information from the colors: warmer, hotter colors represent combinations of which and how factors that included only a few effect sizes, while cooler colors such as blue or green represent combinations with more effect sizes. Looking at our graph, you can see that there are a lot of effect sizes in the middle of the plot and that the magnitude does not increase much: most of our summary effects actually lie in the range of .10 to .20, which speaks for the robustness of our effect, especially considering all the different possible specifications. Next, we look at the total summary effect across all these different specifications, which is a correlation of .17, and as you can see, 50 percent of all the different combinations of which and how factors yielded a summary effect between .13 and .21, so our correlation seems to be quite robust. We can also use a parametric bootstrapping approach to do inferential testing: the red curve represents our specification curve, and the gray area is the curve under the null scenario of a possible zero effect, which clearly deviates from our specification curve. So, at least for the relationship between divergent thinking and creative self-assessment measures, the included correlations seem to be quite robust; you could say it does not matter that we cannot include so many different moderator variables, as the correlations seem to be quite robust.
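The which/how bookkeeping Sabina describes can be sketched as follows (Python, toy data; `summary_effect` is a simplistic stand-in for the actual meta-analytic "how" step, and the rows and codings are invented):

```python
import itertools
import math
import statistics

# Toy dataset: one row per effect size, with made-up "which" codings
rows = [
    {"r": 0.10, "n": 50, "age": "children", "modality": "verbal"},
    {"r": 0.22, "n": 80, "age": "adults",   "modality": "verbal"},
    {"r": 0.15, "n": 65, "age": "adults",   "modality": "figural"},
    {"r": 0.18, "n": 90, "age": "children", "modality": "figural"},
]

# "Which" factors: which data could have been analyzed (None = no restriction)
which = {
    "age": [None, "children", "adults"],
    "modality": [None, "verbal", "figural"],
}

def summary_effect(subset):
    """Stand-in 'how' step: inverse-variance pooling on Fisher's z."""
    zs = [math.atanh(d["r"]) for d in subset]
    ws = [d["n"] - 3 for d in subset]
    return math.tanh(sum(w * z for w, z in zip(ws, zs)) / sum(ws))

effects = []
for age, modality in itertools.product(which["age"], which["modality"]):
    subset = [d for d in rows
              if (age is None or d["age"] == age)
              and (modality is None or d["modality"] == modality)]
    if subset:                      # skip empty specifications
        effects.append(summary_effect(subset))

# The specification "curve": sort the summary effects and inspect their spread
effects.sort()
median_effect = statistics.median(effects)
```

With two which factors of three levels each, this yields up to nine specifications; the spread of the sorted summary effects is exactly what the specification curve visualizes.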
That's it for my talk; I hope it gave you an idea of how you can use multiverse and specification curve analysis to test the robustness of your meta-analytic structural equation model. As I already mentioned, we would like to run the same analysis for the other two bivariate relationships as well, to test how robust our model is. If you are interested in our project, feel free to write me an email; we also preregistered this study. That's it for today, thank you. Thanks so much, Sabina. That's it for this session; I hope you've enjoyed it. It was a really fascinating set of talks, with lots to get your teeth into. We'll be back again shortly, but for now, that's it. Please keep your questions coming in via Twitter; the presenters will be around for the next few hours or days, so if you tag them in a question they'll get back to you as soon as they can. Thanks very much.