All right, thanks everybody for joining us today for our presentation of the MetaPipeX framework for the analysis and documentation of multi-lab replications of experimental designs. Some of you might have already heard a presentation on this, but this tutorial is a bit more in-depth and much more hands-on. So let me start off by introducing our team. I'll be joined by my colleagues Lukas Beinhauer and Maximilian Frank today, and they are going to present the practical examples from this tutorial. My name is Jens Fünderich, and I'll be guiding you through the other parts of this tutorial. I want to start by introducing the data format that we're interested in, which is multi-lab data. That means projects like the Many Labs studies or multi-lab registered replication reports: direct replication efforts across multiple replication sites. The most basic form of interpretable data is item data, where each row is a participant and you have one column per item. If you aggregate these, the information of the dependent variable is reduced to a single numeric value per participant, giving you IPD, individual participant data. By aggregating results across participants from a single replication site, you arrive at replication-level data with, for example, a between-group effect size. These replications are then gathered meta-analytically to obtain estimates for a replication project. A replication project is basically just a single primary effect replicated across multiple replication sites, and a multi-lab may span several of these replication projects, as the Many Labs projects do. But despite these similarities across multi-labs, we found very unique solutions by the different multi-labs for setting up their code and repository structure. Many Labs 2, for example, provides an R package that contains their data transformations and analyses.
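To make these levels of aggregation concrete, here is a minimal base-R sketch with simulated data. All names and numbers here are made up for illustration; this is not code from the MetaPipeX package, just the item-to-IPD-to-replication-level idea:

```r
# Simulated item data: each row is a participant, one column per item.
set.seed(1)
item_data <- data.frame(
  lab    = rep(c("lab_A", "lab_B"), each = 20),
  group  = rep(c("control", "treatment"), times = 20),
  item_1 = rnorm(40), item_2 = rnorm(40), item_3 = rnorm(40)
)

# Item level -> IPD: collapse the items to one DV value per participant.
item_data$DV <- rowMeans(item_data[, c("item_1", "item_2", "item_3")])
ipd <- item_data[, c("lab", "group", "DV")]

# IPD -> replication level: aggregate across participants within each lab
# and group (sample size, mean, SD per cell).
replication_level <- aggregate(
  DV ~ lab + group, data = ipd,
  FUN = function(x) c(n = length(x), mean = mean(x), sd = sd(x))
)
```

These replication-level cells are what then feeds into the meta-analytic step described below.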
In some projects, data was processed in a centralized way; in others, some of the transformations were performed by the individual replication sites, and so on. Data availability is also somewhat of an issue, because finding clean datasets at different steps of aggregation is pretty complex in some cases. Information is stored in a multitude of data files, ranging from SPSS files to CSV, Excel and Word documents, online Google documents in different formats, R data and many more. There are multi-labs that seem to be very aware of the implications of this complexity and what it means for their code, and they provide helpful documents like bug trackers so that the community can actually interact with and reanalyze the data, making the process less ambiguous because you can actually see who dug up an issue in the code. But faced with all these varying solutions, and because we need a standardized process for our own work, we developed the MetaPipeX framework, which I'm presenting to you today. These are its three components: a pipeline that represents the data transformation and analysis process; the R package, which takes this process and translates it into R functions; and the Shiny app, which works with the standardized datasets that MetaPipeX produces and gives you the opportunity to combine data, explore the data in different visualizations, and use subsets of the data. Now, to be a bit more specific about the type of data MetaPipeX may be used with: at the replication level, we're looking at two-group designs where we compare a metric dependent variable between two groups. We started looking at it from an experimental perspective, but you could basically compare any groups other than treatment and control. And at the multi-lab level, we want the data to come from replications using the same operationalization of the dependent variable.
So we were mostly interested in direct replications, but you can use it with conceptual ones; just make sure to check the implications for the different comparisons you might want to make. The multi-level structure, of course, should be nested, so that participants are aggregated in replications and these are then meta-analyzed. We'll kick things off with the pipeline, which you see here on the right. The pipeline is basically a version of the chart from the vocabulary slide at the beginning, but here we applied it more closely to the actual data and code structure that could represent this process well. So we basically took it, flipped it by 90 degrees, and fleshed it out a bit more. If the graphical representation seems a bit much at first sight, don't worry, we're going to break it down now. The green rectangles are data files, ideally CSV files, with a clean version available at each level of aggregation. The pipeline goes from raw data to MetaPipeX data, a combination of replication- and meta-level data. So we can focus on specifics of a single replication project, as in the forest plot, but also compare meta-analytical results from a multi-lab, for example. The second type of information in the pipeline are the rounded boxes, which represent non-proprietary code files. The white ones are those where we felt it wouldn't make sense to standardize, for example because replication projects aggregate their dependent variables differently, and the yellow ones are the standardized steps, which are the same across replication projects. We translated the standardized steps into R functions in order to apply this computational structure, and so created four different functions. Now, in the square on the left side, we isolated the use of a single function, the meta-analysis function, which utilizes the rma() function from the metafor package.
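As a self-contained sketch of what such a random-effects fit estimates, here is the DerSimonian-Laird estimator in base R. The effect sizes below are made-up numbers for illustration; metafor's rma(yi, vi, method = "DL") computes the same quantities (plus much more), so treat this as a view of the underlying arithmetic, not a replacement for the package:

```r
# Replication-level effect sizes (yi) and their sampling variances (vi).
yi <- c(0.2, 0.4, 0.6)   # e.g. standardized mean differences per lab
vi <- c(0.04, 0.04, 0.04)

# Fixed-effect weights, pooled fixed-effect estimate, and Cochran's Q.
wi    <- 1 / vi
mu_fe <- sum(wi * yi) / sum(wi)
Q     <- sum(wi * (yi - mu_fe)^2)

# DerSimonian-Laird estimate of the between-study variance tau^2.
df   <- length(yi) - 1
C    <- sum(wi) - sum(wi^2) / sum(wi)
tau2 <- max(0, (Q - df) / C)

# Random-effects weights, pooled estimate, its SE, and I^2 heterogeneity.
wi_re <- 1 / (vi + tau2)
mu_re <- sum(wi_re * yi) / sum(wi_re)
se_re <- sqrt(1 / sum(wi_re))
I2    <- max(0, (Q - df) / Q) * 100
```

With these toy numbers the heterogeneity happens to be zero (Q equals its degrees of freedom), so the random-effects and fixed-effect estimates coincide at 0.4.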
The idea is to take a data format from the pipeline, in this case the merged replication data, pass it to the function, and you receive a data frame with the results. This logic applies to these three functions here: they all give you the analyzed dataset as output and provide you with a codebook to make sense of that output. The full pipeline function is then built on the three functions above it. Using the full pipeline function, we just need to provide individual participant data and specify the column names, and it returns a folder structure with the data and the codebooks at the different levels of aggregation. I also want to add a few points that we tried to keep in mind during the development of these tools, especially in the development of the analysis functions, because of their relevance when actually applied: using non-proprietary software, maximizing computational reproducibility, and making sure the functions scale okay. Sometimes the latter two points pull in different directions, but I hope we found solutions that people can actually work with. Here we present a first impression of the Shiny app, the rest of which you'll see in a couple of minutes. Just to give you a general overview: you can upload data in different formats, combine datasets, exclude datasets, create visualizations, and explore the data in the Shiny app. So without further ado, I would like to introduce you to the MetaPipeX use cases. We'll be guided through the first part of the example from the perspective of a multi-lab researcher who's trying to document a replication project, and then from that of a third-party researcher whose goal is to combine different datasets and look at them together. Before handing things over to Lukas, I want to introduce the effect that serves as our example.
The ego depletion effect basically refers to the observation that an initial exertion of self-control leads to an impairment in controlling one's own behavior subsequently. So in an experimental setting, you would have an ego-depleting task for the experimental group and some neutral task for the control group, and then a second self-control task for both groups. The idea is that the control group will show better self-control in that second task. Hi, this voice belongs to Lukas Beinhauer. Over the following minutes, I will present the use case of a multi-lab researcher who has collected their own replication data and aims to deal with it in a standardized way. In doing so, I will make use of a total of four functions found in the MetaPipeX package: create replication summaries, merge replication summaries, meta-analyses, and full pipeline. What you see here is a Quarto markdown document, which you can follow along with now or later. For the sake of the presentation, I want you to imagine that I am the one who collected the data actually collected by Dang et al. in 2021. This is data on the ego depletion effect, which they tried to replicate across 12 different labs. Obviously, in order to use MetaPipeX, we first need to load some relevant libraries. In this case, there's haven, in order to load the SPSS files, and devtools, in order to install the MetaPipeX package. In this code chunk right here, we go ahead and install the MetaPipeX package from the GitHub repository. To go through the different steps for our dataset, we obviously need to load the ego depletion data first; this short code chunk does just that. The first step in the MetaPipeX package is to create standardized summaries of the lab data. For each condition separately, we compute estimates such as group means, standard deviations, or standardized mean differences. All of these come with a standard error.
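As a rough base-R sketch of what such a standardized summary contains, here is the computation for one lab's two conditions. The data are simulated, the variable names (condition, error_rate) mirror the ones used in this tutorial, and the SE formula for the standardized mean difference is the common large-sample approximation for Cohen's d, so treat this as an illustration rather than the package's exact implementation:

```r
set.seed(42)
# Simulated IPD for one replication: two conditions, metric DV (error rate).
dat <- data.frame(
  condition  = rep(c("control", "treatment"), each = 30),
  error_rate = c(rnorm(30, mean = 0.30, sd = 0.10),
                 rnorm(30, mean = 0.35, sd = 0.10))
)

# Per-group aggregates: sample size, mean, SD, and SE of the mean.
by_group <- split(dat$error_rate, dat$condition)
n    <- sapply(by_group, length)
m    <- sapply(by_group, mean)
s    <- sapply(by_group, sd)
se_m <- s / sqrt(n)

# Standardized mean difference (Cohen's d with pooled SD) and an
# approximate standard error for it.
sd_pooled <- sqrt(((n[1] - 1) * s[1]^2 + (n[2] - 1) * s[2]^2) / (sum(n) - 2))
d    <- (m["treatment"] - m["control"]) / sd_pooled
se_d <- sqrt(sum(n) / prod(n) + d^2 / (2 * sum(n)))
```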
Within MetaPipeX, the function create replication summaries creates these sample aggregates. All we have to do is enter the data and specify which variable defines the treatment groups; in this case, the variable is called condition. And we need to specify the dependent variable, which in this case is the error rate. After doing so, since we work with a list, the function will return a list again. Each list element contains a data frame, which gives us the relevant aggregates: sample sizes, mean values, standard errors, and so on. I want to highlight that sometimes, such as in group C here, which refers to the control group, there is an NA. That is because we didn't properly clean the data beforehand; there are some missing values in the control group, and the function therefore returns an NA. Cleaning your data always needs to occur beforehand. However, this is not the case for all of the groups, and we will see later on that for the other groups we get all the values and can calculate all the other mean aggregates. The second step consists of combining these aggregates across labs. In MetaPipeX, there's the merge replication summaries function, which simply takes as input the output from the previous function. We have the data replication summaries object, which we feed into the function. If we simply run this code chunk, we get a full data frame containing all of the previously generated aggregates, which you see here in a table. The last step is to perform a random-effects meta-analysis on those previously generated aggregates. The function called meta-analyses within MetaPipeX returns different parameters of heterogeneity, such as tau, I², or accompanying p-values. Typically, in a psychology study, we would look at mean differences, be they standardized or unstandardized.
If we just use the meta-analyses function, which again simply takes the output from the previous function as input, we again get a neat data frame giving us a number of relevant parameters, such as the meta-analytic mean of the standardized mean differences, or heterogeneity. Additionally, the output of all functions comes with a codebook, the specific commands for which you can find on GitHub. The codebook gives you an idea of what the different variables actually mean. So now we have gone through the three most important steps, which we can automate with the MetaPipeX package. We might also choose to do it all at once. In that case, we just use the full pipeline function, which simply takes the same input as the first function did. We specify the group variable and the dependent variable, and it returns a list where we can find all the results that we generated using the individual functions as well. The second object of the list is the replication summaries, the third object is the merged replication summaries, and the fourth object consists of the meta-analyses, generating exactly the same results as we had before. Hello everyone, my name is Maximilian Frank. In the second part of this tutorial, we take on the role of the third-party researcher. So we don't collect our own data; instead, we want to work with freely available replication data on the ego depletion effect. And as a very busy third-party researcher, we don't have the time, or want to avoid the effort, to install the MetaPipeX package in our own RStudio installation. Instead, we want to use a Shiny app, which is comfortably available in our browser. The Shiny app is hosted by LMU Munich within the META-REP research project, and you can type in the following URL to get to it. As this is still a test version right now, the URL might change in the future.
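Putting the steps together, the calls look roughly like this. The four function names come from the presentation (spelled here with underscores as a guess at the R identifiers), and the argument names (data, group, DV) and the input object ipd_list are assumptions for illustration only; check the package documentation and codebooks on GitHub for the exact signatures:

```r
# Not run: requires the MetaPipeX package and the ego depletion IPD.
library(MetaPipeX)

# Step 1: per-lab, per-condition aggregates (one list element per lab).
replication_summaries <- create_replication_summaries(
  data  = ipd_list,        # list of IPD data frames, one per lab (assumed)
  group = "condition",     # variable defining the two treatment groups
  DV    = "error_rate"     # metric dependent variable
)

# Step 2: combine the aggregates across labs into one data frame.
merged <- merge_replication_summaries(data = replication_summaries)

# Step 3: random-effects meta-analyses, including heterogeneity estimates.
meta_results <- meta_analyses(data = merged)

# Or all at once: full_pipeline takes the same input as step 1 and returns
# a list containing the outputs of all of the steps above.
all_results <- full_pipeline(data = ipd_list,
                             group = "condition", DV = "error_rate")
```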
Here you see the user interface of the Shiny app with the different tabs at the top, basically going from left to right and representing the different functions. At first, we need to start with the upload of our data. For that, we select individual participant data in the drop-down menu on the left, so basically the raw data which was gathered during the replication project. As a next step, we click Browse to access our hard drive. Then we need to tell MetaPipeX how the dataset is structured. To do so, we select the relevant columns: the first one identifies the multi-lab project, then for the replication project it's replication. The individual lab replications themselves are in the variable source, the dependent variable is in DV, and the grouping is done by treatment. Once all of this is done, we click to provide the MetaPipeX data format to the app, which basically uploads the dataset to the server. Then we can go to the next tab, which is data selection. There we see on the right the data table with its different columns, and on the left we can select which statistics we want to calculate and have shown in the table. So basically we have the standard replication statistics, the control mean and the treatment mean of the groups. We can also display effect sizes, like the mean difference or the standardized mean difference. And then we also have meta-analytic results, like different estimates of heterogeneity, such as I² or tau. If we then scroll up, we see that the selected measurements are added to the data table. We can also get information about the sample size and the number of different labs, K, in which the replication was conducted. And it's also possible to exclude null effects from these results. All the results shown here can also be downloaded as a comma-separated values (CSV) file, and we will see that the graphics can be downloaded as well.
So there's basically no need to take screenshots of the results you see here in the app, because you can extract them and save them to your hard drive. Let's go to the next step. You can also exclude individual datasets in the Shiny app. For example, if you know that in one lab the manipulation check failed, say the lab "Brand", then you can select it here and click on exclude. If we go back to the data selection tab, we will see that this lab is now gone from the analysis and is no longer part of the selection. Last but not least, I want to show you some of the built-in visualization features. We go to the tab Histogram and click on upload data. Then we can again choose, on the left, the statistics we want to display. In this case I go for the mean of the control group, and I can also include a second variable if I click the checkbox. So, for example, I also click the treatment, the mean of the treatment group, and then we see both visualized in the histogram. Another important feature in the context of meta-analysis is, of course, forest plots. Same procedure as with the histograms: we click once on upload data, and then I can select the effect size I want to visualize. In this case we go for the standardized mean difference, and then I also need to select the standard error of this estimate, so the SE of the standardized mean difference. Then we select the level of aggregation, which in this case is the different labs where the replication was conducted. Now I can also show you the download function. You basically click on download forest plot at the bottom, and then it opens as a PDF file in a new tab, and you can use it for your manuscripts or your documentation. I think it's a really easy and straightforward way to reuse the graphics which were generated in the Shiny app. One other nifty function of the app is the codebook, which you can access via the tab at the far right.
It shows basically all the variables represented in the app so far, each with an explanation, to make it a bit more explicit what the different abbreviations stand for. And that basically concludes the short tutorial on the Shiny version of MetaPipeX; I hope you liked it. Perfect, to wrap things up: of course you can also use the app with simulated data, for example, and a version of that is available on GitHub for effects of p-hacking on heterogeneity. For more details on MetaPipeX, please check out our preprint on PsyArXiv. And that's it from us. Thank you so much for your attention.