Hi there, my name is Wolfgang Viechtbauer from Maastricht University in the Netherlands, and I will be presenting on the metadat package, a collection of meta-analysis datasets for R.

One of the nice aspects of meta-analyses is that they often, not always but often, include the full dataset that was used for the analysis. For example, down here you see Table 1 from the meta-analysis by Colditz and colleagues on the effectiveness of the BCG vaccine against tuberculosis. For each of the studies included in this meta-analysis, we have the number of participants in the vaccinated and in the control group and the number of tuberculosis cases in each of these two groups, based on which we can compute an effect size measure like a risk ratio, which we can then meta-analyze.

By extracting such data, one can build up an entire collection of meta-analysis datasets. We don't have to go back to the primary studies to find the data; we just go to the meta-analysis, look for a table or an appendix where the data are reported, and include that in our database. This is useful for various purposes: for teaching, for illustrating or testing methods, for validating the analyses conducted, and for conducting sensitivity checks.

What you see on this slide is the BCG dataset that is included in the metafor package. Here we have essentially the same information as reported in Table 1, except that we have the number of cases and non-cases in the vaccinated group and the number of cases and non-cases in the control group. We can then use the escalc() function from the metafor package to compute the effect sizes, in this case risk ratios; more specifically, we compute log risk ratios, because this is what we need for the analysis. So we now have this variable with the log risk ratios and the corresponding sampling variances.
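The escalc() step just described can be sketched as follows. This is a minimal example assuming the metafor package is installed; dat.bcg ships with it, with columns tpos/tneg and cpos/cneg for the cases and non-cases in the vaccinated and control groups:

```r
library(metafor)

# compute log risk ratios (measure = "RR") and their sampling
# variances from the 2x2 counts in the BCG dataset
dat <- escalc(measure = "RR", ai = tpos, bi = tneg,
              ci = cpos, di = cneg, data = dat.bcg)

# yi = log risk ratio, vi = sampling variance for each study
head(dat[, c("trial", "author", "year", "yi", "vi")])
```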
These two variables we then pass on to the rma() function to conduct our meta-analysis using a random-effects model. Here are the results. Most importantly, at the end, we back-transform the results to an estimated average risk ratio with a corresponding confidence interval. Based on this meta-analysis, we estimate that, on average, vaccinated individuals have a 50% lower infection risk than non-vaccinated individuals. We can compare these results with what is reported in the meta-analysis and see that this matches up exactly, so we are able to reproduce the results of the meta-analysis exactly.

In addition, we could conduct a bunch of sensitivity analyses to see how robust the results and conclusions are when we switch to a different method or model for meta-analyzing these data. Down here, we have the results from the standard random-effects model, but we could switch to a fixed-effects model, or use a so-called binomial-normal model (of which there are different types), or one of several different Bayesian models to analyze these data. Here we find that the results are relatively robust: regardless of the method or model used, we find on average roughly a 50% lower infection risk in the vaccinated group. The only exception is the fixed-effects model, where the results are slightly different.

Over the years, I have included more and more datasets like this in the metafor package. Each of these dots is a version of the metafor package that was released, and over time you can see the number of included datasets growing, until eventually I got to about 60 datasets in the package. At a certain point, though, I had the idea of moving these datasets into a separate data package. This would make it easier to add datasets without updating the code of the metafor package, and it would make it easier for others to contribute datasets.
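The model fitting and one of the sensitivity checks mentioned above can be sketched like this, again assuming the metafor package is installed and using its dat.bcg dataset:

```r
library(metafor)

# log risk ratios and sampling variances, as before
dat <- escalc(measure = "RR", ai = tpos, bi = tneg,
              ci = cpos, di = cneg, data = dat.bcg)

# random-effects model
res.re <- rma(yi, vi, data = dat)

# back-transform the pooled log risk ratio to a risk ratio with CI
predict(res.re, transf = exp, digits = 2)

# sensitivity check: an equal-effects (fixed-effects) model
res.ee <- rma(yi, vi, data = dat, method = "EE")
predict(res.ee, transf = exp, digits = 2)
```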
I started working on this at the 2019 Evidence Synthesis Hackathon together with Thomas, Emily, Daniel, Alistair, and Kyle. We eventually released the first version of the package two years later, in 2021, so there was a bit of a delay. You find here the links to the CRAN page for the package and to the GitHub repo; the documentation can also be found online, nicely created using the pkgdown package. Currently, the package includes 79 datasets.

The datasets have a consistent naming scheme: they are all called "dat." followed by the first author of the meta-analysis from which the data were extracted and the publication year. The datasets are also documented in a consistent manner. We have a general description of the dataset; a description of each variable included; further details about the dataset or the meta-analysis; the source of the data, which is typically the publication from which the data were extracted; potentially other relevant references, for example if the same data were used in other publications; the person who extracted the data, so in case you find a mistake in the data extraction, you know whom to contact; examples illustrating the use of the data, which could be just some simple analyses or a full replication of all the analyses conducted in the meta-analysis; and then the concept terms.

Each dataset is tagged with one or multiple of these concept terms. They may describe the field or topic of the meta-analysis, and you see some examples here. They may also describe the type of outcome measure that was used: we have meta-analyses conducted with correlation coefficients, or, as we saw earlier, with risk ratios, or with standardized mean differences. The concept terms can also describe the types of methods that were used in the analysis.
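As a small illustration of the naming and documentation scheme (assuming the metadat package is installed; dat.konstantopoulos2011 is one of the included datasets):

```r
library(metadat)

# datasets follow the scheme dat.<firstauthor><year>
head(dat.konstantopoulos2011)

# the help file documents the variables, source, references,
# the person who extracted the data, examples, and concept terms
help(dat.konstantopoulos2011)
```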
So we may have meta-analyses using cluster-robust inference, meta-regression types of analyses, multivariate models, or network meta-analyses, or datasets that include some outliers or that are illustrative of publication bias. These concept terms are really useful for finding particular datasets that may be of interest to you, but of course they need to be used across the different datasets in a consistent manner, and this is a bit tricky: if you add or change a concept term retrospectively, you have to go back through all the existing datasets and make sure that they are tagged accordingly. You can find a full list of the concept terms used so far under this link.

If you just want to get a listing of all of the datasets included in the package, you can use this R command, or you can look at the listing online. In addition, the package includes the datsearch() function, which allows you to search through the datasets based on their concept terms or even based on a full-text search of the help files. By default, the search is based on the concept terms, so here we would find all the datasets tagged with standardized mean differences. We can also search based on multiple terms, or do a full-text search if we set concept equal to FALSE.

Quite importantly, we want others to contribute new datasets to the package, so we have created a detailed workflow for doing so, which you can find under this link. Let me just describe some of the guiding principles here. First of all, we want the datasets to be named in a consistent manner. In addition, we want to make a distinction between the raw data and the data actually included in the package. Meta-analysis datasets are often very large, not so much in terms of the number of studies included (the number of rows) but in terms of the amount of information that is extracted from each of these studies.
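The listing and search functionality can be sketched as follows; this assumes the metadat package is installed, and the concept argument follows the description given in the talk:

```r
library(metadat)

# listing of all datasets included in the package
help(package = "metadat")

# search datasets by concept term (the default)
datsearch("standardized mean differences")

# full-text search of the help files instead of the concept terms
datsearch("tuberculosis", concept = FALSE)
```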
So we often have many variables, but not all of them may be that interesting for inclusion in the package. Therefore, there should be a raw data file and a data preparation script that takes the raw data file and turns it into the .rda file that is included in the package. Finally, we have written a prep function, which helps people to document their dataset. For each dataset, there needs to be an Rd file, which is the help file for the dataset, and not everybody is familiar with how to write these Rd files, so the function creates a template that can then be completed by the person contributing the dataset. Once you have created all these files (the raw data file, the data preparation script, the .rda file, and the help file), you can either make a pull request via GitHub or, if you are not familiar with the Git workflow, just send us the files and we will include them in the package.

At this year's ESMARConf, we have also set up a hackathon related to the metadat package. The first goal will be to take the datsearch() function and turn it into something a bit more fancy, namely a Shiny app that replicates its functionality in a slightly more interactive way. The hackathon will also be a nice opportunity to add additional datasets to the package, improve the documentation where needed, and consider additional functionality. If you would like to contribute a dataset to the package but don't know how, we will be available during the hackathon and can help you add your dataset to the metadat package. The results and outcomes of this hackathon will be reported at the end of the conference.

So that's it. Thank you for your attention. I also want to thank my co-authors of the package and all of the people who have already contributed datasets to the metadat package. If you have any questions, comments, or suggestions, down here you can see how you can get in contact with me.
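A data preparation script of the kind described above might look like this; the file names and variable selection are hypothetical, purely for illustration of the raw-data-to-.rda step:

```r
# read the hypothetical raw data file extracted from the meta-analysis
dat.author2024 <- read.csv("data_raw/author2024.csv")

# keep only the variables of interest for inclusion in the package
dat.author2024 <- dat.author2024[, c("study", "year", "yi", "vi")]

# save as the .rda file that is included in the package
save(dat.author2024, file = "data/dat.author2024.rda", compress = "xz")
```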
Bye bye.