Okay, hello everyone and welcome to ESMARConf 2022 and special session number eight, Developing the Synthesis Community. As always, the session is being live-streamed to YouTube, and the individual presentations have been pre-recorded and published on YouTube as well. Subtitles have been verified and can be auto-translated for those individual talks; automatic subtitles will be available shortly for this live stream. If you have questions for our presenters, you can ask them via the presenter's individual tweet from the @eshackathon Twitter account, which is also on the slide here. Presenters may have time after their talks to answer some of the questions, or at the end of the session if time allows; we will endeavour to answer all questions soon after the event. We would also like to remind you of our code of conduct, which is available on our website at github.io. So now I would like to introduce you to our first presenter, Wolfgang Viechtbauer from Maastricht University. Wolfgang, you're on.

All right, just a second. All right, here we go. My name is Wolfgang Viechtbauer from Maastricht University in the Netherlands, and I will be presenting on the metadat package, a collection of meta-analysis data sets for R.

One of the nice things about published meta-analyses is that they often, not always, but often, include the full data set that was used for the analysis. For example, down here you see Table 1 from the meta-analysis by Colditz and colleagues on the effectiveness of the BCG vaccine against tuberculosis. For each of the included studies, we have the number of participants that were vaccinated, those that were in the control group, and the number of TB cases in each of these groups, based on which you can compute an effect size measure and then conduct a meta-analysis. So by going through meta-analyses, we can actually build up a whole collection of meta-analytic data sets. You do not have to go back to the primary studies and extract the information from those, which would be really tedious and time-consuming; somebody has already done this. If you're lucky, you can find the data set in a table or maybe in an appendix, and then we can put it in our database. And why is this useful? Well, for various purposes: for teaching, for illustrating or testing methods, for validating the analyses that were conducted, and for sensitivity checks.

What you see here is the BCG data set as it was included in the metafor package. Again, we have the studies, and we have the same information that was in Table 1, except that instead of the number of cases and the group sizes, we have the number of cases and non-cases in the treated (vaccinated) group and the same for the control group. But this is essentially the same information. So then we can compute the same effect size measure that was used in the meta-analysis by Colditz and colleagues, these risk ratios, or more specifically log risk ratios. Now we have those in our data set, along with the corresponding variances. Then we can pass this information to a modeling function like the rma() function from the metafor package and conduct a random-effects model meta-analysis. Here I use "DL", the DerSimonian-Laird estimator, because that is also what was used by Colditz and colleagues. So we get these results; I'm not going to discuss these, that's not the focus here. We can back-transform the estimated average log risk ratio into an average risk ratio, and here we have this estimate.
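For readers who want to follow along in R, a minimal sketch of the workflow just described might look like this (the dat.bcg data set ships with metafor and is also included in metadat):

```r
# reproduce the analysis described above using the BCG data set
library(metafor)  # dat.bcg is bundled with metafor (and included in metadat)

# compute log risk ratios and sampling variances from the 2x2 table counts
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)

# random-effects model with the DerSimonian-Laird estimator
res <- rma(yi, vi, data = dat, method = "DL")
res

# back-transform the average log risk ratio to the risk ratio scale
predict(res, transf = exp, digits = 2)
```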
So on average, vaccinated individuals have about half the risk of TB compared to those in the control groups, with a corresponding confidence interval. Then we can compare these results to what was given in the meta-analysis, and we see that this matches up exactly. So we are able to reproduce the findings, or the results, from the Colditz et al. meta-analysis.

In addition to this, we can conduct some sensitivity analyses. Again, here are the results that we just looked at, but we could use the Mantel-Haenszel method, which is a fixed-effects approach, or use one of these binomial-normal models, or various Bayesian models. For example, here are the results from the bayesmeta package that we heard about yesterday. So we can compare these results, see how different or similar they are, and essentially do a sensitivity analysis.
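As a rough sketch of such sensitivity checks, reusing the dat and dat.bcg objects from the code above (the bayesmeta call in particular is only an illustration of its basic interface, with default priors):

```r
# Mantel-Haenszel fixed-effects analysis of the same 2x2 tables
res_mh <- rma.mh(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
                 data = dat.bcg)
res_mh

# a Bayesian random-effects model via the bayesmeta package
library(bayesmeta)
res_bm <- bayesmeta(y = dat$yi, sigma = sqrt(dat$vi), labels = dat$author)
res_bm
```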
Over the years, I've included more and more data sets like this in the metafor package. Each of these dots is a release of the metafor package, so over time you can see that the number of data sets keeps going up, and eventually I got to around 60 data sets. At a certain point, though, the idea arose of moving these data sets into a separate data package. Why? Well, it would make it easier to add data sets even without updating the metafor package itself: the code in the metafor package might not change, but you might still want to add some data sets. It would also make it a bit easier for others to contribute data sets; it might seem a bit more daunting to make a contribution to metafor, whereas contributing to a dedicated data package feels more approachable. So we started working on this at the 2019 Evidence Synthesis Hackathon, together with Thomas, Emily, Daniel, Alistair and Kyle, and then eventually we released the first version on CRAN two years later. There was a bit of a delay for various reasons, but eventually we got it out there. Here's the link to the CRAN page; development is running on GitHub, as everybody does; you can read the documentation nicely formatted online thanks to the pkgdown package; and the metadat package now includes 79 data sets.

The data sets all have a consistent naming scheme: "dat.", then the first author of the meta-analysis from which the data set was extracted, and the publication year. And these data sets are all documented in a consistent manner. You have a general description of the data set, a description of each variable included in the data set, details about the data set or the meta-analysis, the source of the data (typically the publication from which the data were extracted), maybe some other relevant references in case the data set has been used in other publications, then the person who extracted the data, so in case you do find a discrepancy between the published data and what's in metadat, you know whom to contact, and then examples illustrating the use of the data, which may even be a full-blown replication of all of the analyses conducted in the original meta-analysis, and finally some concept terms.

So what are these concept terms? Each data set is tagged with one or multiple of them. They may pertain to the field or the topic of the meta-analysis; you see some examples here. They may also describe the outcome measure used, so you have meta-analyses using correlation coefficients or, as we saw earlier, risk ratios, or maybe standardized mean differences, and also the types of methods that were used in the meta-analysis or that can be illustrated with the data set, for example cluster-robust inference, multivariate models, or network meta-analysis. There are some data sets with outliers, so those are quite interesting to look at, or data sets that can be used to illustrate publication bias, or evidence for it. These concept terms are really useful for finding data sets that you might find interesting, but of course they need to be used consistently across all of these data sets, and if you make changes to the concept terms, well, that's a bit tedious, because then you have to go back through all of the existing data sets and make sure they are tagged accordingly. Under this link you can find a listing of all the concept terms that have been used so far.

If you just want to see the data sets included, you can do this with the command shown here, or you can go to this link. Let me just click on this to show you. Right, so you have all these data sets, and you can take a look at them: again the description, the format, some details, the source, maybe another reference related to it, who extracted the data set, the concept terms, and then examples illustrating the use of the data set. You can also search for data sets: the package includes the datsearch() function, which allows you to search based on the concept terms by default, or do a full-text search of the help files. For example, you can find data sets tagged with standardized mean differences, or with multiple concept terms such as odds ratios and multilevel models; or, if you want to do a full-text search, you just set concept = FALSE, and then the help files will be searched based on their full text.
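A small sketch of these lookups (the exact arguments beyond the search term and the concept switch mentioned above are assumptions; see the package documentation for details):

```r
library(metadat)

# list the data sets included in the package
help(package = "metadat")

# search the concept terms (the default behaviour)
datsearch("standardized mean differences")

# full-text search of the help files instead
datsearch("tuberculosis", concept = FALSE)
```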
What we also want, and this is really one of the most important aspects here, is for people to contribute their data sets to the package. So we have set up a pretty detailed workflow for how this should go: how can you contribute your own data set if you have done a meta-analysis and would like to include it in the metadat package? There are some guiding principles here. First of all, we want the data sets to be named in a consistent manner. We also make a distinction between the raw data and the data set that is actually included in the package. Meta-analytic data sets are often large, not so much in terms of the number of rows, but the number of columns: you often extract a lot of information about the studies included in a meta-analysis, and not all of these variables may be interesting enough for inclusion in the data set that goes into metadat. So there is the raw data set that might have been extracted from an appendix, and then there is a data preparation script that takes this raw data set and turns it into a maybe slightly cleaned or condensed version that actually goes into the metadat package. In addition, we wrote a function that helps people document their data set: it creates a template for the help file, since not everybody is familiar with how to write these .Rd files, and then you just need to complete the template. So that's pretty straightforward. Once you have put together all these files, the raw data file, the data preparation script that creates the .rda file, and the help file, you can either just make a pull request via GitHub if you're familiar with the Git workflow, or just send us the files and we'll be happy to include them in the package.

At this year's conference, we have also been running a little hackathon along the way. What was the goal of this hackathon? Well, we wanted to create a Shiny app; everybody's creating Shiny apps these days, so we also wanted a Shiny app. And what does this app do, or what should it do? It should essentially replicate what datsearch() does, but in a Shiny way, in a slightly fancier, more interactive way. The hackathon was also an opportunity to add some additional data sets, maybe improve the documentation, and think about additional functionality, and I'll be happy to report on the outcomes of this hackathon at the closing session. So that's it. Thank you for your attention. I want to thank my collaborators on the package and all the people who have already made contributions to the metadat package, and if you have any questions, comments or suggestions, I'll be happy to hear them. Thank you.

Thank you, Wolfgang, for a really fantastic presentation and package, and we do look forward to hearing about the updates during the closing session today. We are going to hold off on questions until after all the presenters have gone, so don't forget to send us your questions on YouTube or by Twitter. And right now we're going to head over to our next presenter, Marc Lajeunesse, who is coming to us from the University of South Florida. So, Marc, you're on.

Hi, I'm Marc Lajeunesse, and forgive me for the candid presentation here, but I'm going to talk about three lessons I learned about trying to get undergraduates involved in screening studies for systematic reviews or meta-analyses. I've also tried to use them to help classify things for test data sets for machine learning projects. And I'll just flat out say that it's not a straightforward endeavour, as you could imagine. But what I want to start with is my motivation for doing this in the first place. Every year, or every semester, I get really, really excited when I'm standing in front of an auditorium of potentially hundreds of students, thinking, well, how can I leverage this situation into a scenario where I could steal all their energy to code, classify, and screen many, many, many studies for systematic reviews? I've tried this for about eight semesters now; I'm doing one right now for my parasitology course. So I'm going to try to describe some of the challenges that I face when getting so many students involved in this process, and maybe most importantly, how I fall flat every semester in achieving high-quality screening decisions by these students. And oh boy, let's just jump into it.

Let me talk a little bit about the population of screeners. In general, they are undergraduates. They can range from their first year at the university, straight out of high school, to seniors who have been around for four or five years. So right there you know that there is variation in confidence and experience with science, and anyone who has ever embarked on a screening project knows you really need to understand the language involved in how science is reported and summarized. So the first challenge is that undergraduates tend to vary quite a bit in their expertise, or I wouldn't say interest, but their confidence in making screening decisions.
The range of classrooms that I've experimented with, and these are not formal experiments, this is just me getting excited each semester and trying to get students involved, is about 40 to 200 students. And it doesn't really matter the size, or the number of students involved; the challenge is making sense of what they screen and what they achieve. The next thing I want to discuss is the number of studies that I've attempted to screen, classify or code with the students, and I haven't been overly ambitious with this; it's usually about 300 to 2,000 studies. The reason I'm not super ambitious, even when you have 200 people ready to go, is that it becomes almost impossible for me to verify the decisions they make. If we went super ambitious and tried to screen 10,000 studies, the quality control would make it very inefficient for me to proceed with the other phases of a research synthesis project, because I have to verify and check and make sure that everyone is on the same page in terms of making good decisions about which studies to include or exclude for the systematic review.

Let's begin. Here is the first lesson I learned by using hundreds of undergraduates to screen studies: dual screening designs really do not make sense in this scenario. A typical dual screening design is that you have two reviewers screen the same collection of studies, and then you use, say, a kappa statistic to assess consistency between the reviewers, how they agree or disagree on what should be included or excluded. This really breaks down when you have hundreds of students, because the paired design is not adequate to create high-quality, repeatable decisions. The reason is that not all students are on the same page, or at the same level, in making the screening decisions. No matter how much work I put into prepping them, getting them ready with some sort of PICO framework of questions to answer while they're screening the title and abstract, there's always a cohort of students, I mean, that are just not making effective decisions. And because we tend to approach the screening process as part of a project, we have many phases to go through in a semester, so I really don't have much time to train them and test them over and over again to make sure that everyone is on the same page. So dual screening really doesn't work, because what you end up having to do is drop entire clusters of studies with poor kappa statistics, collections of studies with complete inconsistency in screening decisions. That makes the whole process inefficient, because we have to revise and review these things over and over again. When I first started this, I would repeat that process three or four times: what would typically be one screening bout would snowball into a second and a third screening bout, just to double- and triple-check the screening decisions.

So now I've converged on a totally different approach, where I'm not evaluating groups based on consistency; I evaluate individual studies on consistency. What I do is take a random sample of the students, say about 20 students in a class, and they each independently evaluate a single study. That becomes my measure of decision consistency. It's kind of like a sampling experiment where I'm having many, many students evaluate the same title and abstract for inclusion or exclusion.
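To make the contrast concrete, here is a hypothetical R sketch (not the speaker's actual code; the data structures and numbers are made up) of the two approaches: a pairwise kappa for a dual-screening design versus a per-study inclusion proportion when many students screen the same record:

```r
library(irr)  # provides kappa2() for Cohen's kappa

# dual screening: two reviewers, same studies (1 = include, 0 = exclude)
ratings <- cbind(reviewer1 = c(1, 1, 0, 1, 0),
                 reviewer2 = c(1, 0, 0, 1, 1))
kappa2(ratings)

# crowd screening: roughly 20 students per study, one decision per row
decisions <- data.frame(study   = rep(1:3, each = 20),
                        include = rbinom(60, 1, prob = 0.7))
prop_include <- tapply(decisions$include, decisions$study, mean)
prop_include  # studies near 1 or 0 are consistent; those near 0.5 need review
```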
And then if the consistency statistic is high, it probably means that the decision to include or exclude that study is valid. This leads me to my second lesson, and this one is by far the one I have the most trouble with of all. I'm going to need some time to explain this, because it's not at all intuitive. There tends to be high consistency in what to include when screening studies: students agree very strongly on what to include. What has the most variability, or variation, in screening decisions is what to exclude. In an ideal scenario, the reliability of inclusion and exclusion decisions should be similar. However, what we see here is that students really hesitate, and are challenged, when making exclusion decisions. Forgive this cloud of data points, but this is the nature of using undergraduates for screening decisions. The two axes here represent two bouts of screening decisions on the same studies: one week we screened 250 studies, and the other week we re-screened those same 250 studies. So this is the repeatability of the decision on whether or not to include or exclude an individual study. Each point here is an individual study, and about 20 students made decisions on whether or not to include or exclude that study. Based on the two bouts with the same studies, you can see there's a lot of noise, even though they're doing the same thing over again. Decisions are not highly repeatable; you can fit a line through this, and there is a correlation between the two bouts, but the important thing to emphasize is that there is high agreement on what to include between the two separate bouts, while there is a lot of noise associated with excluding studies. So even though the PICO statement is very useful for deciding what criteria you want to hit when you're reading a title and abstract, it doesn't really help you much with making a decision on what to exclude, especially if you're not fully confident in understanding scientific summaries. An abstract is a very unusual type of writing: it's concise, it's short, and it tends to be dense with jargon, because you have to squeeze as much information as you can into a small amount of space. I feel like this makes it more difficult for undergraduates to read and understand what's happening, especially when we're trying to do a research synthesis project.

This leads me to my final challenge or lesson learned, and this is by far the dimension that I've experimented with the most, again with informal experiments: the tools will break. When you have 200 students, and they're using their phones, maybe their parents' computer, maybe computers at the library, to make screening decisions, that inconsistency makes it difficult on my end to make sense of what they achieve. So over the years I've experimented with many, many different things. I started off using R-generated HTML forms that populate Google spreadsheets. But what happens with that is that not all browsers are friendly or open to form-fillable things, and because people are using their phones and different tools to browse the HTML files, this creates a lot of hiccups when assignments are due. Usually there's a cohort of students who have problems submitting their screening decisions because of some technology issue. Another approach I've used is PDFs. You generate a collection of PDFs, and they're form-fillable, fantastic. However, once the PDFs are in the hands of students, they're using different applications to read them, and when they save their work, it's in a totally different format. When it ends up in my hands, that results in a bunch of implementation issues, because PDFs are by far the hardest files to crack open and mine information from, especially form-fillable data: depending on the application used to save the form, the filled-in fields may be in a totally different format, and R is not really set up to process all these variations in form-fillable output.

So what I've converged on now, and it's really just ridiculous acrobatics, is that I use Canvas, which is the software the students use to take exams, watch lectures, all that stuff, and I bamboozle it into serving synthesis projects. In R, I take a CSV file of abstracts and titles and convert it to, I always forget what the format is called, I've got it written right here, QTI format, which is the only way Canvas accepts questions, so the screening becomes quizzes that the students take online. This is by far the most efficient way for me to do this, even though it's harder for me to implement. It's easy for them, because they have to use Canvas for all of their lectures anyway, and when they're done with the assignment they can forget about it, and I get the results in a reasonably consistent and convenient way, where I don't have to worry about differences in how they saved or submitted their outcomes. It's somewhat consistent, although, personally, it is a ridiculous endeavour.
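As a hedged illustration of that CSV-to-QTI step, one possible route in R, not necessarily the tooling the speaker uses, is the exams package, which can turn simple exercise files into a QTI archive that a learning management system like Canvas can import. The file name, column names, and the dummy "solution" below are all assumptions for the sketch:

```r
# hypothetical sketch: turn a CSV of titles/abstracts into Canvas-importable quizzes
library(exams)  # the R/exams package provides exams2qti12() / exams2qti21()

studies <- read.csv("abstracts.csv")  # assumed columns: id, title, abstract
dir.create("qti_out", showWarnings = FALSE)

# write one simple single-choice exercise per study (include vs. exclude)
files <- vapply(seq_len(nrow(studies)), function(i) {
  f <- sprintf("screen_%03d.Rmd", i)
  writeLines(c(
    "Question",
    "========",
    paste0("**", studies$title[i], "**"),
    "",
    studies$abstract[i],
    "",
    "Should this study be included in the review?",
    "",
    "Answerlist",
    "----------",
    "* Include",
    "* Exclude",
    "",
    "Meta-information",
    "================",
    sprintf("exname: screen-%03d", i),
    "extype: schoice",
    "exsolution: 10"  # dummy solution; grading is ignored for screening tasks
  ), con = f)
  f
}, character(1))

# bundle everything into a QTI 1.2 zip file that can be uploaded to Canvas
exams2qti12(files, n = 1, name = "screening_round1", dir = "qti_out")
```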
So, there are many places to go with this, and I feel like maybe the greatest inroads could be made technology-wise, by putting things in the hands of students that are straightforward and simple for making screening and coding decisions, independent of what device they use to make those decisions and how they submit them. I know there are a lot of neat resources out there to do this, but I feel like they tend to be targeted towards researchers, people with a lot of experience in making these decisions and with the technology. These tools don't quite fit an undergraduate who is probably 5% committed to the course, realistically, right? They have five courses going at once, and they have their professor forcing them to make screening decisions. What we have to do is streamline that whole process to make it as simple and straightforward as possible for them. So that's it; I'd like to end it at that. These are some of the lessons I learned. There are certainly more things that I've learned, but I'm doing a shout-out here: I have many ideas on how to do this, but it would be fantastic if I could collaborate and figure this out, crack this nut, with many other people, rather than just doing it on my own. Yes, what I'm bringing forward is experience and expertise in not successfully implementing this strategy for screening studies. So again, there's a lot of room to grow, and I think I'll end it at that. Thank you for watching.

Thanks, Marc.
That was a fantastic presentation, and it really highlighted some of the challenges that I've come across in my own experience, but I haven't actually nailed them down to those three issues the way you have. I'm so glad that you brought it to the hackathon, because this is, I think, the perfect group to work on those challenges, so hopefully when we do the Q&A there will be lots of time to talk about that further. So now we're going to move on to our last presentation. I'd like to introduce David Hobby and Alexandra Bannach-Brown, both at the Berlin Institute of Health, and they'll be going next.

I'm David Hobby from CAMARADES Berlin, based at the QUEST Center of the Berlin Institute of Health. I'm going to discuss a web app that we have built for teaching meta-analysis in R. CAMARADES Berlin is a group located at the Charité in Berlin that promotes the benefits of systematic review and meta-analysis of animal studies. CAMARADES stands for Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies and is part of an international network of pre-clinical researchers. We do our own research and we also provide methodological support for researchers to perform robust, high-quality reviews. One of the ways we do this is by...

Sorry everyone, it looks like we're just having a problem with the presentation on our end. It'll just be a minute while we sort that out.

During the pandemic, the tech hurdles which have always been present when teaching this material were exacerbated by the remote learning framework. Most of the participants in our workshops come from disciplines such as biomedical and medical research or epidemiology and are generally not familiar with programming languages or statistical programming software. In previous iterations of the course, we would provide an R script and some CSV data files and lay out the technical requirements in advance, with the expectation that all required software would be installed and functional on the day of the session. However, some subset of the participants would often arrive without having read the requirements, or without having installed R or the required packages, or with out-of-date or conflicting packages. This often turned the first hour or hours of the course into a troubleshooting session, where we'd have to go around and figure out what was going on for each individual person who was having issues.

In light of this issue, we had the idea to build a self-contained web app based on the learnr package, which walks the user through the steps to recreate the analyses of a published meta-analysis of animal data in the biomedical field, involving controlled intervention studies measuring infarct volume. This publication used all of the meta-analytic methods which we would like to teach, namely random- and fixed-effect models, meta-regressions, and visualizations of heterogeneity with study design characteristics, that is, forest and bubble plots. The two R meta-analysis packages which we used, meta and metafor, contain the necessary functions for performing these analyses. In order to present these analyses in a useful way, we also used two additional packages: Shiny and learnr. Shiny is a package for the creation of interactive web apps in R, which, when combined with learnr, allows R Markdown documents to be converted into interactive tutorials with live code exercises.
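For anyone curious what that combination looks like in practice, a minimal learnr tutorial skeleton is just an R Markdown document along these lines. This is a generic example, not the app's actual source, and it reuses the BCG data from earlier rather than the app's animal data:

````rmd
---
title: "Meta-analysis in R"
output: learnr::tutorial
runtime: shiny_prerendered
---

```{r setup, include = FALSE}
library(learnr)
library(metafor)
```

## Fitting a random-effects model

Edit the code below and click "Run Code" to see the output in the browser.

```{r fit-model, exercise = TRUE}
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
rma(yi, vi, data = dat)
```

```{r fit-model-hint}
# shown when the participant clicks the Hint button
rma(yi, vi, data = dat, method = "REML")
```

```{r quiz-1}
quiz(
  question("Which model assumes one single true effect size across studies?",
           answer("The fixed-effect (common-effect) model", correct = TRUE),
           answer("The random-effects model"))
)
```
````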
Here, for example, is one code chunk taken from section one of our app, where the user is asked to examine a data set that was just loaded. When you click Run Code, the code is executed and the output is displayed in the browser. After each major section, we also have quiz questions to check for understanding, with immediate feedback, so if you get the wrong answer, you can try again until you get it right. In section five, participants are asked to write their own code, based on what they have learned in the preceding sections, in order to answer a set of questions. If they get stuck, the Hint button will point them in the right direction. Other features that can be implemented in learnr and Shiny are embedded videos and interactive Shiny elements. We didn't use either of these in the current iteration of the app, but it could be possible in the future.

Using a teaching app allows us to sidestep any software-associated problems that individual students may have, so we can focus on the actual material and on the principles of meta-analysis. The app runs in a browser window, which means that from the student's perspective, it is an interactive website. The course is presented as a step-by-step tutorial with interactive code exercises which walk the student from loading the data through to performing the final analyses. At each step, there are short quizzes to check comprehension. Progress is saved as the user moves through the app, so that they can leave and come back whenever they want. All code is run on our own server, and we make sure that all packages are up to date to ensure full functionality.

This does, however, present its own issues on our side. The biggest issue we've had is performance. Our courses have between 10 and 35 participants, which means an instance of R running on our server for each participant, and even with lower numbers of participants, we have had slowdowns and crashes when everyone is running analyses at once. Secondly, some participants reported that the user interface is still quite unintuitive and could be streamlined. One of the sections in particular is very long, and it is difficult to tell how far you are through it; we could probably break this up to make it more intuitive. We welcome other feedback about the app in general and about the user experience, in order to improve it.

The biggest benefit, and the primary goal of using this approach, is that we have had zero technical issues on the participants' side since we have been using the app. It has lowered the barrier of entry for those students without prior experience of or exposure to programming languages, and we have received generally positive feedback from our students, with the majority reporting that it was easy to use for the purposes of our tutorial. We provide access to the tutorial 24/7 on our website, which means participants can access it at any time to review the material, and it is accessible to anyone else who may be interested. It was developed collaboratively via GitHub using free and open-source software. And lastly, it has been shared with other CAMARADES locations: our colleagues at Edinburgh are also using the app in their tutorials, and they are able to run it on their own server after downloading it from GitHub.

So the app is not yet in its final form. The next steps are, firstly, to optimize server usage to improve performance when many users are running it; we have already optimized caching and we are investigating methods of parallelization.
This will reduce, and hopefully prevent, the slowdowns and crashes that we have experienced so far. Secondly, we are going to clean up the user interface to make it easier and more intuitive to walk through; the one very long section that I mentioned earlier could easily be broken up into at least two sections as a starting point. We don't want to remove any material, as we believe it is important to present the full analysis from the published paper, but we could definitely arrange it in a more digestible way. Lastly, we would like to refine the questions and provide more in-depth feedback as to why answers are correct or incorrect. Thank you very much for your attention. I would like to thank everyone on the CAMARADES Berlin team, our funders at Charité 3R, and the developers of the meta, metafor, Shiny and learnr packages for making this project possible. And again, we welcome any feedback regarding the app or the user experience, and I'm looking forward to the discussion and the questions coming through on Twitter.

Thank you, David and Alexandra, for that wonderful presentation. And I'm delighted to say that we have all the presenters here for live questions, so please send your questions through on Twitter. If the presenters could turn on their videos and microphones, we'd love to see your faces. I think I'm just going to get questions started for Marc; this one is from Neal Haddaway, and it's similar to a question I had about crowdsourcing. The question is: should we be gamifying the screening experience for crowds that are involved in reviews, as opposed to having coordinators manage the projects? Do you think that would be a good solution?

Exactly what I'm up to is really gamifying, in that you're using a bunch of non-experts to make decisions on what to include or exclude. And I'd say maybe it's possible; I have not succeeded with it. The idea is to get a lot of eyes on a single study. With dual screening, you just have two people making decisions, and that's totally inadequate. As far as I can tell, 20-plus screeners kind of washes out the few reviewers that really have no clue what decisions they're making. So should we do this? I think if you have the capacity to have many people screen a single study, then it's probably okay. The challenge is that you're spreading a lot of effort so that a single study gets a lot of attention, and if you have thousands of studies, it becomes a distribution problem for making screening decisions.

Yeah, great point there about resource waste. Wolfgang, I see your hand up, and then we'll go to Alexandra as well.

Yes, so Marc, very interesting, I really like this idea. In meta-analysis we often worry about dependencies these days, right? It's a big topic. So how do you avoid the students getting together and doing the coding together, so that you end up with dependent codes?

Well, okay, so every student gets a random collection of studies, so maybe 20 students out of a class of 100 will get the same study. But how would they coordinate that, right? I do have an open discussion board for difficult cases, and students typically vote to make a decision on whether or not to include or exclude. But yeah, you're right, there is that challenge.
But as far as I know, it doesn't really happen, because, again, each student has something like 30 titles and abstracts to screen, a totally random collection drawn from the sample population. Unlike a dual screening design, where two people have the same collection and could coordinate, when it's a completely random sample, coordinating on individual studies would be really tough for them to manage. Although, who knows. Alexandra?

Thanks, Marc. I just wanted to share some experience that we've had with crowdsourced meta-research projects, on work that's been published. We have used pre-training, so reviewers who want to be involved in the project need to pass some level of training beforehand, to check comprehension of things like the PICO, that kind of thing. A platform like learnr could be used to do that: it can record how many people have passed or not passed, you're not able to move to the next section until you do pass, things like that, and as an instructor you can download class results and individual student results as well. I also think your dilemma about randomly presenting articles for screening is really interesting. One piece of software that we use presents the articles for title and abstract screening in random order, and we have it set at two agreed decisions, so that's whether there are three people involved or whether two people initially agree. You could also increase that, to say that five people need to agree in order for a paper to be included or excluded, so you can play around with how many screeners are required.

Yeah, the challenge I face is that, because things aren't pairwise anymore, instead of a binary include/exclude, I now have a proportion, a range from minus one to one. That's basically what this cloud is: for this individual study right here, about 20 students on average made a decision to include it. So what I do is convert this into a space: if a study occupies this region, it gets included; if it occupies this region, it gets excluded; and if it occupies either of these competing or complementary regions, the whole study gets reviewed again, which, again, is totally inefficient.

Sounds like a really complex algorithm. We usually work with three screeners, and then the agreement has to be above 66%, but with 20-plus screeners that seems like...

The even weirder thing that I did not talk about is that the number of screeners assigned per study is also random, so you could have anywhere from 10 to 50 students screening the same study.

Despite the challenge, I feel like I need to collaborate and harness some of your student power, because it would be fantastic to have this body of students who are willing to do this work.

That's why I get excited every semester; I'm like, holy smokes folks, let's do some science, let's push things forward. But then, you know, not everyone is motivated, not everyone is committed. Everyone is excited at the beginning of the semester, but when the semester is nearing its end and they have to worry about exams and all that stuff, they really don't want to be spending time screening studies.

That makes a lot of sense. Oh, Alexandra, go ahead.
Sorry, I just have another question on that point. I don't know how you set up your course in terms of how much this contributes to class credit, but something that's worked really well for us in crowdsourced projects in the past is having a leaderboard that is updated live: who's done the most screening, who's contributed the most. People are naturally really competitive, so that works really well for us.

I do that when we reach the stage of finding the research articles, where it ends up being a giant competition over who can find PDFs of certain things. And if something does not get done, like a student not submitting their papers on time, then that effort gets redistributed as bonus points for the class, as a bounty to get things done. Yeah, I love all this stuff, but for me the implementation has always been the real hiccup.

Wolfgang, I saw your hand raised as well.

Yes, I would like to address David's presentation for the moment, because I think this is also absolutely fantastic. For about a year I've had on my to-do list for metafor to create a swirl tutorial or a learnr tutorial, so thank you, I can scratch one thing off my list. It really is absolutely fantastic. I was wondering how easy it is to take what you have set up and plug a different data set into the learnr tutorial you have created; that's my first question. And the other one: I believe at the moment you're not using this for grading or anything like that; the quizzes are really a teaching tool. But have you also considered making this part of a grading type of system, where you can also give people quizzes that they're graded on?

Okay, yeah. Regarding your first question, it shouldn't be too difficult to switch the data set; it only took us a matter of days to get this up and running from the previous script that we had. So I could see it being quite straightforward to switch over.

And I have a question, actually, for both Wolfgang and David; this is from Neal, in terms of thinking about the resources that you both have provided to the community. Do you think there's a way we could combine this work to really vastly scale training in evidence synthesis globally, and how could we maybe do that? Because these are great resources, right, but it's about thinking about that pipeline. I'll let either of you answer.

Well, if I may: if it's not so difficult to swap out the data set, I could almost imagine a system where the user can choose a data set, for example out of the data sets in metadat, and the learnr tutorial is then created dynamically using that particular data set. It would take quite some effort to really automate this process, and the types of issues and methods that you can illustrate will differ between data sets. But that would be fantastic, right: choose one of the data sets that you are really interested in learning more about, which becomes more motivating if it's a data set you actually care about, and a learnr tutorial is automatically created on the fly, tuned to that data set. That would be amazing.

Yeah, that would be fantastic. Alexandra, I see you have your hand up.
Yeah, no, that would be really cool. I can definitely see many potential avenues for collaboration in the future: creating a modular Shiny app where these learnr components are integrated, so that, depending on what the effect measure is, or what the heterogeneity investigation is, like subgroups or meta-regression, those kinds of things could be swapped in. That could be really cool, and it would make it more accessible. Our app is obviously for teaching preclinical animal systematic review, so a lot of the information is specific to that, but we use a published data set as an example so that people can recreate the analysis from it. On that note, Wolfgang, I wanted to ask if we could add a new effect measure to your metafor package, the normalised mean difference; then we wouldn't need to hard-code it ourselves and could use it straight from your package. But we can talk more about that later.

Well, just get in touch; I'm happy to hear what you have in mind.

Thanks, everybody. I'm just double-checking for any questions that are coming in through the chat. Here I had a question regarding pulling data sets into metadat automatically: whether we would ever be able to reach out to data that's published on preprint servers and pull it into this system. I don't know if that's a possibility; maybe a question for Wolfgang, but Marc is shaking his head.

Yes, so you're saying that would be ideal? I should not answer this; I don't think I could do that, but Wolfgang might.

I mean, if you have a CSV file or whatever lying somewhere on OSF, that's one command to read it in. But if you want a data set properly in metadat, you really need to document the data set, and that's really the more tedious part: you have to create a code book for every variable in the data set. But that's the formality that we really want; we want well-documented, well-described data sets. It's easy to just read in some data, but then you're looking at 50 variables that you have no idea what they mean. So there's a little bit of extra effort involved.

And the next step from that is having these file formats for meta-analysis data sets as a standard for publishing. So yes, future goals: that there is this documentation and metadata for all data sets, so that we can use the same code and scripts. Those are the dreams.

So when you integrate a data set into your package, are you the one doing all the consistency checks and formatting, or do you get help?

I hope people really supply the data set, the data preparation script that turns it into the .rda file, and the help file. Even if you don't know how to write R help files, there's the helper function that sets up the template; it even looks at the data set, figures out what the variables are, whether they are numeric or character variables, and sets all of that up for you. Then you just need to complete the description of each variable and, of course, the title, and you need to add the reference and so on. But that's just copy-paste from another example and then adjusting, so it's not super complicated. But yes, my collaborators and I do want to take a look at these submissions, to make sure that they are well described and clear.
But yeah, we will be happy to work with you on that. If you have a data set and you just need help getting it into metadat, just get in touch and we can get you started.

That's a great offer, and I think that's one of the things I find so helpful about your work. I could actually see someone like Marc, if he had a group of students who were a little bit more advanced, starting to use your metadat structure, creating a data file from that systematic structure, and then using it as a teachable moment: okay, maybe you got it wrong here, but these are the elements we're looking for to have transparent and open science, and this is why it's important. So I see all these tools fitting together to help develop this larger evidence synthesis community.

Yeah, annotating data sets for reuse is a really important learning outcome. It's surprising how many things are floating out there with little or no information, and you're left to figure it out. That's why it's so important to have these packages, because that data has been filtered and reviewed in a consistent way. It's absolutely fantastic.

Alex, I thought I saw your hand go up, maybe a minute ago, or maybe I was wrong.

I'm not sure whether the questions coming in from Twitter are being picked up. Are there more questions from Twitter at the moment, or do we have time? I just wanted to ask Wolfgang and Marc about your experiences with teaching R, and I guess R packages, to a wide range of audiences. Is there help material that you feel works better for students at different stages? How can we optimize the learner experience, either for people who want to learn meta-analysis but are new to R, or for people who know R but are new to meta-analysis? Do you have any experience of whether those groups of students approach things in different ways?

Good question. Yeah, that's a tough one. I teach a lot of meta-analysis courses, and there I do not really assume that people have a lot of R experience, so I focus on just the minimal amount necessary to really get started with meta-analysis; of course, having some background with R is super helpful. And that tends to be the audience that I have: people who really do not have a lot of R background and just want to learn how to run their meta-analyses in R. In my experience, what you have to do then is really minimize the R code. For example, with metafor you can create forest plots, but if you want that forest plot to look nice, it's like three pages of code, right? I openly admit that this is how it is in metafor. So I don't focus on that, because it would just be total overload with all the nitty-gritty details of how to create nice figures in R; I show them the basic plot and just show them that, in principle, you can customize it to not look so ugly. So you have to tune your teaching to the audience; that's super important. But beyond that, it's hard for me to say how I would approach people who are super experienced with R. I don't know if anybody else has any ideas.

I've tried to do it,
where there are multiple learning outcomes in the course: you start by learning R, then you learn stats, then you do the two together, and then you move on to meta-analysis. And I feel like nobody walks away knowing anything with that. Maybe they learn something, but they don't really know how to tinker with R; it's just too much all at once. If you could separate those learning outcomes, I think students' motivation would increase, because they don't have so many giant benchmarks or hurdles to get through before they can understand the next level. And again, I'll say that I've been very unsuccessful in doing that, so now I try to keep those things isolated. I wish there was this universal course you could take where you learn stats, you learn R, you learn meta-analysis, you learn reporting. But that's just a lot; that's like a multi-year course.

Maybe. It probably would be. I don't know. I'm going to have to end the session, as we're out of time, but I want to thank all our panelists and presenters. Thanks for a great discussion and for sharing your really valuable work with us and with the larger audience. I want to remind everyone that the conference closing session today will be at 2 p.m. GMT, so in about half an hour; we look forward to seeing you all then, and thanks for participating.