Hi and welcome to ESMARConf 2023, the Evidence Synthesis and Meta-Analysis in R Conference. We're delighted to have you here with us this week. We've got an incredibly exciting and packed schedule full of workshops, tutorials, panel discussions and presentations, and we hope that you're going to enjoy it as much as we have enjoyed getting it ready for you. If you want to know more about what's on, you can check out the program at www.esmarconf.org, and you can follow us on Twitter at @ESHackathon. Before we get stuck into the conference program itself, I wanted to give you a bit of background on ESMARConf and a little of our history. We established ESMARConf in 2020, and since then we've run two events: ESMARConf 2021, where we had a little over 500 registrants, and ESMARConf 2022 last year, where just over 850 people registered. The aims of ESMARConf are to build a community of practice on the use of R for evidence synthesis and meta-analysis; to support the development of, and showcase, novel tools and frameworks for evidence synthesis and meta-analysis in R; to build capacity for the use of R in evidence synthesis and meta-analysis; and to raise awareness of the need for rigour in evidence synthesis and meta-analysis. Although there is a focus on R in a lot of what we do, many of the tools that we showcase are provided in a way that doesn't require knowledge or previous experience of R, and many of the training events that we provide are not specific to R: they're about evidence synthesis and meta-analysis methods more broadly. So it's really meant to be a very welcoming community, where anyone with any level of experience of evidence synthesis or R can learn a bit more in a safe and welcoming environment. This week we have presentations on packages designed to assist reviewers across evidence synthesis stages from planning to communication, and demonstrations that integrate evidence synthesis packages into an interoperable pipeline in R.
We've got novel applications of R packages in an evidence synthesis context, reports of automating evidence synthesis methods in R, and of helping newcomers to R perform evidence synthesis with the aid of graphical user interfaces. So although, as I mentioned, there is an R focus, there is also a focus on making the tools available in R accessible to people who don't have a background in it. We also have a range of training workshops provided throughout the week. And we do intend to start some hackathons. We don't have any hackathon projects running live this week (we did last year), but this year we're hoping to kick some off. So keep an eye on the conference program and you will see opportunities to dive in and join a hackathon, whether it's one specific to producing an R package or a Shiny app, or something that isn't about R at all and is more about hacking an idea or a framework. So don't worry if you're not an expert in R or a coder: some of those opportunities arising during the week will be relevant for you. Before we get started, I also wanted to say thank you to the organizing team who've helped to make this event a success, and indeed to everybody who's provided a presentation or a recording, or is running a workshop. A huge amount of work has gone into this event and I'm personally very thankful for everybody's efforts. Our host, as always, is the Evidence Synthesis Hackathon. My name is Neal Haddaway, and I established the Evidence Synthesis Hackathon with Martin Westgate in 2017. Its aim was to host hackathon events to help develop frameworks and tools, and so far a little over 30 projects have been developed under its auspices. We also run the annual ESMARConf, which focuses on training, showcasing and collaboration.
We have a growing library of tools that includes some things you may be familiar with, like predictor, robvis, PRISMA2020, citationchaser, metadat and EviAtlas. If you want to check out more, please visit www.eshackathon.org. I also wanted to give a shout out to our funders. This year we're still supported by Code for Science and Society, which gave a very generous body of funding last year to support ESMARConf 2022, and we are still being supported this year. The funding they provided helped to fund bursaries for caregivers and people with resource constraints, and also to have subtitles verified. So thanks to CS&S for that. I also wanted to thank our donors and the people who've registered for the conference. Voluntary registrations have provided a huge amount of funding this year, and we're now looking at being self-sustainable. This is a really exciting development, so thank you all for registering and for donating throughout the year as well. I also wanted to give some important notices before the conference properly starts. We have an accessibility policy, and under that policy we ensure that ESMARConf is fully online and that you can watch it live and on catch-up. It's worth noting that maybe 10% of people watch live and, in our records, 90% of people watch on catch-up. So whether you're live with us today or watching at another time, perhaps later this week or later in the year, you're very welcome, and we hope that the ability to watch in your own time and at your own pace makes the event more accessible to you. We do use English as our primary spoken language, but all of our recordings are subtitled, and those subtitles have been verified manually, which means they're auto-translatable. So far this seems to work pretty well.
If you click on the closed captions (CC) icon in YouTube, you can select subtitles, then choose auto-translate and pick a language, if you would like to see subtitles in a language other than English. Under this policy we always include costs for translation, and where possible signing as well, and we prioritise these in the grant applications that we submit to try to generate new funding for ESMARConf. Unfortunately this year we don't have funding available for signing services, but we hope that the subtitling goes some way to making the event as accessible as possible. You can provide feedback about the conference in any format you prefer. We have a feedback form for the conference, so look out for that on the ESMARConf website if you'd like to give us feedback; you can also send us a DM on Twitter, contact us on Slack if you're registered, or send an email to one of the conference organizers (details of who we are are available on the website). Some important notices, too, about our code of conduct. We have several commitments: people will be treated with dignity and respect regardless of age, disability, gender reassignment, marriage or civil partnership, pregnancy or maternity, race, religion or belief, sex or sexual orientation. At all times people's feelings will be valued and respected; language or humour that people find offensive will not be used, for example sexist or racist jokes, or terminology which is derogatory to someone with a disability. No one will be harassed, abused or intimidated on the grounds of their race, nationality, gender, sexual orientation, gender reassignment, disability or age. Incidents of harassment will be taken seriously.
If you want to raise a concern or a complaint, you can contact an organizer of the Evidence Synthesis Hackathon, which is me or Martin Westgate, or of ESMARConf, if you feel you've been treated unfairly on the grounds of a protected characteristic; you're encouraged to raise these concerns with an organizer, which you can do anonymously if you like. You can email the conference organizer (me) at either of those email addresses, or you can use the anonymous form at bit.ly/esmarconf_feedback. We'll investigate all submissions thoroughly, and you can find more details about our code of conduct at bit.ly/esmarconf_access, which includes our accessibility policy as well. So we do ask, if you're using Slack or Twitter and engaging with someone else from the conference or one of the organizing team, that you bear this in mind; we try to maintain a really welcoming and positive community here. I also wanted to give some details about the number of people, presentations and content that we had last year, because it's now been over 12 months since our last event and I wanted to give you a bit of an update. As I said before, we had 863 registered participants last year. We had 29 presentations across eight special sessions, and six workshops. As always, it was 100% free, and here's a link to our YouTube channel if you want to share that. Across the year we have had 7,200 views of last year's conference materials, across 353 subscribers as well. During the conference itself last year, which ran over four days, we had 861 unique viewers totalling more than 3,000 unique video views, and we gained 142 new subscribers during that week. So welcome back to all of the returning subscribers to this year's ESMARConf, and welcome to all of you who are new as well.
I also wanted to show the viewing figures for ESMARConf over the last 12 months, beyond the days immediately following the conference, and you can see that we really are seeing a continued level of interest in this content across the year. People are still engaging with the talks day after day, week after week and month after month. So thank you so much for engaging with it as participants, and if you've engaged as a presenter, we're incredibly grateful: the materials in the conference are clearly really interesting and really useful for people, which is a really exciting and great thing to see. I also wanted to give a shout out to our most popular video from last year, which was a workshop on searching that Alison Bethel from the University of Exeter Medical School provided for us. So far that workshop has had over 850 views, corresponding to 12% of the total channel views from last year. So well done Alison; it really goes to show how content like this is really important and really vital. These numbers are increasing, and we're really excited to see more people engaging with and benefiting from this content. So thank you again. Moving on to this year: we again have 29 presentations, across six special sessions: one on planning, collaboration and review management, two on searching and record management, two on quantitative synthesis, and one on data visualization and communication. So you can see that R isn't just about meta-analysis, and this conference definitely isn't just about quantitative synthesis. There's some really important, really interesting content on that, but it really goes to show how interested people are in all of the stages of evidence synthesis and meta-analysis around a quantitative synthesis. So thanks so much for that interest, and thanks for submitting your abstracts. We've also got an amazing 10 workshops.
We have Meta-Analysis with R, run by Wolfgang Viechtbauer, which was a satellite event that ran last week. Then, as part of the core conference content, we've got a workshop on testing for, adjusting for and reporting publication bias; one on wrangling large teams for research synthesis; one on reporting guidelines for transparency in evidence synthesis; one on testing automated de-duplication methods in evidence synthesis; one on network meta-analysis using the R package netmeta; one on screening studies for eligibility; two on GitHub (an introduction to GitHub, and advanced Git and GitHub); an introduction to R Shiny for building graphical user interfaces; and one on a framework for critical appraisal and appraisal tools in systematic reviews. So a really diverse set of workshops and really exciting content there, thanks to the workshop organizers who've been able to provide that for you all. We also have nine panel discussions: one on considerations around tools for information retrieval, including text analysis; how we scale evidence synthesis education and capacity building; controlling for publication bias; building a community of practice; the benefits and challenges of taking part in a hackathon; stakeholder engagement in evidence synthesis; the role of rapid reviews in the R evidence synthesis ecosystem; a Q&A with coders; and a session on barriers to open synthesis and how to remove them. Along with that, we have over 55 ten-to-twenty-minute tutorials. Each tutorial introduces a new R package, so this is incredibly exciting content. You can dive into these as and when you want to learn more about how to use a particular package, and they are interactive: there's often data and code available, and these video vignettes really aim to make it as easy as possible for you to use these packages.
We're incredibly thankful to the enormous group of tool developers who've put these together for us, and you can see just some of them highlighted in this word cloud here. So how exactly is the conference working this week? Well, we've got workshops that are being held via Zoom, which some people have registered for. These are live-streamed to YouTube, but the people who've registered via Zoom will be in a webinar and will be able to ask questions and get answers live. We then have special sessions, which are being premiered live on YouTube. These feature a range of different talks on evidence synthesis, meta-analysis and R. The individual presentations have been pre-recorded as well, so as well as being able to watch the sessions premiered live or on catch-up, you can watch each individual talk that's part of these sessions and do a deep dive into whichever talks in the program you're really interested in. The tutorials are released in the morning and evening of each day, half a dozen or so each day, spread out across the week, and you can watch these on catch-up as and when you want. You can ask questions via a Twitter thread: each presentation will have its own thread, and you can, for example, check out one of the threads here and hit the comment button to ask a question; hopefully the presenter will be able to answer directly on Twitter, or we'll pass their answer on. If you're registered, you have access to our exclusive Slack channel, and you can ask questions and engage with other participants and presenters directly there. So that's it for the introduction to ESMARConf 2023. We've now got an incredibly exciting keynote by Shinichi Nakagawa, who's going to talk to us about the future of meta-analysis from his perspective as an evolutionary ecologist. Over to you, Shinichi.
Hello, my name is Shinichi Nakagawa, and I'd like to thank the organizers for inviting me and for organizing this awesome online conference. Today I'd like to tell you about the future of meta-analysis from my perspective as an evolutionary ecologist, or evolutionary biologist. First, I'd like to acknowledge my lab members at the University of New South Wales in Sydney, and also my important collaborators, my colleagues Will Cornwell and Corey Callaghan. Both used to be at UNSW, but Corey has moved on to the University of Florida as a new assistant professor. So first I'd like to tell you about meta-analysis beyond literature-based data, and give you a really good example. You may be thinking this sounds like individual participant data meta-analysis, IPD meta-analysis. It's close to that, but the idea goes beyond it. Citizen science is changing the way biologists collect biodiversity data, and many ecologists and evolutionary biologists will have heard of GBIF, the Global Biodiversity Information Facility. This is a meta-database, a database of databases: it has lots of different nodes across the world, and it's collecting millions of observations of species data every day. One of those nodes, one of those databases, is eBird. My kids even contribute to eBird, and probably many among the listeners are contributing too. (This is not my kid in the picture.) You can contribute through the eBird mobile app, recording all the species you have seen on your birding trip. Importantly, you record the time, how far you walked, and how many people participated. And, most importantly for what follows, I want you to remember that you record how many species you have seen and how many individuals of each species. This will become quite important a bit later on. Based on this citizen science dataset, we actually estimated the number of birds in the world.
It turns out to be around 50 billion, so a little shy of 10 birds per person in the whole world. This was done by Corey, myself and Will, and it used not only eBird data but also some survey data, to estimate not just the total number of birds but the number of birds per species, using an imputation method based on how easy a species is to detect: colour, flock size, body size and conservation status. And we were able to estimate this, which is pretty amazing to me. It was not possible without citizen science data. But this is not quite meta-analysis as you know it; this is a global data analysis. So I will tell you about a meta-analysis example from my lab. Before I get to that, I need to tell you about the "second law of macroecology". What it is, is the abundance-occupancy relationship. This figure is from Babak et al.'s paper, and the relationship is: this axis is abundance, how many individuals you've seen, and this axis is how widespread the species are, and each data point is a species. The idea of the abundance-occupancy relationship is that widely distributed species are also more abundant per unit space. This is slightly counter-intuitive, or intuitive, depending on the person, but it has conservation and fisheries implications, because if a species is widespread and therefore abundant per unit space, you don't need to worry so much about that species, and you can actually take more fish of that species. So it's an important relationship. And there's a reason why this is called the second law of macroecology: there has been a meta-analysis, a traditional literature-based one. They got nearly 300 effect sizes, and the mean correlation was nearly 0.6. Zr here is just Fisher's transformation of the correlation coefficient. This is a funnel plot; you might be used to seeing it rotated 90 degrees, but what you can see is that this line is zero, and the data points are most dense around 0.6.
This meta-analysis was done quite a few years ago by Blackburn and colleagues, and they state that this is definitely, by far, the strongest relationship ever seen in ecology. By far! This is why it's called the second law. However, if you read the related literature, there's a disregarded hypothesis. The easiest explanation of this abundance-occupancy relationship is sampling bias. The sampling bias hypothesis, even though it is disregarded or rejected in the current literature, states that widespread species are easy to observe precisely because they are the most widespread: if you go around surveying an area, you see them first, but if you actually exhaustively observed that one area, the relationship would disappear. Also, in the current published literature there might be publication bias: people go and survey an area, and maybe they only publish the strong correlations. So overall, what we saw in the meta-analysis might be biased; 0.6 seems too high. Here comes the citizen science data. How are we going to utilize it? Well, there will be no publication bias if you use all of it, because millions of citizens are collecting these data without being concerned about whether any correlation is significant or not. So how do we do this? Each of these data points is a checklist; there are millions of checklists here, covering nearly 8,000 species. Here is an example of three checklists, from the US, Europe and Australia. You can imagine Corey going out birding: he observed about 30 different species, and recorded how many individuals of each. And we can calculate the correlation between local abundance and range size, not global abundance, because range size estimates are available from GBIF or eBird, and we can correlate the two.
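The per-checklist effect size just described can be sketched in a few lines of R. This is a toy illustration with invented numbers, not real eBird data; `counts` and `range_size` are hypothetical vectors for a single checklist.

```r
# One checklist's effect size: correlate the local counts of each species
# seen on a single birding trip with that species' range size.
# All numbers are invented for illustration.
counts     <- c(12, 3, 1, 7, 2, 25, 4, 1, 1, 9, 2, 6)  # individuals per species
range_size <- c(8.1, 2.3, 0.4, 5.0, 1.1, 9.8,
                3.2, 0.9, 0.2, 6.5, 1.4, 4.1)          # range size (say, million km^2)

r  <- cor(counts, range_size)   # abundance-occupancy correlation for this checklist
zr <- atanh(r)                  # Fisher's z (Zr), the effect size to meta-analyze
vi <- 1 / (length(counts) - 3)  # its sampling variance, 1 / (k - 3)
```

Repeating this over millions of checklists yields one (Zr, variance) pair per checklist, which is what the meta-analysis then pools.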
So for each checklist, we can calculate the correlation for this abundance-occupancy relationship, using the occupancy information we already know for these nearly 8,000 species, and then we can aggregate the checklists using meta-analysis. And this is called a funnel plot; I'll explain it a bit more in the next slide, actually. The vertical axis is precision, in this case the number of species: the more species on a checklist, the more effort you put in, and the closer you get to this global mean. We were expecting this to be around a Zr of 0.6 or above; if publication bias has been happening, it would be smaller. So that's the meta-analysis we conducted, and I'll show you the result. The result looks like this: it's bang on zero. We were not expecting this at all, because this is the second law of macroecology. We based this meta-analysis on effect sizes from nearly 17 million correlations, which in turn were based on observations of three billion individual birds. And the overall effect is almost zero: 0.015. Actually, because we have nearly 17 million effect sizes, this turns out to be statistically significant, but I'll get to that point later; it's an almost meaningless significance. It's very close to zero. What's most surprising, for those of you familiar with I-squared, is that the I-squared is extremely small. This is probably one of the smallest I've seen in an ecological meta-analysis, despite it being a mixture of different places and species. What it indicates is that almost all the variation you see is due to differences in sample size. And as you can see, precision here increases as sample size increases; those top points are checklists with a couple of hundred species observed. I think we excluded the smallest checklists: you had to have about 12 species at least, or something like this.
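The pooling step just described, inverse-variance weighting of the per-checklist Zr values plus the I-squared heterogeneity statistic, can be sketched in base R. The sketch simulates checklists whose correlations vary only through sampling error around a true value of zero, so I-squared should come out near zero, mirroring the result in the talk. In practice one would use a dedicated package such as metafor's `rma()` rather than this hand-rolled fixed-effect version; all numbers here are simulated.

```r
# Simulate per-checklist Fisher's-z effect sizes whose only variation is
# sampling error around a true correlation of zero, then pool them.
set.seed(1)
n_checklists <- 500
k  <- sample(12:300, n_checklists, replace = TRUE)  # species per checklist
vi <- 1 / (k - 3)                                   # sampling variance of Zr
yi <- rnorm(n_checklists, mean = 0, sd = sqrt(vi))  # simulated Zr values

# Inverse-variance (fixed-effect) pooled estimate
wi     <- 1 / vi
pooled <- sum(wi * yi) / sum(wi)

# Cochran's Q and I^2: % of variation beyond what sampling error explains
Q  <- sum(wi * (yi - pooled)^2)
df <- n_checklists - 1
I2 <- 100 * max(0, (Q - df) / Q)

round(pooled, 3)  # close to 0
round(I2, 1)      # close to 0: nearly all variation is sampling error
```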
If you are observing just 12 species, you'd expect this correlation to vary due to sampling error, right? That's why meta-analysis includes this sampling error variance explicitly, to account for it. But what's surprising is that almost all the variation we see is due to sample size. This really indicates that the relationship must be close to zero, despite being the "second law". Actually, the original meta-analysis, Blackburn's meta-analysis, conducted a fail-safe number test, to claim that more than half a million unpublished analyses would be required to nullify an effect of this magnitude, a correlation of 0.6. But that's okay: we've got it covered, because we have 17 million, not just half a million, data points. Another interesting thing, if you remember the sampling bias hypothesis that is disregarded in the literature: this axis is effort time, on a log scale. A log effort time of 1 is about three minutes; log 5 is about three hours. It's very hard to see, because this is 17 million data points, but if you observe for a very short time, this effect appears a little bit; it goes completely to zero if you observe for three hours. So actually this is huge support for the sampling bias hypothesis, which nullifies the second law of macroecology. So, wrapping up this part of the talk: the future of meta-analysis is data integration, putting different kinds of data together. We already know literature-based meta-analysis. There's also lots of archived raw data, and using raw data is IPD meta-analysis, individual participant data meta-analysis. But now we can also use citizen science data, and you can put together different types of data, such as climate data. I'll quickly tell you one example from our lab. This is a study by Sammy Burke in our lab, on disease frequency. You've probably heard about corals being affected by bleaching events.
Corals are affected not only by bleaching events but also by different diseases, which is worrying, and she collected about 200 papers on the frequency of disease over the last several decades. It shows that disease frequency is increasing. But not only that: she was able to collect temperature data for these different studies, which come from across different oceans, and she was able to show that temperature significantly correlates with, or predicts, disease prevalence, the percentage of corals with disease. So this is what I mean by data integration. Now to the second part: big data and meta-analysis. There are a couple of different parts to this, but they're a bit shorter than the first part. When I talk to my computer science colleagues, they think meta-analysis will become obsolete: with big data, we'll just visualize and analyze the big data directly. I personally disagree, and so do Michael Chang and Susanna Jack, who are both meta-analysts. What they propose is this: you have big data, but rather than analyzing it in one go, you can divide it into chunks. Big data is heterogeneous, so you can split by place, split by year, split by different traits, all sorts of things. Then you can calculate an effect size in each chunk, and meta-analyze. This is called the split-analyze-meta-analyze approach, and we have certainly used it; because of this approach, we were able to analyze big data. An example of big data we used is the International Mouse Phenotyping Consortium. You may not be familiar with it, but all the data are available online, and it covers more than 500 traits and over 100,000 mice, both males and females, across 12 institutions around the globe. Using this dataset and the split-analyze-meta-analyze method, we looked at sexual dimorphism not in the mean traits, but in the variability of the traits: the spread of the trait distributions, and whether there are sex differences there.
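The split-analyze-meta-analyze approach described above can be sketched in base R. This is a minimal, simulated illustration, not the IMPC analysis itself: the variable names, the chunking by `site`, and the simple mean-difference effect size are all invented for the sketch.

```r
# Split-analyze-meta-analyze on simulated "big" data.
set.seed(42)
n   <- 100000
big <- data.frame(
  group = sample(c("A", "B"), n, replace = TRUE),         # e.g. female / male
  site  = sample(paste0("site", 1:12), n, replace = TRUE),
  y     = rnorm(n)
)
big$y <- big$y + ifelse(big$group == "B", 0.2, 0)  # true group difference: 0.2

# 1) split by site; 2) estimate the group difference within each chunk
per_chunk <- lapply(split(big, big$site), function(d) {
  fit <- lm(y ~ group, data = d)
  c(est = unname(coef(fit)["groupB"]),
    var = unname(vcov(fit)["groupB", "groupB"]))
})
es <- do.call(rbind, per_chunk)   # one effect size + variance per chunk

# 3) meta-analyze: inverse-variance (fixed-effect) pooling of chunk estimates
wi     <- 1 / es[, "var"]
pooled <- sum(wi * es[, "est"]) / sum(wi)
pooled  # close to the true difference of 0.2
```

A real analysis would typically fit a random-effects model over the chunk estimates (for example with metafor) to allow for between-chunk heterogeneity, rather than this fixed-effect pooling.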
And it indicates that yes, there certainly are. Another study where we used this split-analyze-meta-analyze method looked at sex differences in allometry. For this particular study I want to tell you about, what is allometry? Let's say this is the female and this is the male, with exactly the same allometric relationship: it's a log-linear relationship, where as your body size increases, your eye size increases, or as your body size increases, you don't move as much (this is wheel-running activity). Another scenario is where the allometric relationships differ but the overall mean traits are the same: the male and female groups have different allometry. And in this other case, the means are different and the slopes are different. So there are several things we can measure: differences in means, differences in slope (that's allometry), and differences in residual variability. The means and variability were covered in different studies: those relate to sexual dimorphism in the mean and in variability, and that was the first paper. Here we are most interested in the slope differences. We used nearly 400 phenotypic traits; for each, we got effect sizes, split by phenotypic trait, and we meta-analyzed within functional groups, conducting nine meta-analyses using nearly two million data points from many mice. And this is what it looks like. The most important panel is the slope one: this is a meta-analysis of the absolute differences between male and female. What you need to look at is whether it's around zero: if so, males and females are similar, but if it significantly deviates from zero, the slopes are quite different. You can see that immunology is a lot more different between males and females, and behaviour too. I'll explain how to understand the implications of this. In many cases, for many traits (not all traits), the slopes are different. So what does it mean?
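A sex difference in allometric slope of the kind just described can be estimated directly in R with an interaction term: on a log-log scale allometry is linear, so a sex-by-body-mass interaction estimates the slope difference. This is a sketch with simulated numbers, not the actual IMPC analysis (which worked with per-trait effect sizes and meta-analysis).

```r
# Simulate a trait that scales allometrically with body mass, with a
# steeper slope in males (0.8) than in females (0.6).
set.seed(7)
n     <- 1000
sex   <- rep(c("F", "M"), each = n / 2)
mass  <- exp(rnorm(n, mean = log(25), sd = 0.2))   # body mass in grams
slope <- ifelse(sex == "F", 0.6, 0.8)
trait <- exp(1 + slope * log(mass) + rnorm(n, sd = 0.1))

# The interaction coefficient estimates the male-female slope difference
fit <- lm(log(trait) ~ log(mass) * sex)
coef(fit)[["log(mass):sexM"]]  # close to the simulated difference of 0.2
```

If the interaction term deviates significantly from zero, the sexes have different allometric slopes, which is exactly the pattern the slope meta-analysis summarized across traits.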
Here is a scenario, with these beautiful drawings done by Szymon Drobniak, whom you saw in the acknowledgements slide. Suppose males and females were exactly the same in these three traits (three among the many traits we looked at): fat tissue, renal clearance rate and metabolic rate. If the sexes were exactly the same size on average, you could give the same dose of a drug, no problem. But that's not true: usually female mice are smaller, and people often assume perfect scaling, that you can scale those three traits at the same rate, assuming females are just small males. In that case, you could just give two pills rather than three pills. But our study of allometric differences between the sexes indicates you can't do that. Body size may scale in the same proportion, but those different traits, perhaps related to drug metabolism, scale differently between males and females: the slopes are different. In such a case, if you just use the male scaling slope, you might give two pills, and that's an overdose in females, which is bad for them. But if you understand the female-specific slope, you'll give the right number of pills. This overdosing or underdosing of females happens in mice and in humans, because when drugs are tested, often only male mice or male human subjects are used. We need to change this. So, big data and meta-analysis: hopefully I've convinced you this is a really useful approach. The last two sections were about how we can use all sorts of different data, not confined to literature-based data; in the last section I'd like to tell you how the way we do meta-analysis is changing as well. Just a couple of examples. This paper came from the Evidence Synthesis Hackathon, the predecessor of this conference, and I was involved in it. In it, we talked about a new ecosystem for evidence synthesis.
Currently, when we do meta-analysis or evidence synthesis, there are empiricists and there are synthesists; they may be different people or the same people. But what happens on the empiricist side is that some of them don't publish, so all of that work never makes it into the primary research literature. We are only able to synthesize the published primary research, and that leads to a biased view in the evidence synthesis, because we are synthesizing a biased evidence base, regardless of whether it's a systematic map, a meta-analysis or a qualitative analysis. What we propose for the future is not just to get the empiricists and synthesists involved, but to build a community around a topic, of all the people working on it. Because they're all part of the community, they can contribute to the synthesis regardless of whether they publish, and we can synthesize all of that effort. This will lead to an unbiased evidence base. And we should make it all open: open data, open code, and we should use preprints, so it's all open to the public and to stakeholders. Finally, I'll quickly touch on a hot topic that I wrote about on our lab's blog. I used ChatGPT to see whether I could use it for title and abstract screening, not full-text screening, and on one topic I was able to get very, very good results; I was really impressed. So I wrote this blog post, and I think that in the next five years the use of AI in evidence synthesis will increase its presence. I wouldn't be surprised if, in the near future, it can do all of the screening; extraction of moderators and effect sizes might be more difficult, but we could be surprised. So all these things are changing, and I think a really exciting future is coming. The take-home messages from my talk: it's a really bright future, combining different types of data, which I call data integration. And meta-analysts have a critical role to play in the era of big data.
It's a really data-rich era, with little theory, and you can use meta-analysis to generate theory, so this will keep us, as meta-analysts, very busy. Towards the end we talked about community-based synthesis, which will change the way we summarize the evidence base; that's pretty exciting. And finally, the conclusion of this talk: I think everybody should do meta-analysis. Thank you very much for listening.

Thanks so much, Shinichi. That was a really interesting talk; I hope everybody enjoyed it as much as I did. So that's it for our opening session. We are really excited to have you here with us this week, and we look forward to sharing the rest of our exciting program over the next few days. Thank you again, Shinichi, for joining us. If you want to know more about what's on this week, check out the program at esmarconf.org, and follow us at @ESHackathon on Twitter for all the latest updates. Thanks very much.