Hello. My name is Simon Thornley, and I'll be talking about a package that I wrote several years ago called PubBias, which implements a test for an excess of significant findings in a meta-analysis. The inspiration for the package was a paper written by John Ioannidis and Thomas Trikalinos back in 2007. In this tutorial, we will look at what PubBias is all about, what it's looking for, and how to implement it. And this is the library that we'll be using.

Publication bias is also known as the file drawer problem: people don't like the results of their trial, and instead of publishing, they leave them in the file drawer. Let's just briefly review the problem. Basically, it means that what's published is not equal to what has been studied. If you think about it, testing for it is chasing objective evidence of subjectivity in the scientific literature. And we have problems detecting publication bias: there's usually only a small number of trials in a meta-analysis, so we have low statistical power. The standard approaches, like Egger's test and the funnel plot, can be insensitive, and they're often based on an underlying assumption that it's the small studies that are more prone to publication bias than the larger ones. But what happens if the mechanism is slightly different? What if it's all about the p-value, and research is really selecting for significant results? Can we look for statistical evidence of selection of studies based on p-values? Well, this library helps us.

So let's just think about some of the setup required. Because I've let maintenance of the package slip, you need a few more lines of code to install it than would otherwise be necessary. It's still working, though.

An example, I think, is the best way to explain publication bias. Several years ago, I was looking at the effect of statins in low-risk populations, that is, primary prevention, on mortality. And I noticed that some meta-analyses were positive, showing a beneficial effect of statins, and some were negative. And I thought, well, I wonder if publication bias is an issue here. There were conflicting conclusions and also small pooled measures of association.

This is a conventional analysis, using the meta package, of some statin trials looking at overall mortality. So let's run this code (a rough sketch is included below), produce a funnel plot, and finally do Egger's test. Here we have a slightly truncated output, with each trial listed, replicating the RevMan output. We have the suggestion of a small benefit: a pooled odds ratio of 0.89, suggesting roughly a 10% reduction in overall mortality. There's not too much heterogeneity, as indicated by the I-squared. This is a fixed-effect analysis, so the weights are the inverse of the variance of each study.

And here, where we plot the odds ratio of each trial against its standard error, we see a funnel plot like this, which shows no major evidence of publication bias. Perhaps there's a little suggestion that there are more studies down this end than over on the right-hand side. And we can see that, if there is truly a null association, some of these studies sit right on the borderline of the p-value threshold for significance, which may be related to their selection for publication. But here's Egger's test, a regression of the normalised odds ratio against the standard error for each trial, and there's no convincing deviation from what's expected.
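To make that concrete, here's a rough sketch of what such a conventional analysis can look like in code. This is a minimal illustration rather than the exact script from the talk: the statins data frame, its column names and the numbers in it are made up for demonstration, and the commented install lines are just one possible way of obtaining a package that is no longer actively maintained.

    ## Optional setup: one way (an assumption, adjust to your system) of
    ## installing PubBias if it is no longer on the main CRAN repository.
    # install.packages("remotes")
    # remotes::install_version("PubBias")

    library(meta)

    ## Hypothetical trial-level data: deaths and sample sizes in each arm.
    statins <- data.frame(
      study   = c("Trial A", "Trial B", "Trial C", "Trial D", "Trial E"),
      event.t = c(28, 40, 15, 60, 33),
      n.t     = c(1000, 1500, 800, 2500, 1200),
      event.c = c(35, 45, 20, 70, 36),
      n.c     = c(1000, 1500, 800, 2500, 1200)
    )

    ## Fixed-effect (inverse-variance) pooled odds ratio, as in the talk.
    m <- metabin(event.e = event.t, n.e = n.t,
                 event.c = event.c, n.c = n.c,
                 studlab = study, data = statins,
                 sm = "OR", method = "Inverse")
    summary(m)

    ## Funnel plot: each trial's odds ratio against its standard error.
    funnel(m)

    ## Egger's regression test; k.min is lowered because there are only a few
    ## trials, which is also why the warning mentioned in the talk appears.
    metabias(m, method.bias = "linreg", k.min = 5)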
So, to sum up that conventional analysis: no major evidence of publication bias. We do get a warning that there's a small number of trials, which is very common in meta-analyses.

So what does PubBias do? Basically, it looks at whether the output and the inputs of a meta-analysis are coherent. Let's just have a little look at the data. We've got the number of events and the sample size in the control group, and the number of events and the sample size in the treated group. That's all we need for a meta-analysis, and these are all the trials there.

So what does PubBias actually do? Well, we assume that the summary measure of effect, the pooled odds ratio or risk ratio, is true. Then we estimate the observed number of positive trials for a given level of statistical significance, O sub alpha, using a two-sided Fisher's exact test. Then we estimate the expected number of positive trials for the same level of significance, E sub alpha. And how do we do this? We assume the pooled odds ratio or risk ratio is true, we simulate the number of outcomes in the treated and untreated groups using binomial sampling, and we estimate the proportion of the simulated studies that are significant by a two-sided Fisher's exact test, which gives us the statistical power of each study. In the untreated group, the binomial proportion is straightforward: it's just the event proportion we observe in the untreated. In the treated group, the binomial proportion is not estimated from the observed data, but rather is inferred from the event rate in the untreated and the assumed common risk ratio. Finally, we test for a difference between the observed and the expected number of positive trials using a two-sided chi-squared test, with a liberal 10% level of significance.

So, here's some of the code. Basically, with our data frame of individual trial results, we use the function plotChaseObservedExpected, which has various inputs. Let's see what the results are. I've used a very small number of simulations; this would normally use 10,000, but in the interest of time I kept it short.

So, what do we see? This is a summary of the results. The solid black line, the one with the steps, is the observed number of positive trials. At very low levels of significance, none are significant; then we jump up to two; then, as we increase the significance level alpha, up to four; and finally to five. The other line is the expected number of positive trials, and we find that as we increase alpha, it slowly increases as well. What's of interest here is the difference between the observed and the expected, and we see that the strongest evidence of a difference is around the 0.05 to 0.1 region, where the difference is statistically significant, below the 5% threshold indicated by the horizontal red dashed line. The grey dashed line is the p-value for the observed-versus-expected comparison. So there is some evidence of publication bias, or chasing of significance, in this meta-analysis of statins for preventing all-cause mortality.

Now, what criticisms are there of this method? It hasn't been widely adopted. Some say, well, isn't it just circular reasoning? Well, yes, it is examining the coherence between the input and output of a meta-analysis. But if we think about it, what would be the influence of this circular reasoning? Before I answer that, have a look at the sketch of the simulation idea below.
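Here is a minimal from-scratch sketch of the simulation idea just described, so you can see the mechanics in code. To be clear, this is not the PubBias source and not its plotChaseObservedExpected function: it reuses the hypothetical statins data frame from the earlier sketch, fixes alpha at 0.05 rather than sweeping it across a grid the way the package's plot does, and takes the pooled odds ratio of 0.89 from the talk's summary as the assumed true common effect.

    ## Excess-significance sketch: observed vs expected numbers of "positive"
    ## trials, assuming the pooled odds ratio is the true common effect.
    set.seed(1)
    alpha     <- 0.05    # per-trial significance threshold
    n.sim     <- 1000    # the talk suggests around 10,000 in practice
    pooled.or <- 0.89    # assumed true common odds ratio (the pooled estimate)

    ## Two-sided Fisher's exact test p-value for one 2 x 2 trial table.
    fisher.p <- function(e.t, n.t, e.c, n.c) {
      tab <- matrix(c(e.t, n.t - e.t, e.c, n.c - e.c), nrow = 2)
      fisher.test(tab)$p.value
    }

    ## Observed number of significant trials at level alpha.
    obs.p <- mapply(fisher.p, statins$event.t, statins$n.t,
                              statins$event.c, statins$n.c)
    O <- sum(obs.p < alpha)

    ## Expected number: estimate each trial's power by simulation. The control
    ## risk is taken as observed; the treated risk is derived from it and the
    ## assumed common odds ratio, then both arms are resampled binomially.
    power.one <- function(n.t, n.c, e.c) {
      p.c    <- e.c / n.c
      odds.t <- pooled.or * p.c / (1 - p.c)
      p.t    <- odds.t / (1 + odds.t)
      sig <- replicate(n.sim, {
        x.t <- rbinom(1, n.t, p.t)
        x.c <- rbinom(1, n.c, p.c)
        fisher.p(x.t, n.t, x.c, n.c) < alpha
      })
      mean(sig)
    }

    powers <- mapply(power.one, statins$n.t, statins$n.c, statins$event.c)
    E <- sum(powers)

    ## Compare observed and expected counts of significant trials with a
    ## chi-squared test (judged against a liberal 10% significance level).
    k <- nrow(statins)
    chisq.test(x = c(O, k - O), p = c(E, k - E) / k)

The package repeats this observed-versus-expected comparison over a whole grid of alpha values, which is what produces the stepped observed line, the expected curve and the p-value curve in the figure just described.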
Coming back to the circular-reasoning question: if there was p-value hacking, one would expect the summary odds ratio to be biased towards an exaggerated effect of the treatment. This would then increase the expected number of positive or significant studies relative to the observed. So we think that, in the presence of such bias, the method is likely to be conservative: biased towards the null rather than towards finding a significant difference. There are various other criticisms that could be levelled at the library, and yes, I'm aware of these.

So, what's my conclusion? This is a useful adjunct to the standard methods for investigating suspected publication bias, particularly when p-hacking is suspected. That may be when the effect size in a meta-analysis is small, or when different meta-analyses on the same subject come to different conclusions, as was the case in my example.

Thank you very much. My email is just down here, and I'm very happy to take questions about PubBias. Bye for now.