I'm going to talk today about the metap package. This package, which is available from CRAN, does meta-analysis of significance values, also known as p-values. This is useful if, in fact, you don't have effect sizes: for instance, you have a number of old studies which just report a p-value, or the primary studies have reported incompatible tests. If you do have effect sizes then obviously using them would be preferable, and there are a number of packages on CRAN which enable you to do this, perhaps the most popular being metafor and meta.

What methods do we have available in metap? Here I show the various methods, with the eponym in normal type and the name of the corresponding metap function after it in parentheses. There is a group of methods using some transformation of the p-values: they rely on the inverse chi-squared distribution, the inverse normal distribution, the inverse Student's t, or the inverse logit. And there are also a number of methods that work directly with the p-values, either with their sum or their mean, or just using a single selected p-value. I think the most well known of these are Fisher's method, which relies on the sum of logs and is in fact the inverse chi-squared method with two degrees of freedom; Stouffer's method; and possibly Tippett's method, the minimum p, although it's not often known by that name.

Now, how do you use them? Well, all the functions have as their first parameter a vector of the p-values, and they have some additional parameters. They all return an object of class metap, for which there is a plot method, and a function-specific class which supports a print method. The example shows loading the package, extracting a particular data set from the collection of data sets provided, one set of teacher expectancy ratings, and then using Stouffer's method on it, which is sumz. That returns the relevant statistic and also the p-value, and as you can see, this rejects the null hypothesis. You'll see the values of p for the teacher expectancy data when we come on to the graphical display section.

I said that this showed the null hypothesis being rejected. Well, what is the null hypothesis? It's well defined: it's that all of the p-values, the p_i, have a uniform distribution on the unit interval. However, there are two classes of alternative hypothesis: one, that all of the p_i have the same unknown, non-uniform, non-increasing density; and two, that at least one of them, possibly more, has an unknown, non-uniform, non-increasing density. If all of the tests being combined come from what are basically replicates, then the first of these is appropriate. If they're different kinds of tests or different conditions, then the second is appropriate. It's also appropriate when you imagine that the signal in the data will only be located in one or two of the primary studies you're combining.

Now, does it make a difference which method you choose? Well, yes, of course it does; otherwise one would wonder why on earth we have so many different methods. One feature of the methods is that they don't all behave in the same way if you have values significant in both directions. As a very artificial example, we take four one-tailed p-values, two of them very small and two very large. Two of the methods, Fisher's and Tippett's, say reject the null hypothesis because of the two very small ones, but most of the others suggest that this is in fact neutral as to whether you should reject or not, and return an overall p-value of 0.5.
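To make the basic usage and that cancellation behaviour concrete, here is a rough sketch in R. The data-set names (dat.metap and its teachexpect component) are my assumptions about what the package provides and should be checked against the package manual; the four p-values are invented purely for illustration and are not the ones shown on the slide.

```r
library(metap)

## Basic usage: combine one set of p-values with Stouffer's method.
## The data-set object names below are assumptions; see the package
## manual for the data sets actually provided.
data(dat.metap)
teachexpect <- dat.metap$teachexpect   # teacher expectancy p-values
sumz(teachexpect)                      # prints the combined statistic and p-value

## Artificial example: two very small and two very large one-tailed p-values
p <- c(0.001, 0.005, 0.995, 0.999)
sumlog(p)     # Fisher's method: rejects, driven by the two small values
minimump(p)   # Tippett's method: also rejects
sumz(p)       # Stouffer's method: the evidence cancels, overall p of 0.5
logitp(p)     # logit method: likewise 0.5
```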
The maximum p method, which isn't very widely used, gives the converse of the minimum p method, as you might expect.

Another way in which we can look at these methods is to ask what happens if all of the p-values you had were equal: what happens for fixed p_i as we increase k? Well, if the common value is above a function-specific limit the returned value tends to 1; if it's below that limit it tends to 0. The picture shows the minimum p method, which for constant p_i rapidly tends to 1 as k gets larger and larger. Below it, the logit p and the mean p, and some others as well, show the returned value being 0.5 if all the p_i are 0.5; above that it tends to 1, and below it tends more or less rapidly to 0. Fisher's method is an unusual case: its function-specific limit is not 0.5 but approximately 0.3679, that is 1/e. The mean z method has the rather strange and undesirable property that for equal p_i it always returns either 0 or 1 and never an intermediate value, and it can have some other rather strange properties too. This is because it relies on the standard deviation of the transformed p_i to get the significance value, and if they're all very close together then you're dividing by a very small number, and vice versa.

Now, Loughin carried out a simulation study, and although he didn't include all of the methods which are in metap, he made some suggestions, to be applied after you've considered the cancellation issue, because obviously you don't want a method that cancels if cancellation isn't what you want, and similarly you don't want one that doesn't cancel if it is. He suggests that if the emphasis is to be placed on the very small p-values then you want to use Fisher's method or Tippett's method; if your emphasis is on the large p-values then you want Edgington's method or the maximum p method; and if you are interested equally in all sizes of p then you should use Stouffer's method or the logit p method. In fact he recommends that, overall, logit p is probably as good as anything.

Now, I mentioned that the package provides some graphics, and this is a QQ plot of the data you saw used in the example earlier. Using a QQ plot package from CRAN, I call its function for drawing a confidence region around the QQ plot, and this is a simultaneous confidence region: if any of the points falls outside the ellipse then you can reject the null. In this case it's actually quite hard to see, but right down in the bottom left-hand corner the points are very close to the edge of the ellipse. Are they on it? Are they inside? Are they outside? It's quite hard to tell. Fortunately, however, the authors of that package provide another option for plotting the scales on a log scale. The direction of the axes here is reversed, so the small p-values have now gone up into the top right-hand corner. It's now much clearer to see what's happening, and one of the points is well outside the confidence ellipse. The advantage of either of these plots is that you can tell where the signal lies in the range of the p-values: is it the case that all of the p-values tend towards lying outside the confidence ellipse, or is it just a few?
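As a minimal sketch of how you might produce that display, the following relies on the plot method for metap objects mentioned earlier; I'm assuming here that this method is what draws the QQ plot with the simultaneous confidence region just described, and dat.metap$teachexpect is again an assumed name.

```r
library(metap)

## p-values from the teacher expectancy example (object name assumed)
data(dat.metap)
p <- dat.metap$teachexpect

res <- sumz(p)   # combine with Stouffer's method; returns an object of class 'metap'
plot(res)        # plot method for 'metap' objects; assumed here to draw the QQ
                 # plot of the p-values with its simultaneous confidence region
```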
Another plot which I provide is the so-called albatross plot. This plots on the y-axis a transformation of the sample size and on the x-axis the transformed p-values, and the contours show constant effect sizes; in this case I've chosen standardised mean differences. In this particular example there are three different sorts of trials being considered, labelled in the plot A, B and C, and I think you can see that on the whole the As lie rather towards the right-hand side and the Cs rather towards the left-hand side, so there's certainly some evidence that the trials are of different types. Its authors have also suggested that the plot may be useful in the situation, which does often arise, where for some of your primary studies you have effect sizes and for others you only have the p-values. In that case, if you plot all of them and mark the two sets with different symbols, you can see whether the ones with only p-values are broadly compatible with the ones where you have the effect sizes. I should say that, despite the name, it doesn't look to me remotely like any albatross I've ever seen, but that's the name the authors gave it. There is Stata code available for this, but using some advice from the author in his blog I've programmed it in R.

Just to wrap up: the albatross plot was suggested by Harrison and colleagues. The truncated Fisher methods, which I haven't discussed today, are provided by a wrapper around two functions, one from the CRAN package TFisher and the other from mutoss. There are other packages available from CRAN, in particular the poolr package, which provides methods for correlated p-values, and there are a number of other methods which I haven't covered here but which are available; for details of those see the CRAN task view on meta-analysis. Citations for everything I've mentioned, the data sets, and the full documentation are available either in the package manual or in the vignette, so I'm not going to put them separately in the slides.