What I will present before the coffee break is in this file that you have access to, which is a section of a chapter of the Green Book. This is from the previous edition, but the portion has not changed much. So I show you the file, but I will refer to it only sparsely; during the presentation I will work mostly on the blackboard. So: statistical testing, and statistical testing by permutation. I could of course assume that you all know about statistical testing and add only the permutation part, but I think it is worth summarizing the main ideas of tests of significance. These are the classical, or Fisherian, tests of significance; I am not talking about the Bayesian approach, where you do not really do tests of hypotheses. So we will proceed from that. The idea here is to test an ecological hypothesis, and we will see how it relates to a statistical test, because the statistical hypothesis is derived from the ecological hypothesis, but indirectly. I will put on the board the ingredients that you need to produce a good statistical test; just as you need ingredients to produce a good soup, here too we need some ingredients. The first one mentioned here is a null hypothesis, called H0, and opposed to it an alternative hypothesis, called H1. Then we need data; I should have put that at the top. So: we need data; we need the hypotheses; we need a test statistic; we need a significance level; and we need a testing method. These, I think, are the main ingredients, and I may add a few as we go. For the sake of the presentation I will consider the test of a correlation coefficient.
Suppose that you have two variables measured over the same objects, the same sites, and we simply want to compute the correlation between them. We are interested in finding out whether this is a real correlation or a value that could be obtained essentially by throwing numbers at random; that is the idea of a test of significance. For the test of a correlation coefficient, the null hypothesis in a classical statistical test is the opposite of the thing that you want to find as an ecologist, a biologist, or a scientist in any other field: the null hypothesis is that the correlation is zero. This is actually what you hope not to find. If you have done all this work collecting data, it is because you are hoping to find a correlation. But the only thing that we can test in a test of significance is the null hypothesis; we cannot test what we really want to find. Now, the opposite of the null can be different things: that the correlation is different from zero, that it is larger than zero, or that it is smaller than zero. These are all ways of being counter to the null, and we will see that the choice of the alternative hypothesis is not trivial; still, the only thing that we can test is the null. In the case of a correlation, now I am turning pages, you have the alternative hypotheses described here, so you can read that if you want to be clearer about these notions, but I assume you have all been exposed to tests of significance. Choosing the unspecific alternative, correlation different from zero, produces a two-tailed test for the correlation coefficient, while the two directional alternatives produce one-tailed tests, because they are more specific hypotheses. We will see that it is worth the trouble of thinking about your ecological problem and deciding whether you can choose one of the directional alternatives instead of the unspecific one.
If you do not want to think and just do the test automatically, you will choose the unspecific alternative hypothesis, which is not precise at all, but we will see that there is a cost to that, something that you will lose. Okay, now we need the test statistic, which is described here, and after we go through that we will show how to do the permutation test; here is the way you would do it for a normal, parametric test of significance. For the test statistic of the correlation you have the choice between three correlation coefficients: the r statistic of Mr. Pearson, the r statistic of Mr. Spearman, or the tau statistic of Mr. Kendall. In the R function that computes the correlation test, cor.test, all three are equally available; it is just an option of the same function, so that is not a problem. Let us say that for the sake of this presentation I use the most generally used one, Pearson's r. We compute this test statistic for the real data and obtain a value. In some cases this value should then be transformed into another form, called a pivotal statistic; here the Pearson or Spearman r would be transformed into a t statistic. So r is transformed into t for testing purposes. We also need a significance level; let us say that we use the usual 0.05 level, although this depends on how certain we want to be when we reject the null hypothesis. And we will come to the testing method. The testing method that you learned in elementary statistics classes starts with the computation of the t form associated with r.
The testing method is to look up a table and find out whether the value of r, or of t, is significant at the chosen significance level; you are then referring to a table of the distribution of the t statistic, which is one of the standard distributions found in statistics books. You can also compute the p-value associated with your t statistic using a function in R, pt, the distribution function of the t statistic (pf plays the same role for an F statistic). So you can find a p-value. But you have to remember that by doing so you are referring to the t distribution of Mr. Student, whose name was not Student at all; his real name was Gosset. Since he was working for a famous brewery in Ireland, the Guinness Brewery, and it was not good for serious people like brewers to be known to do foolish things like developing statistics, he had to publish under a pseudonym, and he chose Student. That is another part of the history of statistics. So we can choose this testing method of referring to the t distribution of Mr. Student. Okay. Now, when you choose to test using that distribution, you have to know that you are making some assumptions; you may not remember them, but they are still there. One of the assumptions is that the data are normally distributed, or, in the case of regression, as Daniel Borcard mentioned yesterday, that the residuals of the regression are normally distributed. Is that the case with your data? If you are working, for instance, with species abundance data, you have never seen species abundances that are normally distributed. So we have this small problem: what is the consequence of using this distribution when the data do not meet the assumption of the method? I will show you, after I present the testing method, what the consequence of using the t distribution is when the data do not meet the normality assumption.
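As a concrete sketch of the pivotal transformation just described: the standard textbook formula is t = r·sqrt(n − 2)/sqrt(1 − r²), with n − 2 degrees of freedom, which is the form behind the parametric correlation test. This is in Python rather than the R used in the lecture, and the function names pearson_r and r_to_t are mine, not the book's:

```python
import math

def pearson_r(x, y):
    """Pearson's r between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """Pivotal form of r: t = r * sqrt(n - 2) / sqrt(1 - r^2),
    to be referred to a t distribution with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
```

With r = 0.7 and n = 10, as in the numerical example used later in this lecture, r_to_t(0.7, 10) gives about 2.77, matching (to rounding) the t of 2.78 quoted there.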
The consequence can be mild or it can be very strong, and when we deal with multivariate data, with species abundances or gene frequency data, our data are nothing like multivariate normal; the parametric testing method is then clearly and simply not appropriate, and this is why we have to rely on another method, the permutation method. So yes, this is the section describing the distribution of the test statistic, and I will use the numerical example to show you how the permutation test works. Here is my numerical example. I have two variables; imagine them as two vertical vectors, but my publisher would not have been happy if I had taken half a page to write two vertical vectors, so I wrote them horizontally: they are the transpose of the usual orientation of the data matrix. So we have ten points described by two variables, and on the left graph, variable 1 against variable 2, you see the ten points. Just by eyeballing you can see that there seems to be some correlation with a positive slope, but maybe there are too few points to distinguish that from a distribution of random values, so we still need a test of significance. When we compute the real value of the test statistic, we find r = 0.7. We transform that into a t statistic with the value 2.78, and if we look into a table of the t statistic we find a one-tailed probability of 0.01 and a two-tailed probability of 0.02. The one-tailed probability means that we have chosen the alternative hypothesis that the correlation we were seeking, that we were hoping to obtain, was positive; the two-tailed probability comes from the unspecific alternative, where we have not said what we were wishing to obtain.
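To check the parametric side of this example without a statistical table, here is a minimal stdlib-only sketch in Python (the lecture's workflow would simply call pt in R). The function name t_upper_tail and the brute-force numerical integration of the t density are my assumptions for illustration, not the book's method:

```python
import math

def t_upper_tail(t, df, upper=60.0, steps=120000):
    """P(T >= t) for Student's t with df degrees of freedom,
    by trapezoidal integration of the density (stdlib only;
    in R this is pt(t, df, lower.tail = FALSE))."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    dens = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = (upper - t) / steps
    area = 0.5 * (dens(t) + dens(upper))
    for i in range(1, steps):
        area += dens(t + i * h)
    return area * h

r, n = 0.7, 10
t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)  # about 2.77
p_one = t_upper_tail(t, n - 2)                   # one-tailed p, about 0.012
p_two = 2 * p_one                                # two-tailed p, about 0.024
```

Rounded, these are the one-tailed 0.01 and two-tailed 0.02 quoted above.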
Imagine that the numbers were a bit different and that the correlation was smaller: we could have a one-tailed probability of, let us say, 0.04, and the two-tailed probability would be 0.08. If you are a strict believer in the 0.05 level, which I am not, but, you know, when you publish a paper it is nice to have p-values smaller than 0.05, you avoid criticisms from nasty reviewers, you can see that the 0.08 would not be significant but the 0.04 would be. That is the advantage of using a one-tailed test, after thinking about the sort of correlation you are hoping to obtain. If the two variables are primary production versus amount of nutrients, you are hoping to find a positive relationship, so that is the correlation you are hoping to obtain; there is no point in invoking the unspecified alternative hypothesis when you can very well specify a directional one. You gain what is called, in statistics, power: if there is a lot of noise in your data and the correlation is not as strong as it could be, you are more likely to find something significant with a one-tailed test than with a two-tailed test. You gain statistical power by choosing the directional alternative hypothesis instead of the unspecific one. Is that clear? So now I will look at what we can do to carry out the test of significance without the distribution of Mr. Student; I cross it out. We have to play a game of the mind to understand the method of permutation. We say: well, we know that we have this value for the real data, but is it a strong value or a weak value? It would be clearer if I had an example where the correlation was 0.2, for instance: is 0.2 strong or small? In other fields, like in politics, to decide whether it is strong or small you could ask people to vote: who thinks that 0.2 is strong? Raise your hand or put a ballot in the box. And who thinks that it is not strong?
And you could rely on the voice of the people to decide, but in statistics we think that we have a better solution. Oh yes, in older times the two sides could raise an army and fight to decide, but in statistics we avoid that sort of thing. We say: let us play a game of the mind and wonder what the value would be if there were no correlation in the data. How can we decide that? The way is to fabricate data that have no correlation, repeatedly, and then look at the values of the correlation coefficients when there is no correlation. We will do that using our real data, because the real data will keep the same distribution, including any lack of normality, even if we permute them. So, if we have our two variables x1 and x2 with the real values here, we do not know for the real values whether there is a real correlation or whether they are like random numbers, but we can fabricate a situation corresponding to the null hypothesis, where we are sure that there is no correlation. The idea is to take the values of one variable, grab them off the board, put them in a bag, shake them, and put them back at random. By doing this, if there was a correlation in the real data, I have broken it; I have destroyed it by putting the values at random in one variable or the other, or in both. Actually, shaking the two variables does not produce more randomness than shaking only one, and it is less work in a computer program to take one variable, shake it, and put it back at random. By using the real data instead of fabricating new data drawn from some random distribution, we preserve the distribution of the original data: if they are normal they will remain normal, and if they are not normal they will remain not normal. This is what we will do as a testing method. So the testing method is either parametric, the classical one relying on the t distribution, or by permutation. A variant of permutation is bootstrapping: in bootstrapping you take these values
and sample them at random, but with the possibility that some of the values are repeated and others are left out each time; that is bootstrapping, while plain permutation simply takes the values, shakes them, and puts them back, so that each of the original values is still found in the list, but in a different place. We could use bootstrapping instead of permutation in these tests, with the same results; it is just a tradition to use permutation for testing and bootstrapping for the calculation of confidence intervals. It is a matter of habit, and they could be interchanged with no problem. So if we do that, then for the permuted data we obtain a first permuted r after permuting one of the variables, let us say 0.28, which we transform into a t; the second time we do it again and maybe obtain 0.37; the third time maybe -0.02; and so on. We do that a large number of times, maybe a hundred, maybe a thousand, maybe ten thousand, as many times as you want, and we take all these values, including the real one, to produce a distribution to which we will compare the true value. At the moment we do the test we assume, well, we know, that the real value is one of the possible values that can be obtained by permutation, so to be sure we put it in the distribution, and that makes the test very slightly conservative. That is okay: we always have to put the chances against us, against what we want to obtain, and including the real value does so just a little bit. Actually we do not run the method a hundred or a thousand times but 99, or 999, or 9999 times, so that the number of random results we obtain plus the real one makes a round number: 99 plus 1 is 100, and 9999 plus 1 is 10,000. Okay, so now we produce this distribution, and then we look at the position of our real value, 0.7; the
real value sits here, and zero is here in the distribution; this is just a frequency distribution, a histogram if you like. If the real value were near the center, we would say it looks very much like the values we can obtain by this random assignment, and so we would conclude that this value really corresponds to the null hypothesis, because it is very similar to the values produced under the null hypothesis. If it were the most extreme value in one tail or the other, we would say: oh no, this value certainly cannot have been obtained by this random process; it is too far in the tails. But in real cases we obtain something in between, where the real value is out in a tail but usually not the most extreme value obtained from the random process. In practice we do it on the t statistic, obtained at every permutation, and on the distribution of that t statistic. We use the number of values at or beyond the real value, in one tail or the other, as a way to compute the p-value, and this is what I will show you now. Okay, in this case the distribution after 999 permutations looks like this, and here is my p-value: there are some values found by the random process that are more extreme. You can summarize all these values in a small table like this. First, the number of times that the t statistic in the distribution is equal to the true value: there is one, and it is the real value itself, because I have put the real value in the distribution; this is why I put a sign here saying it is itself. Then, how many are larger in the histogram: 17. Now, if I wanted to do a two-tailed test I would also have to see how many are equal to the minus of the true value: there are none; how many are smaller than the minus of the true value: there are eight; and in between I have 974 values. Okay, from that I can compute a p-value for either a one-tailed or a two-tailed test. For a one-tailed test, where I would say rho, the correlation coefficient, is larger than zero,
then I would take the number of observations equal to or larger than the true value: 1 plus 17 is 18, divided by 1000, the total number of values that I have in the distribution, so p = 0.018. Okay. If I wanted to do a one-tailed test in the lower tail, for the alternative hypothesis that my process should generate a negative correlation, then I would take how many values are smaller than or equal to the real one: 974 plus 8 is 982, plus 1 is 983, so the p-value is 983 divided by 1000, 0.983. And if I wanted to do a two-tailed test, where the alternative hypothesis is that rho is simply different from zero, then I would take the number of observations at or beyond the true value in both tails: 17 plus 1 is 18, plus 8 is 26, so p = 26/1000 = 0.026. Okay, so you see that here again the two-tailed test produces a probability (0.026) higher than the one-tailed one (0.018), so the one-tailed test is more likely to be significant because it has more power. So we can compute a p-value just as we do using the t distribution, but from a distribution obtained from our own data. Okay, that is the story of the permutation test. Then, how many permutations should we do, and things like that: here you have a series of remarks on the permutation test. When we do more permutations we obtain more decimal places in the p-value, because we have made sure that the denominator is a round number. We could very well do the test with 278 permutations; that would be perfectly valid, but it would produce p-values with unwieldy decimal expansions. So it is more practical to do it with 99 permutations plus the real one, which produces 100, or with 999 plus 1, or 9999 plus 1, so that the number of decimal places increases regularly as you do more permutations. There are some cases where you do need more decimal places, and that is when you are carrying out several tests of significance simultaneously and have to correct the p-values for multiple testing. This is another story that
I do not have time to go through, but there is a box of information in the chapter describing how this is done; yes, here it is, the method for multiple testing. You can read that. Okay, fine, you will discuss that, excellent. Well, we are already at the time of the coffee break. I wanted to show you a small function that does the permutation test for the correlation coefficient; I will postpone that until after the coffee break. So we take 20 minutes for coffee; back in the room at 11 sharp. I will start speaking at 11:00 because we are short on time this morning. Thank you.
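For reference, the small permutation-test function mentioned above might look like the following sketch. It is in Python rather than the R used in the course, and the names perm_cor_test and pearson_r are mine; it mirrors the counting logic of the lecture: one variable is shuffled n_perm times, the real value is added to the null distribution so that the denominator is the round number n_perm + 1, and the upper-tailed, lower-tailed, and two-tailed p-values are computed from the counts.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def perm_cor_test(x, y, n_perm=999, seed=42):
    """Permutation test of a correlation coefficient.
    Shuffling one variable is enough to break any real correlation;
    the real value is included in the null distribution, which makes
    the test very slightly conservative, as described in the lecture."""
    rng = random.Random(seed)
    r_ref = pearson_r(x, y)
    null = [r_ref]                       # the real value counts as "itself"
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)              # the "bag shaking" step
        null.append(pearson_r(x, y_perm))
    n_tot = n_perm + 1                   # a round number, e.g. 999 + 1 = 1000
    p_upper = sum(r >= r_ref for r in null) / n_tot          # H1: rho > 0
    p_lower = sum(r <= r_ref for r in null) / n_tot          # H1: rho < 0
    p_two = sum(abs(r) >= abs(r_ref) for r in null) / n_tot  # H1: rho != 0
    return r_ref, p_upper, p_lower, p_two
```

Bootstrapping would differ only in the resampling step, drawing with replacement (rng.choices(y, k=len(y))) instead of shuffling. With the counts from the table discussed above (1 value equal to the real t, 17 larger, none equal to minus t, 8 beyond minus t), the same arithmetic gives 18/1000 = 0.018 one-tailed and 26/1000 = 0.026 two-tailed.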