 In this video I'm going to look at some past paper questions about hypothesis testing. The first question I'm going to look at is number 7 from January 2008. And here is the question. You might want to pause the video at this point and have a go yourself before you play on. Okay, well the first part asks us to explain what we mean by a hypothesis test and to define a critical region. This is what I'd say if I had to explain what a hypothesis test is. I'd say that a hypothesis test uses a test statistic to accept or reject a null hypothesis in relation to an alternative hypothesis. The null hypothesis states that a population parameter has a particular value and it's rejected when the probability of obtaining a value similar to the test statistic is very small. We also have to define what we mean by a critical region and we can do that in fewer words. I'm going to say that the critical region is the range of values of the test statistic for which the null hypothesis will be rejected. Okay, so now let's move on to the main part of the question. We're told that in term time at a particular school, incoming calls arrive at the rate of 0.45 per minute. To test this, the number of calls during a random 20-minute interval is recorded. What we have to do is to find the critical region for a two-tailed test of the hypothesis that the number of incoming calls occurs at the rate of 0.45 per one-minute interval. We're told that the probability in each tail must be as close as possible to 2.5%. Okay, well we should probably start by writing down the hypotheses. The null hypothesis here is clearly that lambda is 0.45, the rate of calls is 0.45 per one-minute interval. We're told that it's a two-tailed test so the alternative hypothesis will be that lambda isn't 0.45. Now these hypotheses relate to a one-minute interval, but our test statistic is to do with a 20-minute interval. So we need to find the number of calls that we'd expect in a 20-minute interval. 
Clearly that's going to be 20 times 0.45, which is 9. Okay, so our test statistic will have the Poisson distribution with parameter 9. It's really going to have the Poisson distribution because here we're dealing with the number of events in an interval of time. Okay, now we have to find the critical region and because it's a two-tailed test, the critical region will be in two parts. We can end up rejecting the null hypothesis either if the test statistic is much smaller than 9 or if it's much greater than 9. In other words, if we get a very small number of calls or if we get a very large number of calls, we can end up rejecting the null hypothesis. To find out exactly what the critical region is, we should look at the tables. So first of all, we ought to scan down the column headed by lambda equals 9 to find the probability which is as close as possible to 0.025. So if we do that, we reach the probability 0.0212, that's as close as we can get to 0.025. And that's telling us that the probability of getting 3 or less is 0.0212. So if our test statistic turned out to be 3, we would end up rejecting the null hypothesis. And clearly, if we got 0, 1 or 2, we'd end up with an even smaller probability. So this part of the critical region consists of the numbers 0, 1, 2 and 3. The other thing that we need to do is to scan our way up the column until we get to the probability that's as close as possible to 0.975. So if we do that, we'll end up at 0.9780, because that's as close as we can get to 0.975. Now that's telling us that the probability of getting 15 or less is 0.9780. But obviously, we're really interested in the probability that we get by subtracting that from 1. And that would be the probability of getting 16 or more. So it's the probability of getting 16 or more, which is as close as we can get to 0.025. And that means that the other part of the critical region is the numbers 16, 17 and higher. Okay then, so we can say what the critical region is. 
The critical region is the numbers that are 3 or less, combined with the numbers that are 16 or more. The other part of the question tells us to write down the actual significance level of our test. And we can work this out by adding together the probabilities that we were just looking at. We need the probability that x is less than or equal to 3, plus the probability that x is greater than or equal to 16. And that would be 0.0212, plus whatever we get by subtracting 0.9780 from 1. And if you do that sum, you'll get 0.0432, which is 4.32%, so that's the actual significance level of our test. Okay, let's move on to the last part of the question. And this is comparing the school holidays with term time. We're told that in the school holidays, one call occurs in a randomly chosen 10 minute interval. The task is to test at the 5% level of significance, whether this is sufficient evidence that the rate of incoming calls is lower during the school holidays than in term time. Okay, well again we should start by writing down the hypotheses, and here the null hypothesis is going to be the same as before. It's that lambda is equal to 0.45. Our default assumption will be that the rate of calls is the same as it is during term time, so 0.45 per minute. But this time the alternative hypothesis will be that lambda is less than 0.45. This is a one-tailed test, because we're interested in knowing whether the rate of calls is lower during the school holidays than in term time. Not in whether it's different from the rate during term time. And as before, the hypotheses relate to a one minute interval, but our test statistic comes from looking at a 10 minute interval. So we need to find the expected number of calls in a 10 minute interval, and that's going to be 10 times 0.45, which is 4.5. So this time we have a random variable which has the Poisson distribution with mean 4.5. 
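The critical region and actual significance level found above for the 20-minute interval can be checked without tables. Here is a minimal Python sketch, assuming the test statistic is Poisson with mean 9 under the null hypothesis. The helper names `poisson_cdf` and `closest_tails` are made up for this illustration and are not part of the exam question.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam); gives 0 when k is negative."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def closest_tails(lam, target=0.025, k_max=40):
    """Hypothetical helper: cut-offs whose tail probabilities are closest to target."""
    # Lower part of the region is {0, ..., a}: choose a so P(X <= a) is closest to target.
    a = min(range(k_max), key=lambda c: abs(poisson_cdf(c, lam) - target))
    # Upper part is {b, b + 1, ...}: choose b so P(X >= b) is closest to target.
    b = min(range(1, k_max), key=lambda c: abs(1 - poisson_cdf(c - 1, lam) - target))
    return a, b

lam = 20 * 0.45                            # expected calls in 20 minutes
a, b = closest_tails(lam)
lower_tail = poisson_cdf(a, lam)           # P(X <= a)
upper_tail = 1 - poisson_cdf(b - 1, lam)   # P(X >= b)
actual_level = lower_tail + upper_tail     # actual significance level

print("critical region: X <=", a, "or X >=", b)
print("actual significance level:", round(actual_level, 4))
```

The tables are rounded to four decimal places, so the last digit computed here can differ slightly from the table-based answer.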
Okay, well we could answer this question by finding a critical region and working out whether one call is in the critical region. But if this were me, I would rather just work out the probability of getting one call or less and find out whether that probability is sufficiently low to reject the null hypothesis. So we'll just find the probability that x is less than or equal to 1. It's 1 because that's the observed number of calls in the 10 minute interval, so that's the actual value of our test statistic. We'll find the probability that x is less than or equal to 1. And of course if this is a very small probability, then we've got a significant result and we can reject the null hypothesis. So we look at the tables and we find the column headed by lambda equals 4.5. And we look across the row from where x is 1 and we see that the probability is 0.0611. So the probability that x is less than or equal to 1 is 0.0611. That's actually greater than our level of significance, it's greater than 5%. And that means we can't reject H0. We mustn't reject H0 because the probability isn't that low. And what we should say is that there's not enough evidence to conclude that the rate of incoming calls is lower in the school holidays than in term time. You ought to be careful how you express your conclusion here. We mustn't say anything too definite. And we certainly can't say, for example, that the rate of incoming calls is the same in the holidays as it is during term time. We don't have any evidence that that's the case. All we can say is that we don't have quite enough evidence. The probability isn't quite low enough to say that the rate of incoming calls is lower in the school holidays than it is in term time. Okay, so that's the end of that question. The next one I want to look at is number 7 from January 2006. And again, you might want to pause the video at this point and have a go at the question before you listen to anything else that I've got to say. 
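The one-tailed test for the school holidays can be sketched in a few lines of Python. This assumes, as in the working above, that the number of calls in 10 minutes is Poisson with mean 4.5 under the null hypothesis. The function name `poisson_cdf` is just an illustrative helper.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

lam = 10 * 0.45       # expected calls in 10 minutes if H0 is true
observed = 1          # calls actually seen in the 10-minute interval
alpha = 0.05          # significance level for the one-tailed test

p_value = poisson_cdf(observed, lam)   # P(X <= 1)
reject_h0 = p_value < alpha

print("P(X <= 1) =", round(p_value, 4))
print("reject H0:", reject_h0)
```

Comparing the probability directly with 5% gives the same decision as checking whether 1 lies in the critical region, which is why either approach is acceptable here.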
Okay, I hope you've had a go at this question now. Let's go through it. The first part tells us that a teacher thinks that 20% of the pupils in a school read the Dino comic regularly. And he chooses 20 pupils at random and finds that nine of them read the Dino. We're asked to test at the 5% level of significance, whether or not there's evidence that the percentage of pupils that read the Dino is different from 20%. Okay, well, as always, we'll start by stating our hypotheses. And the null hypothesis will be that P, the probability that a pupil reads the Dino regularly, is equal to 0.2, because that's the teacher's default assumption. That's the starting point that it's 20% of the pupils that read the Dino regularly. The alternative hypothesis will be that P is not equal to 0.2, because this is a two-tailed test. The teacher wants to know whether the proportion is different from 20%, not whether it's less than 20%, or greater than 20%, just that it's different from 20%. Okay, well, our test statistic here then is going to have the binomial distribution with parameters 20 and 0.2. If the null hypothesis is true, which remember we always assume whilst we're doing our hypothesis test, the null hypothesis is true. Then we've got 20 pupils and the probability of success, i.e. finding that they read the Dino regularly, is 0.2. It's really a binomial distribution here, because what we're looking at is the number of successes in a sequence of trials. The trials being individual pupils and success being that they read the Dino regularly. Now, to test the null hypothesis, we've got to find the probability that X is greater than or equal to 9. The teacher finds that 9 of them read the Dino regularly, that's more than you would expect. If there are 20 of them and the probability of success is 0.2, you'd expect that 4 of them would read the Dino regularly, and the teacher has actually found 9. 
So we need to find the probability that we get an outcome like 9, an outcome like 9 students reading the Dino regularly, and like 9 in this context would be 9 or more. So you find the probability that X is greater than or equal to 9, and of course we can't look that up in the probability tables. We can only look up the probability that X is less than or equal to something. So what we'll say is the probability that X is greater than or equal to 9 will be 1 minus the probability that X is less than or equal to 8. So let's look that up. We need to find where n is 20 and where p is 0.2. And then we'll look along the row from 8 and we see that the probability is 0.9900. So we need to do 1 minus 0.9900 and that's 0.01. Okay, well that's the probability that X is greater than or equal to 9. But remember this is a two-tailed test and so there will be two ways we can end up rejecting the null hypothesis, either from getting values that are much bigger than the expected number, or from getting values that are much smaller than the expected number. And so we've only worked out the probability of getting one type of value that's like 9, the values that are bigger than the expected number. And there's also going to be some values that are like 9 but are much smaller than the expected value. So we've only worked out half the region that we're interested in. So we need to double 0.01 to find the total probability and that's 0.02. But anyway, 0.02 is significantly less than 5%. It's still less than 5% and that means we've got a probability that's lower than our significance level. So we have to reject the null hypothesis. And in context that means that we've got enough evidence to say that the proportion of pupils who read the Dino isn't 20%. Okay, the next part of the question asks us to state all the possible numbers of pupils that read the Dino from a sample of size 20 that would make the test that we just did significant at the 5% level. 
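The two-tailed test on the first sample of 20 pupils can be reproduced as follows. This is a minimal Python sketch that assumes the number of readers is binomial with n = 20 and p = 0.2 under the null hypothesis, and that uses the same doubling of the tail probability as the working above. `binom_cdf` is an illustrative helper name.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p0 = 20, 0.2       # sample size and hypothesised proportion
observed = 9          # pupils in the sample who read the comic
alpha = 0.05

# P(X >= 9) = 1 - P(X <= 8), because the tables only give "less than or equal to".
upper_tail = 1 - binom_cdf(observed - 1, n, p0)
p_two_tailed = 2 * upper_tail          # doubled for the two-tailed test
reject_h0 = p_two_tailed < alpha

print("P(X >= 9) =", round(upper_tail, 4))
print("doubled:", round(p_two_tailed, 4), "reject H0:", reject_h0)
```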
Well, I hope you can see that basically this is asking us to find the critical region for this test, but it's not using the phrase critical region. It's making sure that you understand what we mean by critical region and it's just putting it in a different form of words. But nevertheless, what we have to do is to find the critical region. So let's have a look. First of all, we'll need to scan our way down the column that's headed by 0.2 and stop before we get to a probability that's 2.5%. Here the significance level was 5%, but it's a two-tailed test, so we need to split that in two. So 2.5% at the lower end, 2.5% at the upper end. And we're interested in finding all the cases where the probability is less than 2.5% at either end. So we'll scan our way down the column headed by 0.2, stopping before we cross the 0.025 threshold. And actually we stop as soon as we start, at 0.0115. And that means that this end of the critical region is just the number zero. And then the other thing that we do is we scan our way up the same column, stopping before we cross over 0.975. And if we scan our way up, that means we get up to 0.9900. And that's telling us the probability of getting less than or equal to 8. But 1 minus that is the probability of getting greater than or equal to 9. So that means that the other end of the critical region is the numbers 9 or more, 9, 10, 11, 12, and so on. So the numbers that would cause us to reject the null hypothesis, the numbers that would give us a significant outcome, are zero together with the numbers that are 9 or more, 9 all the way up to 20. Okay, let's move on to the next part of the question. This tells us that the teacher takes another four random samples, so five random samples in total, and the extra four contain one, three, one, and four pupils that read the Dino. 
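The list of significant outcomes found above can be checked by listing every value whose tail probability is below 2.5%. This is a sketch under the same assumption as the working, namely that the count is binomial with n = 20 and p = 0.2 when the null hypothesis is true. The names `binom_cdf`, `lower_part` and `upper_part` are illustrative only.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); gives 0 when k is negative."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p0 = 20, 0.2
tail = 0.025          # 5% split equally between the two tails

# Lower part: values c with P(X <= c) below 2.5%.
lower_part = [c for c in range(n + 1) if binom_cdf(c, n, p0) < tail]
# Upper part: values c with P(X >= c) = 1 - P(X <= c - 1) below 2.5%.
upper_part = [c for c in range(n + 1) if 1 - binom_cdf(c - 1, n, p0) < tail]

print("lower part of critical region:", lower_part)
print("upper part of critical region:", upper_part)
```

Note that this part of the question needs each tail to stay under 2.5%, which is a different rule from the "as close as possible to 2.5%" wording used in the Poisson question earlier.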
It's telling us to combine all five samples and use a suitable approximate test at the 5% level of significance to find out whether the percentage of pupils in the school that read the Dino is different from 20%. Okay, well, the null hypothesis is still going to be that P is equal to 0.2, and the alternative hypothesis is still going to be that P isn't equal to 0.2. But this time, if we combine all these results, our test statistic, the sum of all the different random samples, is going to be binomially distributed with parameters 100 and 0.2, because this time we've got 100 trials. There are five samples of size 20, so there are a total of 100 trials, but the probability of success is still the same, 0.2. And this gives us a problem because we don't have tables for when n equals 100, and the question tells us that we've got to use a suitable approximate test. So we're going to have to approximate this random variable that has the binomial distribution with parameters 100 and 0.2 with a different random variable. And we've got two choices. It could either be the Poisson distribution or the normal distribution. But I hope you remember that we use the Poisson distribution when n is large and P is small, whereas we use the normal distribution when both n times P and n times 1 minus P are greater than 5. Okay, so what we've got here, the P isn't particularly small, 0.2 is quite big, so we don't want to use the Poisson approximation, but n times P is 20 and n times 1 minus P is 80. So this is a situation where we can use the normal approximation to the binomial. Okay, so in order to do that, we have to find out the mean and the variance. So the mean of this binomial distribution will be 100 times 0.2, which is 20, and the variance will be 100 times 0.2 times 0.8, which is 16. So the mean is 20 and the variance is 16. So the random variable that we're going to use to approximate X, which we'll call Y, will have the normal distribution with mean 20 and variance 4 squared, i.e. 16. 
Okay, now what's the actual value of our test statistic this time? Well, the first sample had nine students who read the Dino regularly, and then we've got one, three, one and four extras. So that makes for a total of 18 students who read the Dino regularly out of this total of 100. So the probability that we ought to work out if you want to test this hypothesis is the probability that x is less than or equal to 18. Remember, if n is 100 and p is 0.2, then you'd expect to find 20 of them reading the Dino regularly, and we've only got 18, so we need to find the probability of getting 18 or less. And to find the probability that x is 18 or less, we will look for the probability that y is less than or equal to 18.5. Remember here that we're approximating a discrete random variable with a continuous one, using the normal distribution to approximate the binomial distribution, and whenever you use a continuous random variable to approximate a discrete one, you have to make a continuity correction. So the chance of getting 18 or less, the chance that x is 18 or less, will be the chance that y is 18.5 or less. The reason for that is that the chance of y being exactly equal to 18 would be 0. A continuous random variable has zero probability of being exactly equal to any one particular value. So the chance that x is 18 would be about the chance that y is between 17.5 and 18.5. Okay, so we make the continuity correction, and we're going to find the probability that y is less than or equal to 18.5. So we do that by standardizing. It's the same as the probability that z, the standard normal variable, is less than or equal to 18.5, take away the mean, take away 20, divided by the standard deviation, so divided by 4, and that's the probability that z is less than or equal to minus 0.375, which is the same as one take away the probability that z is less than or equal to positive 0.375. 
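The standardizing step above, together with the continuity correction, can be sketched in Python. This assumes the normal approximation with mean 20 and standard deviation 4 described earlier, and it doubles the tail probability because the alternative hypothesis is two-tailed. The helper `phi` is an illustrative name for the standard normal distribution function, built here from `math.erf`.

```python
import math

def phi(z):
    """Standard normal distribution function P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p0 = 100, 0.2
observed = 9 + 1 + 3 + 1 + 4            # total readers across the five samples

mean = n * p0                           # 20
sd = math.sqrt(n * p0 * (1 - p0))       # square root of 16, i.e. 4

# Continuity correction: P(X <= 18) is approximated by P(Y <= 18.5).
z = (observed + 0.5 - mean) / sd
lower_tail = phi(z)
p_two_tailed = 2 * lower_tail           # doubled for the two-tailed test

print("z =", z)
print("P(Y <= 18.5) =", round(lower_tail, 4))
print("two-tailed probability:", round(p_two_tailed, 4))
```

Working from printed tables means rounding z to two decimal places first, so a table-based answer can differ from this one in the third decimal place. The conclusion is unaffected because the probability is far above 5% either way.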
We have to do that change because we're not going to be able to look up negative 0.375 in the tables. Here are the tables, and we can't actually see 0.375. Our choice is between 0.37 and 0.38. Let's round up to 0.38. That's allowed. Examiners are happy for you to do that. And the probability that z is less than or equal to 0.38 is 0.6480. So we have to do one take away 0.6480. And the answer turns out to be 0.3520. Okay, so the probability of getting 18 or fewer pupils who read the Dino regularly is 0.3520. But remember, this is a two-tailed test. So this is actually only half the probability that we're looking for. Remember, we want to know the chance of getting an outcome like 18. And one type of case where we've got an outcome like 18 is where we've got less than 18. But we can also have outcomes that are more than the expected number of pupils, so that are more than 20. And that would be the same probability as well. So we need to double this probability. It's a two-tailed test. You must double the probability. And that gives us 0.7040. Well, this is massively more than 5%. The probability isn't small at all. And so in this case, we're clearly not going to reject the null hypothesis. And we're going to say that the combined sample actually suggests that the proportion of pupils who read the Dino may well be 20%. I'm not going to say that we've got evidence that it is 20%, because we certainly don't have evidence that it's 20% rather than 19% or anything. But nevertheless, the evidence that we've got is totally consistent with the possibility that the proportion is 20%. It's very likely that we would find a total of something like 18 pupils reading the Dino regularly if the actual proportion is 20%. There's one more part of the question. It asks us to comment on our results. And I suppose there's a very obvious thing to say. First of all, we've got different outcomes. The first test turned out to be significant. 
And that led us to think that maybe the proportion is different from 20%. But then the second test had the opposite outcome. And there it seemed that the proportion may well be 20%. But then I presume we're expected to say, well, what should we conclude as a result? We've got one test that goes one way, one test that goes the other way. So what should we say? And I think the point is that the second test is going to be more reliable because you had a much larger sample size. And therefore it's the second test that we should go with. We ought to say that the proportion may well be 20%. Okay, now I'm going to look at one last question. And this is question four from June 2002. Again, it'll be good for you to stop the video at this point and to have a go at this question before you listen on. Okay, so the first part tells us that 20% of customers who buy crisps in a large supermarket buy them in single packets. But on a particular day, somebody has taken a random sample of 25 customers and looked at what sort of crisps they've bought and found that two of them have bought them in single packets. So we're told to use the data to test at the 5% level of significance whether the percentage of customers who bought crisps in single packets that day was lower than usual. I've got to state our hypotheses clearly. So this time the null hypothesis will be that P is equal to 0.2. The default assumption is that the proportion of customers buying crisps in single packets is 20%, so that's 0.2. And it's a one-tailed test. We're interested in whether the proportion is lower than usual, not whether it's different to usual, whether it's lower than usual. So the alternative hypothesis will say that P is less than 0.2. And then what we've got is a random variable which has the binomial distribution with parameters 25, 25 because the random sample had 25 customers in it, so we've got 25 trials, and 0.2 is the probability of success. 
0.2 is the proportion of customers that we suspect tend to buy crisps in single packets. I hope it's obvious that we're going to be looking at a binomial distribution here because we've got the number of successes in a sequence of trials. Okay, so we need to find the probability that X is less than or equal to 2, the probability that we get two or fewer customers buying crisps in single packets. And we can find that by looking at the probability tables. We need to find where n is equal to 25 and look down the column where P is equal to 0.2. And then if we look along the row from where X is 2, we get the probability 0.0982. So the probability that X is less than or equal to 2 is 0.0982, and that's bigger than 5%. That's greater than the significance level. So it's not very unlikely that we would get an outcome like this. It's not very unlikely that we would get as few as two customers getting crisps in single packets. So we can't reject the null hypothesis. Okay, we can't say that the proportion of customers buying crisps in single packets is lower than usual. We don't have enough evidence to conclude that the percentage of customers buying crisps in single packets is lower than usual on this day. Okay, let's move on to the next part of the question. So now we're told that at the same supermarket, the manager thinks that the probability of a customer buying a bumper pack of crisps is 0.03. And to test whether or not this hypothesis is true, the manager decides to take a random sample of 300 customers. Okay, so we're told to state our hypotheses clearly and find the critical region to enable the manager to do this test, to find out whether the probability is different from 0.03. And this time we're told that the probability for each tail should be as close as possible to 2.5%. Okay, well, this time the null hypothesis is going to be that P is equal to 0.03, because that's the proportion of people that we think buy crisps in bumper packs. 
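The one-tailed test on the sample of 25 customers, carried out in the first part above, can be checked with a short Python sketch. It assumes the number of customers buying single packets is binomial with n = 25 and p = 0.2 under the null hypothesis. `binom_cdf` is an illustrative helper name.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p0 = 25, 0.2       # sample size and usual proportion
observed = 2          # customers who bought single packets
alpha = 0.05          # one-tailed test, so the whole 5% sits in the lower tail

p_value = binom_cdf(observed, n, p0)    # P(X <= 2)
reject_h0 = p_value < alpha

print("P(X <= 2) =", round(p_value, 4))
print("reject H0:", reject_h0)
```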
And this time we have got a two-tailed test because we're interested in whether the probability is different from 0.03. So it's a two-tailed test and the alternative hypothesis says that P is not equal to 0.03. Okay, well, our test statistic, the number of customers that we actually find buying crisps in bumper packs will have the binomial distribution with parameters 300, it's got 300 trials. There are 300 customers in our sample and 0.03 is the probability of success. Okay, well, we're not going to be able to find binomial probability tables which have n equals 300 in them. So we're going to have to use an approximation here. So we're going to have to approximate the binomial distribution. And this is a situation where we can use the Poisson distribution to approximate the binomial distribution because we've got a large value of n and a small value of P. We could also use the normal distribution if you wanted to because it's also true that n times P is greater than 5 and also that n times 1 minus P is greater than 5. But whenever you have the choice, it's easier to use the Poisson distribution than the normal distribution to approximate the binomial distribution because the process is just much simpler. You don't have to make a continuity correction, you don't have to do any standardizing, you just have to look up the number in the tables. Okay, so we need to find the expected number of customers who would get crisps in bumper packs so that we can use a Poisson approximation. And the expected number would be 300 times 0.03 which is 9 and we can use the approximating random variable y which has the Poisson distribution with parameter 9. And so what we need to do is to find the critical region for y i.e. the range of values of y which would end up giving us probabilities less than 5%. And actually, if you remember the first question I looked at, this is exactly the same. 
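The choice of approximation described above can be sanity-checked numerically. The sketch below assumes n = 300 and p = 0.03 as in the question, checks the rule-of-thumb conditions mentioned, and compares the Poisson probabilities with the exact binomial ones. The comparison itself is an extra illustration rather than something the exam asks for, and the helper names are made up for this example.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

n, p0 = 300, 0.03
lam = n * p0                            # mean of the approximating Poisson

# Rule-of-thumb checks for the normal approximation, also satisfied here.
normal_ok = n * p0 > 5 and n * (1 - p0) > 5

# Largest gap between the exact and approximate distribution functions.
max_gap = max(abs(binom_cdf(k, n, p0) - poisson_cdf(k, lam)) for k in range(31))

print("lambda =", lam, "normal approximation allowed:", normal_ok)
print("largest difference in P(X <= k) for k = 0..30:", round(max_gap, 4))
```

A small gap supports using the Poisson tables in place of binomial tables that don't exist for n = 300.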
We've got to take the column headed by 9 and scan our way down it, stopping as close as we can get to 0.025, which is 0.0212, and then scan our way along that row, and that shows us that that part of the critical region is the numbers 0, 1, 2 and 3. And then we've got to scan our way up that column, stopping as close as we can get to 0.975, and then we look our way along the row below to find out that the other part of the critical region is the numbers 16, 17 and higher. Remember, it's the row below because 0.9780 is the probability of 15 or less but 1 minus that is the probability of 16 or more. So the critical region is when x is less than or equal to 3 together with when x is greater than or equal to 16. And the final part of the question, where we have to say the significance level of the test, will be the sum of the two probabilities that we were just looking at. Just as before, that's 0.0212 plus whatever you get by subtracting 0.9780 from 1, which is 0.0432, which is 4.32%. Notice that this question was a little bit different to some that you see because we had to use an approximation and then find a critical region, but there wasn't anything particularly different about it. We're just looking at the table of values for our approximating random variable and looking for 0.025 and 0.975, but otherwise there's nothing scarily different about it. Okay, so that's the end of this video where I've looked at three past paper questions on hypothesis testing. I hope that you found it useful. Thank you very much for watching.