Good day. In today's video lecture I will cover the topic of hypothesis testing, which you can see on the board behind me. This is the third and last topic of statistical inference that we will cover as part of this course. As I said earlier, this is an interesting take on the same information that you used for building confidence intervals; you can use it to answer a different type of question. I have already uploaded the slides on Moodle. For these slides I acknowledge my colleagues Prof. Mani Bhushan and Prof. Sachin Patwardhan, who made them and kindly shared them with me. This corresponds to chapter eight of our textbook, and it also takes material from Montgomery and Runger as well as from Ogunnaike's Random Phenomena. So far, the two statistical inference methods that we covered in class were point estimation, where we looked at the maximum likelihood principle, and confidence intervals, or interval estimates, for a population parameter given the samples. The setting is very similar to what we have been doing so far: you have a population, so if I write here a population which has a density f(x), there is some parameter theta embedded in it; the parameters that we have looked at have been the mean and the variance. From this population you draw samples of the random variable X, namely x1, x2, ..., xn. This is the same setting that we used for point estimates as well as confidence interval estimates of the parameter theta. Those were the two methods you have looked at; today we look at a third method that uses similar information, and this method is known as hypothesis testing. 
Now imagine a courtroom in which the judge is presented with certain evidence. The evidence is not clinching; it is circumstantial, and the judge has to make a decision about some claim or assertion, which in this context we will call a hypothesis. For example, the hypothesis is that Mr. X has stolen something; the judge is given certain evidence and based on that evidence has to say whether Mr. X has stolen something or not. In hypothesis testing we have a very similar situation. There is a claim or an assertion made about a parameter; for example, I might say that the mean of the population is equal to some value. For example, I might say that the duration of my lectures is always equal to one hour. Now if you have data in the form of samples x1, x2, ..., xn, that is, n data points of actual lecture timings, then this is the circumstantial evidence you have, and from this circumstantial evidence you want to be able to say whether the claim is true or false. This kind of problem is solved in the hypothesis testing formalism, which was proposed by Fisher, one of the stalwarts of statistics, and it has found very popular use in solving engineering problems. For example, if I tell you that the temperature of this room is 27 degrees, you make measurements x1, x2, ..., xn and then try to see whether the data is consistent with my claim that the temperature is 27 degrees. Again we are talking in terms of a stochastic formalism; the temperature is not deterministic, it could be fluctuating, and so you want to be able to say whether the claim that the mean mu is equal to 27 is consistent with whatever information you got in terms of the samples. 
Now there is a very close connection between hypothesis testing and confidence intervals. You will remember that in the case of confidence intervals we had six different scenarios, and those six scenarios also apply in the current instance; however, this module only discusses the first two scenarios that you encountered with confidence intervals. The methodology is very generic, and you should be able to extend it to all six scenarios that were discussed. So: a hypothesis is a statement about the parameters of one or more populations. You remember that in the case of confidence intervals we talked about the one-population case and the two-population case. As I just said, there is a one-to-one correspondence between hypothesis testing and confidence intervals, so in that context you should understand that you could have a parameter like mu, or you could have two populations with parameters mu1 and mu2 and want to know whether the difference between the two population means is zero or not. The parameter we are talking about is again the theta that you have already encountered whenever we wrote the probability density function. So let me go to the next slide and take an example. This is an engineering example: suppose you are interested in the burning rate of a solid propellant, which is used to power the rockets that launch satellites. The solid propellant is what provides the energy, and the burning rate describes the thrust generated by the rocket in order to launch the satellite. 
Suppose you are buying this solid propellant from another company, and they say that they have designed the solid propellant so that the burning rate is 50 centimeters per second; maybe that was the customer specification, and the company says, yes, we have designed it in such a way that it gives you a rate of 50 centimeters per second. When they say it is 50 centimeters per second, they are really making a claim about the population mean mu. Now this claim, or assertion, or hypothesis, that mu is equal to 50, is something you have no way of verifying directly: you could not take the entire population of solid propellant, burn it, and then determine the mean, because in this case it would turn out to be a destructive test. You would rather take a few samples, n of them, perform the burn rate experiment, and generate data. From that data you would like to know whether the claim mu = 50 is consistent with the data or not, and thereby come to a judgment, just like a judge would use circumstantial evidence to conclude whether Mr. X has committed a crime or not. Let us look at this a little more formally. The statement mu = 50 centimeters per second is known as the null hypothesis, written H0; that is very standard terminology, and it stands for the null hypothesis. This is the default. The alternative hypothesis is H1. In this case we are looking at what is known as a two-sided hypothesis test, and in this two-sided test we say mu is not equal to 50 centimeters per second; this H1 is known as the alternative hypothesis. Before the judge makes a decision, he or she should know the two hypotheses between which to choose. 
So the null hypothesis is that the population mean burn rate is 50 centimeters per second, against the alternative hypothesis that the population mean burn rate is not 50 centimeters per second. This is known as a two-sided hypothesis test. You could instead have a one-sided hypothesis test, where the null hypothesis is mu = 50 versus the alternative mu > 50. This would be an example where the company itself would like to do the hypothesis test. Let us say having a greater burn rate is a good thing; I am not sure if that is the case, but let me assume for a moment that it is. Then I would say that mu = 50 is the null hypothesis and mu > 50 is the alternative hypothesis, and I would use data to try to establish that my burn rates are greater than the customer-specified 50. On the other hand, the customer could check the hypothesis where they say: we asked them to give us a burn rate of 50 or greater, but is mu less than 50? That would be a hypothesis test the customer would do. You will remember this is very similar to our discussion of how you choose between a lower confidence interval, an upper confidence interval, or a two-sided confidence interval. If you were interested in knowing whether it is 50 or not, you would do a two-sided hypothesis test; that would be a situation where either greater than 50 or less than 50 is detrimental to launching the rocket. On the other hand, if I do not mind having more than 50, but it should at least be 50, then the company would try to do the one-sided test and show that the alternative hypothesis is selected. 
If I wanted it to be greater than or equal to 50, and I want to check whether it is at least 50, then I would choose that particular one-sided hypothesis test. So again, please think about how in different situations you would choose different hypotheses. A hypothesis is a statement about the population and not about the sample; I hope this is very clear. In all the statistical inference we have done, we are using a sample to tell us something about the population. The sample is only a vehicle, or the evidence, in the case of hypothesis testing. The question arises as to where this number 50 comes from. Why did you choose 50 and not 51? It depends on the domain. It might be past experience: you might know that it has to be 50 for it to work well, and you want to test whether the material you are buying and going to use in the rocket satisfies that number, because if it does not, maybe the rocket does not follow its trajectory properly. There could be other considerations. One consideration that does arise is contractual obligations. In a contract you have a vendor and a customer, or a client; this happens, for example, with the sale of gas through a pipeline. You have gas producing companies, for example GAIL, the Gas Authority of India Limited, and they supply gas through a pipeline to their clients. There is a certain agreed rate, because the client will pay based on the amount of gas that has been supplied. But if you have ever seen what the flow rate of gas looks like, say measured with an orifice meter, it is never rock steady; you cannot say that I have got exactly x cubic meters per hour. 
What you have is a very fluctuating kind of signal. So this is what the gas rate is, and you want to know whether the population mean is equal to that which was agreed upon in the contract. How do you decide? You take data, do a hypothesis test, and then come to a conclusion about whether you are satisfying contractual obligations or not. So there are many interesting situations where hypothesis testing is used. Now, the term hypothesis testing itself refers to the procedure: it uses a random sample, and as we have seen, you have a null hypothesis, which is the hypothesis that we wish to be able to test, and you have an alternative hypothesis. In practice the null hypothesis H0 is the hypothesis that you, the tester, would like to check. That is the default. So you should think of the null hypothesis as the default option, and the alternative hypothesis answers the question: if I reject the null hypothesis, then what is the alternative? For example, in this case mu = 50 was the null hypothesis, the default, and if I reject it, I end up with the alternative hypothesis that mu is not equal to 50, or over here that mu is greater than 50, or in the other case that mu is less than 50. So the null hypothesis is what you want to test. Now if the data, which is in the form of the sample, is consistent with the null hypothesis, then you would like to accept it: you will say I accept the null hypothesis that mu = 50. Otherwise you will reject it, and by rejecting it you are in some sense saying: I am going to accept whatever the alternative hypothesis is. 
Now we have seen that this is a stochastic setting, so there is no 100 percent certainty. We take a sample consisting of n observations and we would like to check whether the data is consistent with the null hypothesis or not. Because this is not deterministic, we will sometimes end up making an error in the judgment, just as a judge looking at circumstantial evidence would say that the preponderance or weight of the evidence suggests something, while there is always a chance that an error is committed in making that judgment. In this setting we can look at that error in judgment more systematically. A hypothesis test always requires a test statistic, and you will soon realize that these test statistics are things you already looked at in module 6, in the context of interval estimates. For example, a test statistic for the mean mu is nothing but x-bar: you have the n samples and you take their arithmetic average, so it is (1/n) times the summation of the observations x_i over all i. That is the test statistic for mu; similarly you can imagine what the test statistic for the population variance would be. The sample average x-bar is the test statistic used in this hypothesis test: if x-bar is close to 50 then you accept H0, otherwise you reject it. This is similar to a judge: if the observations suggest that mu = 50 then you accept that mu = 50, otherwise you reject it. For example, you might decide that x-bar has to lie between 48.5 and 51.5; if x-bar lies in this interval then I am going to accept H0, otherwise I will reject H0, and therefore in some sense accept H1. 
Essentially, let me go to the board: for us x-bar is a random variable which has a particular distribution, and so I should be able to visualize the 48.5 and 51.5 using the probability density function, so let me draw that on the board and come back. I have tried to show this using the probability density function; you will observe that mu = 50, which is H0. If my x-bar, whose density this is, lies between 48.5 and 51.5 as shown on the board, I am going to accept H0, and if it lies beyond this on either side, I am going to reject H0. This middle region is known as the acceptance region, and the region where you reject H0, on either side, is known as the critical region; that is the terminology we use. So if your test statistic x-bar lies between 48.5 and 51.5, I am going to accept the null hypothesis, because it lies in the acceptance region. If x-bar lies outside the acceptance region, then I say that it lies in the critical region and I am going to reject H0. That is the way we do a hypothesis test. What I have drawn on the board is what you have over here: if x-bar is between 48.5 and 51.5 then you accept; the region less than 48.5 or greater than 51.5 is called the critical region. What is important is that the values 48.5 and 51.5, which are the boundaries between the acceptance and rejection regions, are known as the critical values. So far we have just arbitrarily taken the numbers 48.5 and 51.5, and the question is whether we can be more systematic about it. Let me come back to the board: you will see that on either tail I will take the probability to be alpha by 2. This is similar to the two-sided confidence interval. 
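To make this decision rule concrete, here is a minimal sketch in Python. The ten burn-rate readings are hypothetical numbers I have made up for illustration; the critical values 48.5 and 51.5 are the ones from the lecture:

```python
# Two-sided test of H0: mu = 50 cm/s with hand-picked critical values.
# The ten burn-rate readings below are hypothetical, for illustration only.
data = [50.2, 49.1, 51.3, 48.9, 50.7, 49.8, 50.4, 51.0, 49.5, 50.6]

xbar = sum(data) / len(data)      # test statistic: the sample mean

lower, upper = 48.5, 51.5         # critical values from the lecture

if lower <= xbar <= upper:
    decision = "accept H0"        # x-bar fell in the acceptance region
else:
    decision = "reject H0"        # x-bar fell in the critical region

print(round(xbar, 2), decision)   # 50.15 accept H0
```

The only decision the procedure makes is whether the test statistic falls in the acceptance region or in the critical region; nothing else about the individual observations matters.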
So the acceptance region has an area of 1 minus alpha, and the rejection region has a total area of alpha, alpha by 2 on each side. Let us continue with the slides. The errors that you can make in the judgment are of two types; one is known as the type 1 error. If H0 was indeed true and you did not reject H0, which means you accepted H0, then you have committed no error. On the other hand, if H0 was true, that is mu was indeed equal to 50, but you said mu is not equal to 50, we call it a type 1 error; that is the terminology used. On the other hand, if mu was not equal to 50, which means H0 was false and H1 was therefore to be accepted, but you failed to reject H0, so your test using those n samples said mu = 50 is correct when in reality it was not, then we say that we have committed a type 2 error. These are the two types of errors one could commit while making our decision. So again: a type 1 error is rejecting the null hypothesis H0 when it is true, and a type 2 error is failing to reject the null hypothesis when it is false. There are other terms commonly used for type 1 and type 2 errors; we often call a type 1 error a false positive. Think of a problem that you are trying to detect, and you said that the problem has occurred: you indicated a positive, but in reality the problem did not occur. H0 was true but you rejected H0; that leads to a type 1 error. On the other hand, if mu was indeed not equal to 50 but you failed to detect it, then you have committed a type 2 error, which we can call a false negative: the problem was actually there but you declared it negative, that the problem is not there, and that is false. So you have a false positive or a false negative. Very often we also use the term false alarm. 
I will just write it as FA; that is not a standard acronym. A false alarm is, as the word indicates, a false alarm. You all remember the story of the boy who was tending the sheep and would cry wolf, and everybody would come. Let us say the boy was not so mischievous: the boy really thought there was a wolf, but when the people came they saw that it was not a wolf, it was some wind blowing, and the boy thought it was a wolf; this turned out to be a false alarm. What will happen if the boy keeps raising false alarms? The people will never come to help him. That is the problem with too many false alarms. On the other hand, if the boy was asleep and a wolf really came, then the boy did not raise the alarm, and that is a case of missed detection. That is the other terminology, mostly used in fault detection and diagnosis: you call it a false alarm, or, and I will just write it as MD, again not a standard acronym, a missed detection. The boy did not raise the alarm and so missed detecting the presence of the wolf; that is a type 2 error. Now we want to be able to say what the probability of committing a type 1 error and a type 2 error is, so we have to be able to make this distinction. We will call alpha the probability of committing a type 1 error. A type 1 error means you are rejecting the null hypothesis H0 when H0 was indeed true, and alpha is also known as the significance level, or the alpha error, or the size of the test. In our case, let me go to the board, because I have already drawn it there. So again, we were looking at errors: the decision procedure can lead to either of two wrong conclusions, and one is a type 1 error. 
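To keep the four outcomes straight, here is the lecture's terminology arranged as a small lookup table in Python; the table is just a restatement of what was said, not something from the slides:

```python
# The four outcomes of a hypothesis test, keyed by (our decision, is H0 true?).
# Labels combine the lecture's three vocabularies: type 1/2 error,
# false positive/negative, and false alarm (FA) / missed detection (MD).
outcomes = {
    ("accept H0", True):  "no error",
    ("reject H0", True):  "type 1 error (false positive / false alarm)",
    ("accept H0", False): "type 2 error (false negative / missed detection)",
    ("reject H0", False): "no error",
}

# The boy who cried wolf when there was no wolf: H0 (no wolf) was true,
# but he "rejected" it -- a type 1 error.
print(outcomes[("reject H0", True)])
```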
Let me quickly repeat: a type 1 error is when the null hypothesis was true, but your samples and your hypothesis test ended up rejecting the null hypothesis. A type 2 error is when the null hypothesis is indeed false, but you were not able to detect it; your testing procedure did not say that it is false. I used the terms false positive and false negative, and you can go back to that discussion; I also used the terms false alarm and missed detection, and I was saying that these terminologies are very commonly used in the fault detection and diagnosis literature. For example, suppose there is a reactor with a temperature, and you want to make sure that the temperature of the reactor is kept within some limits. You have the mean value, an upper control limit, as we call it, and a lower control limit. You can perform a hypothesis test to find out whether the mean is at its target value, because the temperature is fluctuating all over the place, and you want to raise an alarm if it is not. This alerts the operator, who will go and take corrective action; this is what a control system does. If I have set the alarm in such a way that it always goes off, and when the operator goes and checks he or she finds that the temperature is all right, then it has committed a type 1 error: the alarm is false. On the other hand, if the temperature was really drifting away from the target value but the alarm was not being raised, then this is a case of missed detection. So let me go to this slide. We were trying to say that we can ascribe a probability to these errors in judgment, and the probability that we ascribe to a type 1 error is called alpha. 
So we are essentially trying to say the following, and this is where I was going to the board. As you can see, 1 minus alpha over here is the central probability; this probability density function has been drawn assuming mu = 50. We have assumed, while drawing this probability density function, that mu = 50 is indeed true. If we make that assumption, I can draw this density, and now I have these two bounds where I have put the area as alpha by 2 and alpha by 2. If the distribution really belonged to H0, which means mu was 50, but there was an outlying observation which lay in either of the tails, then I would end up rejecting H0 when H0 was true, so I would be committing a type 1 error, and the probability of committing that type 1 error would be alpha by 2 plus alpha by 2, which is alpha. So you should realize that alpha is the probability of committing a type 1 error. Let me go back to my desktop. In this case a type 1 error would occur when your x-bar value lay in the critical region, not in the acceptance region, but the true mean was indeed 50. You can ask me: how do you know that the true mean is indeed 50? In this case it is under the assumption of H0; that is an assumption, because if you knew the true value of the population mean, you would not be doing this test. So there is a chicken and egg issue that we deal with. Now suppose that the standard deviation is 2.5 centimeters per second. You know that x-bar follows a normal distribution with mean mu and standard deviation sigma over root n, and we have seen many times that even if the individual x_i were not normal, x-bar tends to be normal-like because of the central limit theorem. 
So when H0 is true, under that assumption my distribution of x-bar would be a normal distribution with mean 50 and standard deviation 0.79; you should realize that 0.79 is sigma over root n, where sigma is 2.5 and n is 10, so this is based on 10 samples. Now you can try to find out the value of alpha. Assuming that H0 is true, with mu = 50, you can ask: what is the probability that x-bar is less than 48.5 or x-bar is greater than 51.5? You know how to find this probability: you standardize the variable, subtracting mu, which is 50 under the assumption of H0, and dividing by sigma over root n. If you do that, the result follows the standard normal distribution, and so you can use your z tables, or you can go to R and find those values. The value 48.5 turns out to correspond to -1.90, and because we had chosen the interval to be symmetric, 51.5 will correspond to +1.90. Alpha therefore is the probability in the two extreme tails, and this probability turns out to be 0.058. This probability explicitly tells me that 5.8 percent of all samples will lead to rejection of the hypothesis mu = 50 when mu is indeed equal to 50; this is the type 1 error in judgment that I will make, that even though the true population mean is 50, 5.8 percent of the time I will reject it and therefore commit a type 1 error. The slide suggests that you should try to sketch this probability density function and indicate the type 1 error, so I request you to take a pen and paper and try to sketch it. 
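Besides sketching it, you can also verify the 5.8 percent numerically. Here is a minimal sketch in Python; the `phi` helper is my own shorthand for the standard normal CDF, written with `math.erf`, and the numbers 50, 2.5, 10, 48.5 and 51.5 are the ones from the lecture:

```python
import math

def phi(z):
    # Standard normal CDF, written via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu0, sigma, n = 50.0, 2.5, 10
se = sigma / math.sqrt(n)              # sigma / sqrt(n), about 0.79

# Standardize the critical values under the assumption H0: mu = 50
z_lo = (48.5 - mu0) / se               # about -1.90
z_hi = (51.5 - mu0) / se               # about +1.90

# alpha = P(x-bar < 48.5) + P(x-bar > 51.5), computed under H0
alpha = phi(z_lo) + (1.0 - phi(z_hi))
print(round(alpha, 3))                 # 0.058, the lecture's 5.8 percent
```

You can redo the same calculation with the critical values 48 and 52, or with n = 16 instead of 10, to reproduce the smaller alpha values discussed next.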
A question often asked is how we reduce the type 1 error, and you will say that seems straightforward enough. Let me go to the board. Alpha for me was 5.8 percent, so alpha by 2 is 0.029 on each side. Now if you are not happy with this 5.8 percent and you want to reduce the error, then instead of 48.5 and 51.5 you can choose new critical values, say 48 and 52. If you use 48 and 52, you can see that the area under the curve beyond 48 and 52 on either side becomes even lower, and a quick calculation shows that alpha is just 1.14 percent. So this is one way: increasing your acceptance region so that you do not reject H0 when H0 is true. The other way is by increasing the sample size. For our example, where we had n = 10, if you increase the number of samples from 10 to 16 and keep the same standard deviation, then the interval 48.5 to 51.5 corresponds to a type 1 error of 1.6 percent, and not 5.8 percent as before. So you could have a 1.6 percent type 1 error using the same margins of 48.5 and 51.5, provided you had a larger number of samples. You should be able to visualize why that is the case: as n goes up, the standard deviation sigma over root n comes down, and if that happens the curve becomes narrower. Because it has become narrower, the area under the new, narrower curve to the left of 48.5 or to the right of 51.5 is much lower, and that was the 1.6 percent. So as the number of samples goes up, you can make a better judgment. Now it is important that we also talk about the type 2 error probability. Now we 
will call the probability of committing a type 2 error beta. Remember that a type 2 error was the case of missed detection: H0 was false but you failed to reject it. Now we would like to compute beta, but unlike alpha, note that H0 gave you a specific value of mu while H1 did not, and that prevents us from calculating a value of beta directly. So suppose you have a specific value: H0 is mu = 50, and under H1, instead of saying mu is not equal to 50, suppose you say mu = 52, a specific value. This is a scenario: H0 is mu = 50, but we take a scenario under H1 where mu = 52. The acceptance region for H0, as we discussed, was 48.5 to 51.5. Then the probability of committing a type 2 error is the probability that, while the true distribution has a mean of 52 (that is the scenario we have adopted under H1), the sample fell in the acceptance region of H0, between 48.5 and 51.5. That is the situation where H0 was actually not true, mu was not 50, in fact it was 52, but the sample value fell in the acceptance region of H0. That probability is known as beta, the type 2 error probability. Now let us go back to the board and draw that other distribution. What we are saying is that the true mean is actually 52, and the error that will be committed in using mu = 50 is basically the area I am going to shade over here: the shaded area is the probability, under the H1 scenario mu = 52, of falling in the acceptance region of H0. That is the value of beta. So in our 
example, if we had taken 48.5 and 51.5, then we have to calculate the area under the curve when mu is 52, and so I am going to standardize by subtracting 52 and dividing by sigma over root n. Those values are shown over here, and it turns out that, with respect to the mean-52 distribution, the standardized variables take values of -4.43 and -0.63. If you do that, you can find that the value of beta turns out to be 0.264. So roughly, if the true value of mu was 52 but your null hypothesis had mu = 50, then 26.4 percent of the time you would end up missing the detection of the fact that the mean is not equal to 50. So the question is: what does beta depend on? From the figure that I drew on the board you will realize certain things. It depends on how close the 52 and 50 are: if, instead of mu = 52, while the hypothesis test I am doing is still for mu = 50, I was checking against mu = 55, the beta value would change; if under H1 I said mu = 49.5, the beta value would change again, because the second curve keeps shifting with respect to the first curve, which is under H0. One very important point is the tradeoff between alpha and beta: in general, as you try to decrease the type 1 error, beta increases. Again, look at that graph; let me see if I can draw it right over here. This is under H0, and let us say these were my critical points; now remember that the beta value is this area, all the way to the other end of the acceptance region. 
if I were trying to sorry if I were trying to reduce the value of alpha and how will I reduce the value of alpha not by increasing number of samples but by saying that my new value of alpha is this okay so if my new value of alpha is this then you know that the same curve this is under H1 which I am trying to use the pencil and indicate then now you can see that there is more amount of area under once you have moved the critical point towards the extremes of the tails so in that case the new value of beta has this additional amount okay and this goes all the way to the other side so the value of as you try to reduce the value of alpha what has happened to beta beta has actually gone up okay now beta also depends on on the sample size n and as n increases beta decreases and again you should be able to draw a narrower density and be able to rationalize that this is indeed the case this is a summary for the propellant example and I would encourage you to try to calculate these numbers on your own we have already looked at this particular example when it when mu was 52 so this is scenario 1 under H1 okay I will write it as S1 and this is scenario 2 so to be able to calculate beta you really need scenarios otherwise you don't know for what mu value you should try to calculate whereas in case of calculation of alpha you just need the hypothesis the null hypothesis value mu is equal to 50 so so in that case it is always easy to calculate but here you need different scenarios so you can see like we discussed I just discussed that if you had brought mu closer to 50 you made it 50.5 for example then the value of beta has increased again just draw it and you will see that that is the case okay so these are for sample 10 and these are for samples number of samples being equal to 16 please do these calculations using the normal distribution table the unit the standardized unit normal distribution table given to you at the back of your book so again alpha beta are related 
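To make the scenario analysis concrete, here is a short numerical sketch of these beta calculations. It assumes, consistent with the standardized values -4.43 and -0.63 quoted above, sigma = 2.5 and a fixed acceptance region of 48.5 to 51.5 cm/s for the propellant example; only the standard library is used.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def beta(mu_true, n, sigma=2.5, lo=48.5, hi=51.5):
    """Type 2 error: P(x-bar falls in the acceptance region [lo, hi]
    when the true mean is mu_true rather than the hypothesized 50)."""
    se = sigma / sqrt(n)
    return phi((hi - mu_true) / se) - phi((lo - mu_true) / se)

print(round(beta(52.0, 10), 4))   # scenario mu = 52, n = 10: about 0.264
print(round(beta(50.5, 10), 4))   # scenario closer to 50: beta goes up
print(round(beta(52.0, 16), 4))   # larger n, same scenario: beta goes down
```

Running it confirms the three qualitative statements above: beta is about 0.264 for the mu = 52 scenario with n = 10, rises when the scenario moves closer to 50, and falls when n increases.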
Decreasing one increases the other for a fixed n: reducing the size of the critical region decreases alpha but increases beta if n is fixed. Generally, increasing n is the only way that both alpha and beta can become smaller. Also, when H0 is false, beta increases as the true value of the parameter approaches the value hypothesized in the null test. Say this is your density under H0, centred at mu equal to mu0, and under H1 you have scenario one, which I will write for now as mu_S1. As you choose scenarios that approach closer and closer to mu0, the value of beta increases; a simple thought experiment will convince you of that.

Some more conclusions. How do you choose alpha? You typically choose alpha to be very small, and why is that? I told you in the beginning that the null hypothesis is the default, and the alternative hypothesis is what you are trying to show using the data. You really want to be very sure before you discredit the null hypothesis and thereby establish the alternative hypothesis, so it is very important that alpha, the probability of committing a type 1 error, be small. Imagine you have a cooling system and you feel it is not working correctly. Your null hypothesis is that it is working correctly, and your alternative hypothesis is that it is not. You want the probability of concluding from samples that it is not working correctly, when in fact it was working correctly, to be very small. Why? Because if you make that judgment when the system was indeed working correctly, you would end up spending a huge amount of money buying a new cooling system and doing away with the old one. That is why alpha should be small; we typically choose it to be 1% or 5%, sometimes maybe 10%.

Now, since the probability of wrongly rejecting H0 can be controlled, by choosing alpha, rejection of H0 is a strong conclusion. What this says is that, as a tester, you should put into H1 that which you are trying to establish, and keep H0 as the default, because failing to reject H0 is not a strong conclusion, while rejecting H0 is: the error probability of rejecting H0 when H0 was true is alpha, which we choose to be very small. For example, take the nicotine content of cigarettes. If you are a regulator, your H0 will be that the nicotine content mu equals that which is prescribed, the target. You buy some of the product from the market and do a test, and your alternative hypothesis will be that mu is greater than the target. You do a one-sided test because H1 should contain that which you are trying to prove: proving H1 is a strong conclusion, whereas failing to reject H0 is not -- it is the default. So whatever you are trying to prove should always go in H1.

Now, the type 2 error probability beta is not a constant; as I told you, it depends on the scenario that you are looking at and on the sample size, and it cannot be controlled independently of alpha -- we have seen that if you try to reduce alpha, beta increases. And because you can only assume some scenarios, beta is often unknown, but you can do a scenario analysis. We have already discussed that rejecting H0 is a strong conclusion and failing to reject H0 is a weak conclusion, and that therefore what you are trying to prove should go in H1.

Quickly, let us look at the power of the test. The power of a statistical test is the probability of rejecting the null hypothesis H0 when the alternative hypothesis H1 is true; it is therefore equal to 1 minus beta. It tells you how good the test is, and different statistical tests are often compared based on their power for a given alpha.

All right, to summarize the overall procedure: you identify the parameter of interest, state the null hypothesis, state the alternative hypothesis, choose alpha, and determine the test statistic. The moment you have determined the test statistic and know how it is distributed, and you have chosen alpha, these two pieces of information give you the critical points; as soon as the critical points are given, the critical region C is established. Then you compute the test statistic: if it lies in C, I reject H0; if it lies in the acceptance region, I do not reject H0.

Now, scenario one -- and when I talk about a scenario here I am really referring to the six situations or cases that we had seen when we were looking at interval estimates. In this scenario the variance of the population is known. You have a random sample, and you want to test the claim that mu equals mu0, the alternative being that it is not. Alpha is given to you, so you can construct the critical region in terms of x-bar.
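The procedure just listed can be sketched in code. This is an illustrative sketch of the known-variance, two-sided case only, not anything from the textbook; the function names and the bisection-based inverse CDF are my own conveniences.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def z_critical(alpha):
    """Find z_{alpha/2} by bisection on the CDF (illustrative)."""
    target, lo, hi = 1.0 - alpha / 2.0, 0.0, 10.0
    while hi - lo > 1e-9:
        mid = 0.5 * (lo + hi)
        if phi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def two_sided_z_test(xbar, mu0, sigma, n, alpha):
    """Steps: compute the test statistic z0, find the critical points
    +/- z_{alpha/2}, then check whether z0 lies in the critical region C."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    zc = z_critical(alpha)            # about 1.96 for alpha = 0.05
    return z0, zc, abs(z0) > zc       # True in the last slot means: reject H0
```

For instance, `two_sided_z_test(51.3, 50, 2, 25, 0.05)` returns the statistic 3.25 and a critical point of about 1.96, and reports rejection of H0.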
If H0 is true, the acceptance region for the standardized statistic runs from -z_{alpha/2} to z_{alpha/2}, and the probability of that region is 1 minus alpha. I will just flash the board so you can see it: the acceptance region carries probability 1 minus alpha, and the critical region carries alpha/2 on either end. So you can write this statement: you reject H0 if your test statistic z0 lies in the critical region, and you do not reject H0 if it lies in the acceptance region.

Let us look at an example. The mean burning rate of a propellant must be 50 centimeters per second, given that the standard deviation is 2 centimeters per second. The experimenter decides to specify a type 1 error probability, or significance level, of 0.05 -- so 5% of the time you are willing to make a type 1 error in your judgment. You have 25 samples and you obtain an average of 51.3. So you want to ask: I have got a sample average of 51.3, but is the population mean indeed 50? Your null hypothesis is mu equal to 50; this is a two-sided test, so the alternative hypothesis is that mu is not 50. Alpha is given to you, so you need the two critical points z_{0.025} and -z_{0.025}, and z_{0.025} is 1.96. In this particular case the test statistic comes out to be 3.25, and you know that on the standard normal distribution this test statistic lies in the critical region, because it is greater than 1.96. So you say that H0 is rejected at a significance level of 0.05: there is strong evidence that the mean burning rate is not 50.

All right, let me do one more topic before I stop, and that is the p-value. There is an inconvenience in this hypothesis test: if I asked you to solve this problem again but told you to check what happens with alpha equal to, say, 0.1, which is 10 percent, you would have to do the entire problem again -- remember, the data has not changed; only the prescribed level of significance has changed. To overcome this situation we can use a p-value. One way to report the results of a hypothesis test -- and this is what R will do for you -- is to report a p-value; we will look at this a little further. The p-value is the smallest level of significance that would lead to rejection of the null hypothesis: if you had chosen a value of alpha equal to the p-value, that would be the smallest value of alpha leading to rejection of the null hypothesis with the given data. Note that because the data does not change, I should not have to redo the test just because alpha changes. Let us visualize this. If I find the p-value, then, for a two-sided test with the given data, I call the area in each tail beyond the observed statistic p/2, so the total probability is p. The p-value thus gives you the smallest value of alpha under which your data would suggest that the null hypothesis has to be rejected. So if you choose an alpha value larger than the p-value, the critical points move inside the observed statistic, and doing the test your null hypothesis gets rejected; if you choose an alpha value smaller than the p-value, the critical points move out beyond it, and you end up failing to reject the null hypothesis. That is where the saying comes from that if you torture the data enough, it will confess to almost anything: statisticians are sometimes accused of moving things around to suit their convenience, or the convenience of the funding agencies or whoever is supporting them, because by choosing your alpha value suitably you could make the test accept or reject. So there is some subjectivity in how that alpha value was chosen, and you can do away with that subjectivity by reporting the p-value.

For the propellant example we got the test statistic as 3.25, and at the 5 percent significance level we had rejected H0. You can calculate the p-value for that test statistic: the statistic fell here, at 3.25 on the unit normal, and going symmetrically to the other side of zero, I can ask what the area is beyond these points on either side; that area is nothing but the p-value. It is calculated here to be 0.0012. So it tells me that for any value of alpha above 0.0012 you would end up rejecting the null hypothesis that mu equals 50. It is like the solution key to a problem: it already tells you the smallest value of alpha at which the null hypothesis would be rejected. Very often people do not do a formal hypothesis test at all: they take the data and find the p-value, and if the p-value is very small, like 0.0012, then you know that for any reasonable value of alpha the data is not consistent with the claim of the null hypothesis -- whatever is being claimed, the data simply does not support it.

Now, a connection between hypothesis testing and confidence intervals. I would like you to just write down how we derived a confidence interval: we started with the same test statistic, only we put mu in the middle, and we found the two confidence bounds, the lower and the upper level. So you see the same statistic is being used here, and there is a strong connection between the two. Suppose you have a 100(1 - alpha)% confidence interval for theta, with a lower and an upper value, and you do a two-sided hypothesis test of theta equal to theta0 against theta not equal to theta0.
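Both the p-value and the confidence-interval view can be checked in a few lines. Here is a sketch using the propellant numbers (x-bar = 51.3, sigma = 2, n = 25), with 1.96 taken from the table rather than computed:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

se = 2.0 / sqrt(25)                      # sigma / sqrt(n)
z0 = (51.3 - 50.0) / se                  # test statistic = 3.25

# Two-sided p-value: the area beyond +/- z0 on the unit normal.
p_value = 2.0 * (1.0 - phi(z0))
print(round(p_value, 4))                 # about 0.0012, as quoted above

# Confidence-interval view: mu0 = 50 falls outside the 95% interval,
# the same verdict as rejecting H0 at alpha = 0.05.
lower, upper = 51.3 - 1.96 * se, 51.3 + 1.96 * se
print(round(lower, 2), round(upper, 2))  # roughly (50.52, 52.08)
```

Since 50 lies below the lower confidence limit, the interval and the test agree: the data are not consistent with mu = 50 at the 5% level.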
If this theta0 lies in the interval, then at the same significance level alpha you will not reject H0; and if theta0 does not lie in that 100(1 - alpha)% confidence interval, then you will reject H0. So the two are very strongly linked: in some sense, building a confidence interval and doing a hypothesis test are equivalent procedures, even though each provides different insights, and the nice thing here is that you also get a p-value, which helps you make statistical decisions.

We have talked quite a bit about the type 1 error; what about the type 2 error? As I told you, the type 2 error is a little more tricky because you need to know a scenario: if mu is not equal to mu0, then what is the value of mu? You can think of beta as depending on that scenario -- on what the value of mu is when H0 is not true -- and so you can find the value of beta parametrically. In many situations, in reliability and so on, we build what are known as operating characteristic curves, which are essentially curves of beta against mu minus mu0, or some function of mu minus mu0. Again, if mu is not equal to mu0, its actual value is a parameter here. So beta(mu) is the probability of accepting H0 when H0 is false and the mean under H1 is some value mu. You can standardize and calculate that probability, but the probability is being calculated using the mean equal to the scenario value mu, not equal to mu0. So when you want to find the probabilities, you have to find the areas under the curve centred at mu, and not under the curve centred at mu0: you have to renormalize using that particular scenario value, and then you should be able to calculate the value of beta, the type 2 error probability.

Most of the time, when you plot this, the x-axis carries some measure of how far the scenario mu is from the null hypothesis value mu0, and the y-axis tells you the probability of accepting H0; this is known as the operating characteristic curve. You can see that when the two curves are very far from each other -- assuming n and sigma are constant -- you are out here, where mu is far from mu0, and the value of beta is almost zero; whereas when the two curves overlay each other, the value of beta is very large. In this way you can draw an operating characteristic curve, and it gives you an idea about the various scenarios. Say you are doing a robust design, where you are designing under uncertainty: you can make use of these different scenarios. For example, people doing seismic design of buildings take an assumption -- you will get hit by an earthquake of so much magnitude -- and design for that; in the same way, when you are trying to design for uncertainty, you can take the operating characteristic curve as an input to your design.

Okay, let me do this particular example; let us see, and I can probably stop after that. A signal of value mu is sent from location A, and the value received at location B is normally distributed with mean mu and standard deviation 2. We have seen this problem before: you have two stations, you transmit mu from station A, and what you are receiving
at B is not mu: it has got smeared with noise, and the noise has this distribution with variance 4. This signal is sent five times, and the average value received, x-bar, is 9.5. Determine the probability of accepting the null hypothesis that mu equals 8 -- so 8 here is mu0, this is H0 -- when the actual value sent was not 8 but was 10. This is a scenario: you were testing for mu equal to 8, but what if the actual value sent was 10? That is what allows you to draw the other curve. So in this case you are trying to calculate beta, and it is just an application of the formula: you have these two quantities, and an alpha value must be given to us -- it is given over here as 0.05, so 5% is the value of alpha. For that value, your value of beta turns out to be 39.2%.

All right, so let me summarize. I have discussed p-values, and I have discussed how to do a hypothesis test; we have looked at two-sided and one-sided hypothesis tests. Remember that you first have to specify the null hypothesis, the alternative hypothesis, the significance level of the test, and then the test statistic in question. You can imagine that if you did not know sigma -- say, in this particular problem, you were told that the sample standard deviation is 2, not the true or population standard deviation -- then the underlying statistic would not be the z variable but the t variable with n minus 1 degrees of freedom, and you can apply the same procedure using the t distribution. I will stop here now.
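The signal-transmission beta can be verified the same way. Here is a sketch under the stated setup (H0: mu = 8, true mean 10, sigma = 2, n = 5, alpha = 0.05, with 1.96 as the critical point):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu0, mu_true, sigma, n = 8.0, 10.0, 2.0, 5
se = sigma / sqrt(n)

# Acceptance region for x-bar under H0 at alpha = 0.05 ...
lo, hi = mu0 - 1.96 * se, mu0 + 1.96 * se
# ... and its probability under the shifted curve centred at mu_true.
beta = phi((hi - mu_true) / se) - phi((lo - mu_true) / se)
print(round(beta, 3))   # about 0.39, close to the 39.2% quoted above
```

Note that the observed average x-bar = 9.5 plays no role in beta: the type 2 error depends only on the acceptance region and the assumed scenario for the true mean.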
I think I am going to meet you on June 1st, and I will pick up hypothesis testing again at that point. Stay safe, and I will see you in the first week of June. Thank you.