Okay, in the last class we started talking about hypothesis testing. We introduced what hypothesis testing is, we defined the rejection set, and we talked about the various types of error, particularly type 1 and type 2 error, which we defined using the power function. Then we discussed what an ideal hypothesis test should look like in terms of the properties of its power function, and we went through some examples, especially the binomial case and the Gaussian case. So what we will do today is discuss this example a little more. We said, okay, let us say we have a sample coming from a binomial distribution with parameters 5 and theta. We could equivalently think of this as 5 samples coming from a Bernoulli random variable with parameter theta, but here I am just taking one sample from the binomial, which I could. Now, we want to check whether theta is taking a value less than half or more than half, so we have to decide a test here. You can always go with the LRT test, but let us discuss two other possible tests. Since we are deciding whether theta is less than half or more than half, one natural test we can think of is: when more of the samples are 1, in fact when all are 1, maybe theta is greater than half; otherwise it is less than half. That is one possible test, and that is how we define my rejection region R: it is simply the all-ones outcome. So when I see all ones, I reject the null hypothesis; otherwise I accept the null hypothesis. So this rejection region has just one vector in it. Now I have to calculate, under my theta, the probability that X belongs to R. Now, X is binomial, but I could think of it as a Bernoulli with 5 samples.
So, as we discussed last time, this is nothing but the probability that X1 is 1, X2 is 1, all the way to X5 is 1. All of them have to be 1, and because of independence each of them happens with probability theta, so this is theta to the power 5, ok. Now, let us compute the type 1 error for this. What is the definition of type 1 error? What we basically got is the power function. How did we denote the power function? We denoted it as beta of theta, so this is beta of theta, ok. So, let us plot it. My theta obviously takes values between 0 and 1, and on the vertical axis is beta of theta, ok. Now, how does theta to the power 5 look? It is a polynomial curve; it starts from 0 and goes up to 1, say something like this. And let us say this is the half value; this is my theta naught, which is actually half for me. So when theta is less than half and I compute beta of theta, what does this give me? This actually gives me the type 1 error, right: in this region theta is less than half, corresponding to the null hypothesis, and I am rejecting; that is my type 1 error. So in this region it is giving me the type 1 error. And what is it giving in the other region? There theta is greater than half, meaning the sample is coming from the alternate hypothesis, but this probability is still the probability of rejection, ok. So theta is now in the alternate hypothesis part, but I am still plotting this same rejection probability, right, not 1 minus it. So this is actually giving me the complement of the error probability. So in this portion it is the type 1 error, and in this portion it is 1 minus the type 2 error, right.
Now, we know that this function is increasing in theta, so what is the maximum value of the type 1 error here? The maximum happens when theta equals half, and that is why we have written maximum type 1 error is half to the power 5, that is 0.03125, ok. Now, let us look into the type 2 error, right. So what is the type 2 error now? One minus the type 2 error is represented by this region, ok. Now, if I look at the complement of this curve, it goes something like this; it is basically 1 minus theta to the power 5, right, since beta 1 of theta is defined for every theta and equals theta to the power 5. And since beta 1 of theta is increasing in theta, when theta takes its largest value 1, the type 2 error takes its smallest value, which is 0. That means there would not be any type 2 error if theta is already 1, right. But the largest value of the type 2 error happens as theta approaches half, and when I plug in half here, we saw last time that you get approximately 0.97, ok. So you see that the type 1 error is very small: the chance of rejecting a sample that is coming from the null hypothesis is very small. On the other hand, the type 2 error is very high, right. It may happen that your sample is coming from the alternate hypothesis, but you may end up accepting the null a very good amount of the time, and that is a bad case for us, right. This is happening because we are looking at a very bad test: we are saying that only when everybody is 1 am I going to reject; otherwise I am going to accept the null hypothesis.
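The numbers above can be checked with a short sketch (my own illustration in Python, not part of the lecture notes): the power function of this all-ones test is beta1(theta) = theta^5, its worst-case type 1 error sits at theta = 1/2, and the type 2 error 1 - beta1(theta) is largest just above 1/2.

```python
def beta1(theta: float) -> float:
    """Power function of test 1: reject H0 only when all 5 Bernoulli(theta) samples are 1."""
    return theta ** 5

# beta1 is increasing, so the maximum type 1 error over the null region
# theta <= 1/2 occurs at theta = 1/2.
max_type1 = beta1(0.5)       # 0.5**5 = 0.03125

# The type 2 error 1 - beta1(theta) is decreasing over the alternative
# theta > 1/2, so its supremum is approached as theta -> 1/2 from above.
sup_type2 = 1 - beta1(0.5)   # 0.96875, i.e. approximately 0.97

print(max_type1, sup_type2)
```

This matches the values quoted above: a tiny type 1 error (about 0.031) bought at the price of a type 2 error of nearly 0.97.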
So, you can relax this test and say that instead of requiring all to be 1, I will reject when the majority are 1; otherwise I still accept. In that case we are going to reject only when there are 3, 4 or 5 successes, and you can compute the corresponding power function for that. It so happens that when you plot it, it turns out to look like this: this curve is beta 1 and this one is beta 2. So notice that in the region giving the type 1 error, test 1 is good because beta 1 is low there, but it takes a hit in the type 2 error. On the other hand, if you now look at beta 2, test 2 has a higher type 1 error compared to test 1, but what about the type 2 error? The type 2 error is the complement of this, right, and since beta 2 is larger here, the type 2 error of test 2 is going to be lower. So there is always a trade-off. It is not the case that a test which gives you a small type 1 error necessarily also gives you a small type 2 error; there may be another test which gives you a smaller type 2 error, but it may end up having a larger type 1 error. Ideally, we want both type 1 and type 2 error to be small, that would be the best case, right, but that may not always happen; reducing them simultaneously may not be possible. So we have to keep that in mind whenever we are designing a test. Now, we also discussed this last time, I am not going to repeat the computation: when we have samples coming from a Gaussian distribution with parameters theta and sigma square, where theta is unknown but sigma square is known, we can come up with our rejection region through this condition, ok. I hope all of you followed the derivation last time; this is for a given c, some parameter that has been given to you.
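The trade-off between the two binomial tests can be seen numerically (again my own sketch, assuming test 2 rejects when at least 3 of the 5 samples are 1, as described above): beta2(theta) is the binomial tail sum from 3 to 5.

```python
from math import comb

def beta1(theta: float) -> float:
    """Test 1: reject only when all 5 samples are 1."""
    return theta ** 5

def beta2(theta: float) -> float:
    """Test 2: reject when at least 3 of the 5 samples are 1 (binomial tail)."""
    return sum(comb(5, k) * theta**k * (1 - theta)**(5 - k) for k in range(3, 6))

# Worst-case type 1 error (at theta = 1/2): test 2 pays a much higher price.
print(beta1(0.5), beta2(0.5))            # 0.03125 vs 0.5

# Type 2 error at, say, theta = 0.7: test 2 is far better here.
print(1 - beta1(0.7), 1 - beta2(0.7))    # roughly 0.83 vs 0.16
```

So test 2's maximum type 1 error is 0.5 (much worse than 0.03125), but its type 2 error at any fixed theta above 1/2 is much smaller, which is exactly the trade-off described above.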
Now that I know my rejection region, I can define my power function for a given theta. The power function at the point theta is simply the probability that this ratio is larger than c. Everybody agrees up to this point: beta of theta is the probability that my X belongs to the rejection region, and the rejection region is captured by this condition, that is why this follows. Now I can do a simple manipulation here, right. On the left side I simply add theta and minus theta; I retain one theta here, and the remaining theta naught minus theta I move to the other side, ok. Why do I do this? Notice that this probability is under the parameter theta, right, that is the definition. So the samples that I am dealing with when I am computing this probability come from the PDF with parameter theta; that is the meaning of this notation, right. Now, what is the expectation of X bar, the sample mean? It is going to be theta, right. So I have centralized it, and by dividing by sigma over square root of n I have normalized it. So what is this random variable going to be? We have seen this many times before: it is going to be a standard normal, and that is why I have represented it by Z, where Z is Gaussian with parameters 0 and 1, and I want the probability that it is larger than this threshold. For a given c, theta naught, theta, sigma and square root of n, I know how to compute this probability. Any questions so far on these computations? Now, as I said in the last example, you cannot play with both type 1 and type 2 error simultaneously; if you try to reduce one, the other may increase.
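The manipulation above can be written out as a small sketch (my notation, assuming the rejection region derived in the lecture is (x̄ − θ₀)/(σ/√n) ≥ c): after centralizing and normalizing, the power function reduces to β(θ) = P(Z ≥ c + (θ₀ − θ)√n/σ) with Z ~ N(0, 1).

```python
from math import erfc, sqrt

def normal_sf(x: float) -> float:
    """P(Z >= x) for standard normal Z, via the complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))

def power(theta: float, theta0: float, sigma: float, n: int, c: float) -> float:
    """Power function beta(theta) = P(Z >= c + (theta0 - theta) * sqrt(n) / sigma)."""
    return normal_sf(c + (theta0 - theta) * sqrt(n) / sigma)

# Illustrative values (my own choices): the power is increasing in theta,
# small at the boundary of the null and large under the alternative.
print(power(0.0, theta0=0.0, sigma=1.0, n=9, c=1.645))  # type 1 error at theta = theta0
print(power(1.0, theta0=0.0, sigma=1.0, n=9, c=1.645))  # much larger rejection probability
```

With c = 1.645 the value at theta = theta0 is about 0.05, the familiar tail probability of the standard normal, and the power grows monotonically as theta moves into the alternative.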
So, often we may be given specified values: I want this much type 1 error and this much type 2 error, ok. How do we guarantee that? Now recall that your power function depends on multiple factors: one is the c that you are going to use in your test, right, that is, the threshold in your likelihood ratio test; another is how many samples n you are going to use in performing this test, ok. And of course there is the threshold theta naught defining your null and alternate hypotheses, and theta is, of course, the input parameter of this function. So if I change c and n, obviously my power function value changes, which in turn changes your type 1 and type 2 errors, right. So if somebody asks you for this much type 1 error and type 2 error, what is in your control here? If you change c, the type 1 and type 2 errors will change, so that is one thing in your control, and the number of samples n is another; by playing with them you can again control type 1 and type 2 error. Now, if somebody asks for this much type 1 and type 2 error, you can see what c and n give you that kind of type 1 and type 2 error, ok. Let us look into that. Suppose somebody says: I want the type 1 error to be at most 0.1, ok, and also the type 2 error to be at most 0.2 whenever theta is greater than theta naught plus sigma. You will see why this condition is coming. So somebody is telling you up front the required type 1 and type 2 error, and the only things you have in your control are c and n.
Now, you need to decide what c and what n to use so that I can guarantee this, ok. Let us try to see how to choose my c and n, ok. So let us get started with the power function we have; I hope from the previous slide it is all clear to you. Let us look at it a little more carefully as a function of theta. So recall Z is a standard normal. If I let theta go to plus infinity, what will this probability be? Everybody agree it is 1? Are you sure? Ok. As theta goes to plus infinity, this entire threshold goes to minus infinity, right, basically. And it does not matter what the value of c is, because this term dominates c; irrespective of c, the threshold goes to minus infinity, and we know that the probability that Z is greater than minus infinity is 1. On the other hand, if I let theta go to minus infinity, the probability goes to 0, everybody agree? So as theta goes from minus infinity to plus infinity, am I going from 0 to 1 in an increasing fashion? Yes. Ok. So as theta goes to infinity this approaches 1, and as theta goes to minus infinity it goes to 0; the curve looks something like this as a function of theta. Notice that this is not a CDF; I am plotting it as a function of theta now, ok, and this level is 1. Now, based on this understanding, can we decide what the values of c and n should be? The first requirement is that the type 1 error is at most 0.1, so the maximum value of the type 1 error should be 0.1. Now, can you tell me what the maximum value of the type 1 error is here? Notice that this is my theta axis; below theta naught the curve gives the type 1 error, and above it, 1 minus the type 2 error. So what will be the maximum value of type 1? This is where my type 1 error is, and its maximum is happening at theta equals theta naught, right.
So, my maximum value of the type 1 error is actually at theta equals theta naught, and its value is the probability that Z is greater than or equal to c, which you have been told should be 0.1. Now, from this, can you find out the value of c? How? You know that the probability that Z is greater than or equal to c is 0.1, and you know exactly at what point your standard normal has tail probability 0.1, so you can find that point. So you have so far figured out c. Is there something else you need to figure out? Yes, n, and now let us figure out how. For that, let us use the second condition: your type 2 error has to be at most 0.2. As I said, the type 2 error is the complement, 1 minus beta of theta, when theta is in the alternate region. Now, is the type 2 error an increasing or decreasing function of theta? The type 2 error follows from 1 minus beta of theta, right, and we know that if beta of theta is increasing then 1 minus beta of theta has to decrease, so its maximum over the alternate region happens as theta approaches theta naught. But we have been told that when theta is greater than theta naught plus sigma, this value should be at most 0.2; since the type 2 error is decreasing, it is enough to enforce this at theta equals theta naught plus sigma. So you just plug in theta equals theta naught plus sigma. What is beta of theta naught plus sigma? If you plug it in here, the theta naught terms knock each other off, I get a minus sigma, and that cancels with the sigma in the denominator.
So, I will get the probability that Z is greater than or equal to c minus square root of n. Let me write this properly: 1 minus the probability that Z is greater than or equal to c minus square root of n equals 0.2, and c you have already decided from the first condition. So plug in that c, and now n is the only remaining quantity; again, by using the tail properties of the CDF of your standard normal distribution you should be able to find out what n is, ok. So if somebody asks you to control your type 1 and type 2 error, the things that are in your control are the value of c you choose and how many samples you take, and with them you can guarantee those type 1 and type 2 errors, ok. But if you fix the number of samples, it is almost impossible to make both type 1 and type 2 errors small simultaneously, ok. That is why often in practice what we do is say: my primary criterion is going to be the type 1 error; let us guarantee that the type 1 error is always less than this much, and after that see what is the smallest type 2 error you can get, ok. So, if you want to look at this from the optimization point of view, you have been asked to keep the type 1 error at most some quantity, let us call it alpha, and now what you want to do is minimize your type 2 error. So you want to find a test, across all tests: every test is associated with a type 1 and a type 2 error, and you have been told that the type 1 error cannot be more than alpha, that is my hard constraint. If you satisfy that, good, but I will be even happier if you also make my type 2 error small, ok.
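The whole derivation above can be checked numerically (my own sketch of the two conditions, with type 1 error at most 0.1 and type 2 error at most 0.2 at theta = theta naught plus sigma): condition 1 pins down c from P(Z ≥ c) = 0.1, and condition 2 gives the smallest n with P(Z ≥ c − √n) ≥ 0.8.

```python
from math import ceil
from statistics import NormalDist

std = NormalDist()  # standard normal N(0, 1)

# Condition 1: max type 1 error = P(Z >= c) = 0.1, so c is the 90th percentile.
c = std.inv_cdf(0.9)         # about 1.2816

# Condition 2: at theta = theta0 + sigma the power is P(Z >= c - sqrt(n)),
# so we need 1 - P(Z >= c - sqrt(n)) <= 0.2, i.e. c - sqrt(n) <= z_{0.2}.
z20 = std.inv_cdf(0.2)       # about -0.8416

# Smallest integer n with sqrt(n) >= c - z20.
n = ceil((c - z20) ** 2)

print(c, n)                  # c ~ 1.2816, and n = 5 samples suffice here
```

So for these particular targets, a threshold of about 1.28 and only 5 samples meet both error requirements; tightening either target would push n up.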