So, we started talking about hypothesis testing, and we said that a hypothesis is a statement about a population parameter. In a testing problem there are actually two hypotheses, H₀ and H₁, which are complementary. The hypothesis H₀ is a statement of the form "my parameter θ belongs to some set Θ₀", and we call it the null hypothesis; the alternative hypothesis H₁ says θ belongs to the complement Θ₀ᶜ. If my parameter space is like the real line, I may put a partition somewhere — say I pick a point θ₀ and ask whether θ is below it or above it. In hypothesis testing we are going to come up with a rule that tells us whether the null hypothesis should be accepted or rejected; the rule tells us how to make a decision. One of the possible decision criteria we came up with is based on the likelihood ratio test, which we call the LRT. The LRT statistic is defined as a ratio of two quantities, both defined in terms of the likelihood function:

λ(x) = sup_{θ ∈ Θ₀} L(θ | x) / sup_{θ ∈ Θ} L(θ | x).

The numerator optimizes the likelihood function over Θ₀, the parameters belonging to the null hypothesis, and the denominator optimizes over the entire parameter space, Θ₀ together with Θ₀ᶜ. For a given sample x we denote this ratio λ(x), and our decision rule is: if λ(x) ≤ c, reject the null hypothesis and accept the alternative. Notice there is a c here, which is the parameter you set for making this decision. What is the range of λ(x)? The denominator is always at least as large as the numerator, and likelihood functions are positive, so this ratio lies between 0 and 1.
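As a minimal sketch of this decision rule (the function name and numbers here are illustrative, not from the lecture):

```python
def lrt_decision(sup_lik_null, sup_lik_all, c):
    """Likelihood ratio test: reject H0 when lambda(x) <= c.

    sup_lik_null: sup of L(theta|x) over the null set Theta_0
    sup_lik_all:  sup of L(theta|x) over the whole parameter space
    c:            threshold in (0, 1)
    """
    lam = sup_lik_null / sup_lik_all
    # The constrained supremum can never exceed the unconstrained one,
    # so lambda always lies in [0, 1].
    assert 0.0 <= lam <= 1.0
    return "reject H0" if lam <= c else "accept H0"

# A small lambda means the null explains the data much worse
# than the best parameter overall.
print(lrt_decision(0.02, 0.40, 0.1))  # lambda = 0.05 <= 0.1 -> reject H0
print(lrt_decision(0.38, 0.40, 0.1))  # lambda = 0.95 > 0.1 -> accept H0
```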
So, my c will also be taken somewhere in the range (0, 1). Last time we calculated an example for a Bernoulli random variable with parameter p, where the hypotheses were whether the coin is fair or unfair. Today we will do one more exercise. Take n samples x₁, …, xₙ, i.i.d. from a Gaussian distribution with parameters θ and σ², and assume that the variance σ² is known. Our goal is to test hypotheses on the mean: the null hypothesis is H₀: θ ≤ θ₀ for some given θ₀, and H₁: θ > θ₀. I want to apply a likelihood ratio test here, and for that I first need the likelihood function. The likelihood of θ given x is

L(θ | x) = (2πσ²)^(−n/2) exp( −Σᵢ (xᵢ − θ)² / (2σ²) ).

Now let us write the ratio:

λ(x) = sup_{θ ≤ θ₀} L(θ | x) / sup_θ L(θ | x).

Let us do the denominator first, because it is the unconstrained one, whereas the numerator is constrained. If you optimize over all θ, what θ maximizes this likelihood function? The sample mean. So the denominator is attained at the sample mean x̄. What about the numerator — can anybody think why, when x̄ lies above θ₀, it is attained at θ₀?
Think about whether the exponent is increasing or decreasing in θ. Notice that θ appears only in the exponent; the factor (2πσ²)^(−n/2) is a constant for me because σ² is known, so I do not need to worry about it. The only thing to optimize is the exponent, and since I am maximizing, if you ignore the minus sign I just have to minimize Σᵢ (xᵢ − θ)². Let us quickly find the θ that does that. Differentiating the exponent with respect to θ (each term contributes (xᵢ − θ) times −1, and keeping the overall minus sign and the 2σ²):

d/dθ [ −Σᵢ (xᵢ − θ)² / (2σ²) ] = Σᵢ (xᵢ − θ) / σ² = (Σᵢ xᵢ − nθ) / σ².

Setting this to zero gives θ = x̄. But we also have the constraint θ ≤ θ₀ — how do we handle it? Do we need to construct a Lagrangian here? Not necessarily, but let me simplify the problem and make your life easier: instead of θ ≤ θ₀ versus θ > θ₀, let us only consider the hypotheses H₀: θ = θ₀ and H₁: θ ≠ θ₀. Is this simpler? I am only asking whether θ is θ₀ or not.
Then in the numerator there is nothing to optimize — it is simply evaluated at θ₀ — and because of this simplification we do not go around in circles with the constraint. I will drop the constant (2πσ²)^(−n/2), because it cancels between numerator and denominator anyway, and the denominator is maximized at exactly x̄. So

λ(x) = exp( −Σᵢ (xᵢ − θ₀)² / (2σ²) ) / exp( −Σᵢ (xᵢ − x̄)² / (2σ²) )
     = exp( −[ Σᵢ (xᵢ − θ₀)² − Σᵢ (xᵢ − x̄)² ] / (2σ²) ).

This is my likelihood ratio. Now I want to set the rejection region as all x such that λ(x) ≤ c, for some given c in the open interval (0, 1). What condition does λ(x) ≤ c give? Taking logarithms on both sides, and keeping the 2σ² factor where it is for now,

−Σᵢ (xᵢ − θ₀)² + Σᵢ (xᵢ − x̄)² ≤ 2σ² log c.

Is this inequality fine? Now I take the minus sign together with 2σ² to the other side, so the inequality flips to greater than or equal:

Σᵢ (xᵢ − θ₀)² − Σᵢ (xᵢ − x̄)² ≥ −2σ² log c,

where θ₀ is the fixed null value.
Now, is there any simpler expression for this left-hand side? The claim is that it is simply n(x̄ − θ₀)². Do you agree? Why don't we quickly work it out. Expanding the first sum,

Σᵢ (xᵢ − θ₀)² = Σᵢ xᵢ² + nθ₀² − 2θ₀ Σᵢ xᵢ = Σᵢ xᵢ² + nθ₀² − 2nθ₀x̄,

using Σᵢ xᵢ = nx̄, since θ₀ is a constant. Similarly for the second sum, with x̄ also a constant given the data,

Σᵢ (xᵢ − x̄)² = Σᵢ xᵢ² + nx̄² − 2x̄ Σᵢ xᵢ = Σᵢ xᵢ² − nx̄².

The Σᵢ xᵢ² terms knock each other off, and clubbing the remaining terms together, the difference is

nθ₀² − 2nθ₀x̄ + nx̄² = n(θ₀ − x̄)².

So it is indeed a perfect square: the left-hand side is n(θ₀ − x̄)².
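A quick numerical check of this identity, as a sketch with made-up data:

```python
import random

random.seed(0)
x = [random.gauss(2.0, 1.0) for _ in range(50)]  # arbitrary sample
theta0 = 1.5
n = len(x)
xbar = sum(x) / n

# Sum-of-squares difference versus the closed form n*(xbar - theta0)^2.
lhs = sum((xi - theta0) ** 2 for xi in x) - sum((xi - xbar) ** 2 for xi in x)
rhs = n * (xbar - theta0) ** 2
print(abs(lhs - rhs) < 1e-9)  # True: the two expressions agree
```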
So the final condition is: reject when

n(x̄ − θ₀)² ≥ −2σ² log c.

(Since c < 1, log c is negative, so the right-hand side is positive.) If you want to simplify further, you can take n to the other side and take square roots, and you get the condition purely in terms of whether |x̄ − θ₀| exceeds some threshold. If your x̄ is such that this condition holds, you reject the hypothesis that θ equals θ₀; if it does not hold, you accept that the parameter is indeed θ₀. Any questions on this? Now let us look into the following: if I already have a sufficient statistic and want to use it, what is the connection between it and my likelihood ratio test? So let T(x) be a sufficient statistic for my sample x, drawn from a population f(x | θ). Our likelihood ratio test is the ratio of the two suprema above, and by definition L(θ | x) is simply the pdf. Recall one of the properties of a sufficient statistic: we can factor the pdf into two parts,

f(x | θ) = h(x) · g(T(x) | θ),

where h is a function of x alone and g depends on x only through T. Writing the ratio this way, I can quickly knock off h(x), because it has nothing to do with my parameter θ; what remains is g(T(x) | θ). And recall what we said earlier: if T is a sufficient statistic and g is a one-to-one function, then the composition T′ = g ∘ T is again a sufficient statistic.
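Putting the whole Gaussian example together — a sketch where the function name, the data, and the choice of c are mine:

```python
import math

def gaussian_lrt_reject(x, theta0, sigma2, c):
    """Reject H0: theta = theta0 for N(theta, sigma2) data when
    n * (xbar - theta0)^2 >= -2 * sigma2 * log(c),
    which for this model is equivalent to lambda(x) <= c."""
    n = len(x)
    xbar = sum(x) / n
    return n * (xbar - theta0) ** 2 >= -2.0 * sigma2 * math.log(c)

# Sample mean exactly theta0 = 0: statistic is 0, never rejected.
print(gaussian_lrt_reject([-1.0, 1.0, -0.5, 0.5], theta0=0.0, sigma2=1.0, c=0.1))  # False
# Sample mean 2: statistic 4*(2-0)^2 = 16 >= -2*log(0.1) ~ 4.6, rejected.
print(gaussian_lrt_reject([1.5, 2.5, 1.8, 2.2], theta0=0.0, sigma2=1.0, c=0.1))   # True
```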
So now we have the ratio in terms of the sufficient statistic. Writing g(T(x) | θ) as L*(θ | T(x)) — a function of θ given the sufficient statistic — we get a new expression for the ratio test:

λ(T(x)) = sup_{θ ∈ Θ₀} L*(θ | T(x)) / sup_{θ ∈ Θ} L*(θ | T(x)).

This notation just indicates that when a sufficient statistic T is involved in the computation, the likelihood ratio comes from T. You can then do all the things you want, for example define the rejection region — maybe we should write it with a subscript c, because it depends on that parameter:

R_c = { x : λ(x) ≤ c }.

If you change your c, your rejection region can potentially change. That is one way. Now, all of this assumes a frequentist approach: θ is some fixed parameter which I do not know, and that is the parameter governing my pdf. But we have always seen that instead of assuming θ to be fixed, we could assume that it comes from a distribution itself, and in that case we use the Bayes approach. The Bayesian approach can also be used in hypothesis testing — how? Let us see. In Bayesian hypothesis testing we assume some prior probability π(θ) on the parameter θ, as we always do, and after observing a sample x the posterior is π(θ | x). We have already seen how to compute the posterior given the prior information and an observation x.
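For the Gaussian example, T(x) = x̄ is sufficient and x̄ ~ N(θ, σ²/n), so the ratio built from the density of x̄ alone gives the same λ as the ratio built from the full sample. A sketch verifying that numerically (the function names are mine):

```python
import math

def lambda_full(x, theta0, sigma2):
    # LRT from the full sample for H0: theta = theta0; the shared
    # normalizing constant cancels between numerator and denominator.
    xbar = sum(x) / len(x)
    num = math.exp(-sum((xi - theta0) ** 2 for xi in x) / (2 * sigma2))
    den = math.exp(-sum((xi - xbar) ** 2 for xi in x) / (2 * sigma2))
    return num / den

def lambda_from_T(x, theta0, sigma2):
    # LRT built only from T(x) = xbar, using xbar ~ N(theta, sigma2/n);
    # the unconstrained supremum over theta is attained at theta = xbar.
    n = len(x)
    xbar = sum(x) / n
    s2 = sigma2 / n
    num = math.exp(-(xbar - theta0) ** 2 / (2 * s2))
    den = math.exp(0.0)  # density evaluated at its own mean, constant dropped
    return num / den

x = [0.3, 1.1, -0.4, 0.9, 0.6]
print(abs(lambda_full(x, 0.0, 1.0) - lambda_from_T(x, 0.0, 1.0)) < 1e-12)  # True
```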
Now, instead of asking the binary question of whether θ belongs to one set or the other, we can make a probabilistic decision: tell me the probability that the observation I am making comes from the null parameter space. Earlier the answer was yes or no; now we are putting some probability on yes and some probability on no. Once we have the posterior after observing the data, we can ask: what is the probability that θ belongs to my null set, given my observation x? Can you compute this from the posterior? Yes — given x, the posterior is a probability distribution over θ, so all I have to do is integrate it over Θ₀:

P(θ ∈ Θ₀ | x) = ∫_{Θ₀} π(θ | x) dθ,

and similarly I can answer the question for the complementary set:

P(θ ∈ Θ₀ᶜ | x) = ∫_{Θ₀ᶜ} π(θ | x) dθ.

So instead of asking whether H₀ is true or not, I can ask: what is the probability that H₀ is true given x, and what is the probability that H₁ is true given x? These are now probabilistic statements. Next, a decision criterion has to come. Earlier my criterion put a threshold on the likelihood ratio λ(x); here, given an observation x, I am computing the probabilities that H₀ is true and that H₁ is true.
Now, what would be a natural candidate for the decision — whether H₀ is true or H₁ is true? You may simply say: if P(θ ∈ Θ₀ | x) is larger than P(θ ∈ Θ₀ᶜ | x), then H₀ is true; otherwise H₁ is true. Alternatively, to put it in the earlier structure: since these two probabilities sum to one, I compute P(θ ∈ Θ₀ | x) and, if it is less than or equal to one half — equivalently, if P(θ ∈ Θ₀ᶜ | x) is greater than one half — I reject H₀, that is, I accept the hypothesis H₁. So the role of λ(x) is now played by the probability that my parameter comes from the null set given my observation x. In the frequentist approach we construct the likelihood ratio test; in the Bayesian approach we just compute this probability from the posterior distribution. That is the Bayesian test. Any questions on this? Yes, you can make the inequality strict if you are going to set it like this; that does not matter here. It matters if your θ space is discrete, but most of the time we will be working with a continuous θ space, and in that case it does not matter.
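A sketch of the Bayesian test for the Gaussian-mean problem with a conjugate normal prior. The prior parameters and data are made up, and the normal-normal posterior update used here is the standard conjugate formula, which was not derived in this passage:

```python
import math

def posterior_prob_null(x, theta0, sigma2, mu0, tau2):
    """P(theta <= theta0 | x) for N(theta, sigma2) data with a
    conjugate N(mu0, tau2) prior on theta (standard normal-normal update)."""
    n = len(x)
    xbar = sum(x) / n
    prec = 1.0 / tau2 + n / sigma2                # posterior precision
    m = (mu0 / tau2 + n * xbar / sigma2) / prec   # posterior mean
    s = math.sqrt(1.0 / prec)                     # posterior std dev
    z = (theta0 - m) / s
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # normal CDF

def bayes_decision(x, theta0, sigma2, mu0, tau2):
    # Reject H0: theta <= theta0 when its posterior probability is <= 1/2.
    p_null = posterior_prob_null(x, theta0, sigma2, mu0, tau2)
    return "reject H0" if p_null <= 0.5 else "accept H0"

# Data clearly above theta0 = 0 pushes posterior mass past theta0.
print(bayes_decision([1.8, 2.2, 2.5, 1.9], theta0=0.0, sigma2=1.0, mu0=0.0, tau2=1.0))    # reject H0
# Data centered well below theta0 keeps most posterior mass in the null set.
print(bayes_decision([-2.1, -1.9, -2.3, -1.7], theta0=0.0, sigma2=1.0, mu0=0.0, tau2=1.0)) # accept H0
```

Note how the posterior probability P(θ ∈ Θ₀ | x) plays exactly the role λ(x) played before, with the threshold fixed at one half.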