Now, what is the variance of W'? You could again compute it by expanding everything, but let us see if there is something simpler. By definition, the variance of W' is the expected value of (W' minus its mean) squared. Am I correct if I write it as E[(W' - σ^2)^2]? What is wrong with that? Right: since W' is biased, I cannot put σ^2 there; I have to put E[W'], and E[(W' - E[W'])^2] is the variance of W'. You can actually compute this, and I am simply going to write the result: Var(W') = 2(n-1)/n^2 σ^4.

Now compare it with our previous result. We had Var(W) = 2σ^4/(n-1), and now we have Var(W') = 2(n-1)/n^2 σ^4. The only thing you need to verify is that 2(n-1)/n^2 is upper bounded by 2/(n-1), which holds because (n-1)^2 < n^2. So the variance of W' is the smaller one.

Now, for the mean squared error I have MSE(W') = 2(n-1)/n^2 σ^4 plus the square of the bias, and the bias is -σ^2/n. By the way, we made one more mistake earlier: by definition we should be taking the square of the bias, so the second term is (-σ^2/n)^2 = σ^4/n^2. Let us compare this with what we had for W. Since W is unbiased, its bias term is 0, so MSE(W) is simply Var(W) = 2σ^4/(n-1); that expression is the MSE of W, and what I have here is the MSE of W'.

Can we say whether MSE(W') is going to be smaller than MSE(W)? We just argued that the first term, the variance, is smaller for W', but another term has come in because of the biasedness, and it adds positively. So a priori it is unclear which of the two has the better mean squared error. The point is this: it is not necessary that an unbiased estimator also gives you the smaller mean squared error. It may happen that W', which is biased, ends up with a smaller mean squared error than your unbiased estimator. Does anybody have any doubt about this example? Are you able to follow what I have written here? Is the motivation clear? I may have an unbiased estimator; its bias is 0, fine, but when it comes to mean squared error it may not be the lowest one. Another, biased estimator may give me a smaller mean squared error, and I want to evaluate my estimator with respect to its mean squared error. So then the question comes: what is the smallest mean squared error I can get? That is where the Cramér-Rao bound comes into the picture; it provides a lower bound on the variance of my estimators. So fine, let me give you two estimators, say W and W'.
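To make the comparison concrete, here is a minimal simulation sketch (my own illustration, not part of the lecture) for normal samples: W is the usual unbiased sample variance dividing by n-1, and W' is the biased version dividing by n. The variable names and the choices n = 10, σ^2 = 4 are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, sigma2 = 10, 200_000, 4.0   # sample size, repetitions, true variance

# `trials` independent samples of size n from N(0, sigma^2)
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))

w_unbiased = x.var(axis=1, ddof=1)  # W: divides by n-1, unbiased
w_prime = x.var(axis=1, ddof=0)     # W': divides by n, biased

for name, est in [("W (unbiased)", w_unbiased), ("W' (biased)", w_prime)]:
    bias = est.mean() - sigma2
    mse = np.mean((est - sigma2) ** 2)
    print(f"{name}: bias = {bias:+.4f}, MSE = {mse:.4f}")

# Closed forms for normal data: MSE(W) = 2*sigma^4/(n-1), MSE(W') = (2n-1)*sigma^4/n^2
print("theory:", 2 * sigma2**2 / (n - 1), (2 * n - 1) * sigma2**2 / n**2)
```

In this particular normal example the biased W' comes out with the smaller MSE, which is exactly the possibility being argued above.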
I said that W is unbiased, but W' has a smaller mean squared error. Which one do you choose? You like unbiased, so you want W? Anybody here who feels W' should be better? Why not W? That depends, right. What does W being unbiased say? It says that on average it gives the right value; it is not saying each estimate is correct, but that on average I will be close to the true point.

Now let me ask you another question. Let us say W and W' are both unbiased, and W' has a smaller mean squared error. Which one are you going to choose? W'. If that is the criterion, then once I have said both of them are unbiased, their bias terms are 0, and what matters is only the variance of the estimator: since the second term is anyway 0, I should worry only about which estimator's variance is smaller. Now that is where the Cramér-Rao bound comes in, and it tells us the smallest variance we can expect.

The Cramér-Rao bound says the following. Suppose we have samples coming from a distribution with pdf f(x|θ), parameterized by θ, and W(X) is any estimator; right now we are not saying whether it is biased or unbiased, and we are assuming its variance is finite. Then the variance of that estimator has to be at least

Var_θ(W) >= (d/dθ E_θ[W(X)])^2 / E_θ[(∂/∂θ log f(X|θ))^2].

It is not that I can achieve an arbitrarily small variance; whatever the estimator, there is a lower bound governed by this quantity. And this bound holds under a certain condition. The condition is (there was a typo on the slide; it should be d/dθ here) that the derivative of the expected value of W(X) as a function of θ satisfies

d/dθ E_θ[W(X)] = ∫ W(x) (∂/∂θ) f(x|θ) dx.

What is this condition asking for? Can anybody see? Let us write out the left-hand side: it is d/dθ ∫ W(x) f(x|θ) dx, which is just the derivative of the expectation. Now, what is the difference between the left-hand side and the right-hand side? Yes: on the right, the derivative has been taken inside the integral. So I am basically saying I will be able to interchange my integration and differentiation. Is that always possible? Not always, and that is exactly why it is stated as a condition; whenever your pdf satisfies it, the bound applies.

Let us have a quick discussion about how this Cramér-Rao bound comes out. It is actually an intelligent application of a bound on the covariance, obtained by exploiting a property of the correlation coefficient. By the way, what is the correlation coefficient of two random variables X and Y? It is the covariance of X and Y divided by the product of their standard deviations, that is, by the square root of Var(X) Var(Y). And we know that this ratio always lies between -1 and 1. So if I take the square, I get the condition Cov(X,Y)^2 <= Var(X) Var(Y).
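As a quick numerical sanity check on that last inequality, here is a small sketch of my own (not from the lecture; the particular way Y is built from X is just an assumption to get correlated variables):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two correlated random variables: Y is a scaled X plus independent noise
x = rng.normal(size=1_000_000)
y = 0.7 * x + rng.normal(size=1_000_000)

cov_xy = np.cov(x, y)[0, 1]            # sample covariance
var_x, var_y = x.var(ddof=1), y.var(ddof=1)

rho = cov_xy / np.sqrt(var_x * var_y)  # correlation coefficient, always in [-1, 1]
print(f"rho = {rho:.4f}")
print("Cov^2 <= Var(X)*Var(Y)?", cov_xy**2 <= var_x * var_y)

# Rearranged form used next in the lecture: Var(X) >= Cov(X,Y)^2 / Var(Y)
print("lower bound:", cov_xy**2 / var_y, " actual Var(X):", var_x)
```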
And now this gives me one way to get a lower bound on a variance term: rearranging, Var(X) >= Cov(X,Y)^2 / Var(Y). Everybody agrees with this? What the Cramér-Rao bound does is define the random variables X and Y appropriately so that this inequality becomes the bound we want. So let me see if we can quickly argue this. Fine, we agree with this.

Now let us start with the quantity d/dθ E_θ[W(X)]. By our assumption, this equals ∫ W(x) (∂/∂θ) f(x|θ) dx; note that W(x) is a function of x alone, while f(x|θ) is also a function of θ. Then what I will do is multiply and divide by f(x|θ):

d/dθ E_θ[W(X)] = ∫ W(x) [(∂/∂θ) f(x|θ) / f(x|θ)] f(x|θ) dx.

Everybody agrees I have done nothing; it is just an algebraic manipulation. Now, the quantity in brackets can be written as the derivative of a logarithm, (∂/∂θ) log f(x|θ), and f(x|θ) still remains as the density. So the whole integral is nothing but an expectation:

d/dθ E_θ[W(X)] = E_θ[W(X) (∂/∂θ) log f(X|θ)].

Notice that the x's here are capital letters, because now they are random variables: W(X) is one random variable, and the entire quantity (∂/∂θ) log f(X|θ) I am taking as another random variable. Everybody agrees with this relation?

Now, this should hold for any W(x); we did not make any assumption about what W(x) is. So now I am going to assume an estimator which is constant all the time, W(x) = 1; the relation should hold even for this W. For this W, what is the LHS going to be? It is d/dθ of E[1], and E[1] = 1, and d/dθ of 1 is 0, because 1 does not depend on θ. And the right-hand side, with W(x) set to 1, is E[(∂/∂θ) log f(X|θ)]. So, since the left-hand and right-hand quantities are equal irrespective of what your W is, what I have shown is

E_θ[(∂/∂θ) log f(X|θ)] = 0.

Now I can interpret the earlier quantity as a correlation between W(X) and this other random variable. Can I interpret it like that? The covariance between W(X) and (∂/∂θ) log f(X|θ) is the expected value of their product minus the product of their expected values, and the second part vanishes because we have just shown the expectation of (∂/∂θ) log f(X|θ) to be 0. So

Cov(W(X), (∂/∂θ) log f(X|θ)) = d/dθ E_θ[W(X)].

That is it; now we plug all these quantities in.
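Both facts we just derived (the zero-mean score and the covariance identity) can be checked numerically. Here is a small sketch of my own, assuming a single observation X ~ N(θ, 1), for which the score (∂/∂θ) log f(X|θ) works out to X - θ, and taking W(X) = X purely for illustration, so that E[W] = θ and d/dθ E[W] = 1:

```python
import numpy as np

rng = np.random.default_rng(2)
theta = 3.0

# One observation X ~ N(theta, 1); its score is d/dtheta log f(X|theta) = X - theta
x = rng.normal(theta, 1.0, size=1_000_000)
score = x - theta

# Fact 1: the score has zero mean
print("E[score] ~", score.mean())

# Fact 2: Cov(W(X), score) = d/dtheta E[W(X)]; with W(X) = X the theory value is 1
w = x
cov = np.mean(w * score) - w.mean() * score.mean()
print("Cov(W, score) ~", cov, "(theory: 1)")
```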
So, what we did: let us go back to the inequality. In Var(X) >= Cov(X,Y)^2 / Var(Y), I am calling W(X) my X, and I am calling Y = (∂/∂θ) log f(X|θ). The covariance of X and Y, that is, of W(X) and this quantity, I have just demonstrated to be d/dθ E_θ[W(X)]. So I replace the numerator Cov(X,Y)^2 by (d/dθ E_θ[W(X)])^2; everybody agrees with the numerator?

Now, the denominator. What I actually have in the denominator is Var(Y). I know that Var(Y) = E[Y^2] - (E[Y])^2, but Y is (∂/∂θ) log f(X|θ), and we have just argued that its expected value is 0. So what matters is only E[Y^2], and that is what I have put there: E_θ[((∂/∂θ) log f(X|θ))^2]. Putting the two together gives

Var_θ(W) >= (d/dθ E_θ[W(X)])^2 / E_θ[((∂/∂θ) log f(X|θ))^2].

So that is how the Cramér-Rao bound works. And it so happens that if my samples X = (X_1, ..., X_n) are IID, then one can simplify the denominator: instead of taking the whole vector, we can take it as n times the single-sample quantity,

E_θ[((∂/∂θ) log f(X|θ))^2] = n E_θ[((∂/∂θ) log f(X_1|θ))^2].

That is a simplification, and with this we have the lower bound. I have just skipped this step, but here is where the n comes from; just go and work it out. Because the X_i are IID, I should be able to write the joint density as a product of the individual terms; because of the log, the product becomes a summation of n score terms. Then there is a square here: when you take the square of the summation and expand, the cross terms are expectations of products of independent zero-mean scores, so their value is 0, and the n identical squared terms add up to give the factor of n.
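To close the loop, here is a final sketch of my own (not from the lecture) checking both the IID simplification and the bound itself for the normal-mean problem: with X_1, ..., X_n ~ N(θ, σ^2) and σ^2 known, the per-sample score is (X_i - θ)/σ^2, the total information is n/σ^2, and the sample mean, which is unbiased, attains the Cramér-Rao bound σ^2/n exactly.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, sigma2, n, trials = 5.0, 2.0, 20, 200_000

# trials x n IID draws X_i ~ N(theta, sigma^2); estimate theta by the sample mean
x = rng.normal(theta, np.sqrt(sigma2), size=(trials, n))
w = x.mean(axis=1)

# Per-sample score: d/dtheta log f(x|theta) = (x - theta) / sigma^2
score = (x - theta) / sigma2
per_sample_info = np.mean(score[:, 0] ** 2)   # E[score^2] for one sample = 1/sigma^2
total_info = np.mean(score.sum(axis=1) ** 2)  # IID: should be n times the above

print("per-sample info ~", per_sample_info, "(theory:", 1 / sigma2, ")")
print("total info ~", total_info, "(theory:", n / sigma2, ")")

# Cramer-Rao for an unbiased estimator: Var(W) >= 1 / total information
print("Var(sample mean) ~", w.var(ddof=1), " CRB:", sigma2 / n)
```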