in module 1 on simple linear regression. In the first lecture we introduced the simple linear regression model and learned how to estimate the regression coefficients using the least-squares technique. Here is the content of today's lecture: first we will give an example of simple linear regression, then we will talk about useful properties of the least-squares fit, and then about the statistical properties of the least-squares estimators.

Well, let me just recall the simple linear regression model. Its general form is y = β₀ + β₁x + ε, where y is the response variable, x is the regressor variable, and ε is the error term; β₀ is the intercept and β₁ is the slope, and together β₀ and β₁ are called the regression coefficients. Now, given a set of observations (xᵢ, yᵢ), i = 1, …, n, we learned how to estimate β₀ and β₁ using the least-squares technique. The least-squares method determines the parameters, that is the regression coefficients β₀ and β₁, by minimizing the residual sum of squares SS_res = Σᵢ eᵢ², where the residual eᵢ = yᵢ − ŷᵢ is the difference between the observed response value yᵢ and the fitted value ŷᵢ = β̂₀ + β̂₁xᵢ, so that SS_res = Σ (yᵢ − β̂₀ − β̂₁xᵢ)². And we learned that SS_res is minimized when β̂₀ = ȳ − β̂₁x̄ and β̂₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)².
Well, this quantity β̂₁ is also denoted Sxy/Sxx, and it can equivalently be written as (Σxᵢyᵢ − n·x̄·ȳ) / (Σxᵢ² − n·x̄²). This is not difficult to observe; simple algebra shows the two forms are equal. So let us move to this toy example. Here x is the cost of advertising and y is the sales amount: the xᵢ values and the yᵢ values are given, and we want to fit a straight-line model to these data. The table shows Σxᵢ = 15, Σyᵢ = 10, Σxᵢ² = 55, Σyᵢ² = 26, and Σxᵢyᵢ = 37, with n = 5 observations. Then we can compute β̂₁ = (Σxᵢyᵢ − n·x̄·ȳ) / (Σxᵢ² − n·x̄²) = 0.7, and similarly β̂₀ = ȳ − β̂₁x̄ = −0.1. So the fitted equation for the given observations is ŷ = −0.1 + 0.7x. Now, what is the interpretation of the regression coefficient β̂₁? It says that the expected value of the response variable, the expected sales amount, increases by 0.7 units for each 1-dollar increase in advertising. The interpretation of β̂₀ is the average sales amount when x = 0: β̂₀ = ŷ at x = 0, so it gives some idea of the average sales amount when the advertising cost is zero. It is difficult to interpret this value here, because we could expect some positive sales even without advertising, yet the estimate is negative: β̂₀ = −0.1 is the fitted average sales amount when the advertising cost is zero.
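The toy-example computation above can be reproduced directly from the stated sums. The individual (xᵢ, yᵢ) pairs are not fully reproduced in the transcript, so this sketch works only from the totals, assuming n = 5 as implied by the computation:

```python
# Least-squares fit for the lecture's toy advertising example,
# computed from the stated sums; n = 5 is assumed (implied by the result 0.7).
n = 5
sum_x, sum_y = 15.0, 10.0
sum_x2, sum_xy = 55.0, 37.0

x_bar = sum_x / n   # 3.0
y_bar = sum_y / n   # 2.0

# beta1_hat = (sum x_i y_i - n x_bar y_bar) / (sum x_i^2 - n x_bar^2)
beta1_hat = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar**2)
beta0_hat = y_bar - beta1_hat * x_bar

print(beta1_hat, beta0_hat)  # approximately 0.7 and -0.1, as in the lecture
```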
Well, next we move to the useful properties of the least-squares fit. We know what a residual is: eᵢ is the difference between the observed response value and the fitted response value. The first property says that the sum of the residuals in any regression model that contains an intercept β₀ is always zero, Σeᵢ = 0; I am going to prove this one. The second property says that the sum of the observed values yᵢ equals the sum of the fitted values ŷᵢ; this second property is a consequence of the first, it is obtained from it directly. Let me prove the first property. What we have to do is recall the least-squares (LS) method. The LS method determines the parameters β̂₀ and β̂₁ by minimizing SS_res, the residual sum of squares, which is Σeᵢ² = Σ(yᵢ − ŷᵢ)² = Σ(yᵢ − β̂₀ − β̂₁xᵢ)². So what we do is differentiate SS_res with respect to β̂₀ and equate that to zero, which gives the first normal equation; differentiating SS_res with respect to β̂₁ gives the second normal equation. There are two normal equations. The first one, from differentiating SS_res with respect to β̂₀, is −2 Σ(yᵢ − β̂₀ − β̂₁xᵢ).
So we equate this to zero; this is the first normal equation. And note that yᵢ − β̂₀ − β̂₁xᵢ is nothing but yᵢ − ŷᵢ = eᵢ, so the first normal equation says exactly that Σeᵢ = 0, the sum of the residuals is zero. That is the first property. The second property follows: Σeᵢ = Σ(yᵢ − ŷᵢ) = 0 implies Σyᵢ = Σŷᵢ. So this one is a consequence of the first property; it says that the sum of the observed values equals the sum of the fitted values. Next we move to the third property, which says Σxᵢeᵢ = 0: the sum of the residuals weighted by the corresponding values of the regressor variable is zero. And the fourth property says Σŷᵢeᵢ = 0: the sum of the residuals weighted by the corresponding fitted values of the response variable is zero. Well, let me prove these two properties, property 3 and property 4. By differentiating the residual sum of squares with respect to β̂₀ we got the first normal equation. Now we differentiate SS_res with respect to β̂₁ instead, which gives Σ(yᵢ − β̂₀ − β̂₁xᵢ) multiplied by −2xᵢ.
So −2 Σ(yᵢ − β̂₀ − β̂₁xᵢ)xᵢ = 0: this is the second normal equation, and since yᵢ − β̂₀ − β̂₁xᵢ = yᵢ − ŷᵢ = eᵢ, it says Σeᵢxᵢ = 0, which is the third property. Now, the fourth property is Σeᵢŷᵢ = 0; you should note that Σeᵢyᵢ is in general not equal to zero. How do we prove Σeᵢŷᵢ = 0? It is a consequence of the first and third properties. Since ŷᵢ = β̂₀ + β̂₁xᵢ, we can write Σeᵢŷᵢ = Σeᵢ(β̂₀ + β̂₁xᵢ) = β̂₀ Σeᵢ + β̂₁ Σeᵢxᵢ. From the first property Σeᵢ = 0, and from the third property Σeᵢxᵢ = 0, so this is β̂₀·0 + β̂₁·0 = 0. So we have proved Σeᵢŷᵢ = 0. These are some properties of the least-squares fit, and we will be using them in the future.

Next let us move to the statistical properties of the least-squares estimators. We have estimated the regression coefficients β₀ and β₁ by least squares, and we are going to prove that β̂₀ and β̂₁ are unbiased estimators of β₀ and β₁ respectively. Well, first I will prove that β̂₀ and β̂₁ are linear combinations of the observations, that is, that they are linear estimators. For example, consider β̂₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)². Now, I claim this is a linear combination of the observations yᵢ, because it can be written as Σ(xᵢ − x̄)yᵢ / Σ(xᵢ − x̄)².
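The four residual properties proved above can be checked numerically on any data set. This is a minimal sketch using arbitrary illustrative values, not data from the lecture:

```python
# Numerical check of the four residual properties of a least-squares fit
# with intercept.  The data are arbitrary illustrative values.
x = [1.0, 2.0, 4.0, 5.0, 7.0]
y = [2.1, 2.9, 4.2, 5.1, 6.8]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx                 # slope estimate
b0 = y_bar - b1 * x_bar        # intercept estimate

y_hat = [b0 + b1 * xi for xi in x]
e = [yi - yh for yi, yh in zip(y, y_hat)]   # residuals

p1 = sum(e)                                      # Property 1: sum of residuals
p2 = sum(y) - sum(y_hat)                         # Property 2: observed vs fitted sums
p3 = sum(xi * ei for xi, ei in zip(x, e))        # Property 3: x-weighted residuals
p4 = sum(yh * ei for yh, ei in zip(y_hat, e))    # Property 4: fitted-value-weighted residuals
print(p1, p2, p3, p4)  # all approximately 0 (up to floating-point error)
```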
These two quantities are the same because Σ(xᵢ − x̄)ȳ = ȳ Σ(xᵢ − x̄) = 0; it is not difficult to prove. All sums here run over i = 1 to n. So β̂₁ can be written as Σcᵢyᵢ, where cᵢ = (xᵢ − x̄) / Σ(xⱼ − x̄)². So I have proved that β̂₁ is a linear combination of the observations yᵢ. Similarly, one can prove that β̂₀ is too: β̂₀ = ȳ − β̂₁x̄, and since ȳ = (1/n) Σyᵢ is a linear combination of the yᵢ and we just proved that β̂₁ is as well, the whole thing is a linear combination of the observations yᵢ. So the estimators we obtained are linear estimators: they are linear in the yᵢ.

Next let me prove that β̂₀ and β̂₁ are unbiased estimators, that is, E(β̂₁) = β₁; if this is true then we call β̂₁ an unbiased estimator of β₁. Well, let me start from the simple linear regression model yᵢ = β₀ + β₁xᵢ + εᵢ. Averaging over i, ȳ = (1/n) Σyᵢ = β₀ + β₁x̄ + ε̄, where xᵢ has been replaced by x̄ = (1/n) Σxᵢ and εᵢ by ε̄ = (1/n) Σεᵢ. Then, subtracting, yᵢ − ȳ = β₁(xᵢ − x̄) + (εᵢ − ε̄).
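The claim that the two forms of β̂₁ agree, and that β̂₁ = Σcᵢyᵢ with cᵢ = (xᵢ − x̄)/Sxx, can be verified directly. A small sketch with arbitrary illustrative data:

```python
# Verify that beta1_hat is a linear combination sum(c_i * y_i) with
# weights c_i = (x_i - x_bar) / Sxx, matching the standard formula.
# The data are arbitrary illustrative values.
x = [1.0, 3.0, 4.0, 6.0]
y = [1.2, 2.8, 3.9, 6.1]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Standard form: sum (x_i - x_bar)(y_i - y_bar) / Sxx
b1_standard = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

# Linear-estimator form: sum c_i y_i
c = [(xi - x_bar) / sxx for xi in x]
b1_linear = sum(ci * yi for ci, yi in zip(c, y))

print(abs(b1_standard - b1_linear))  # approximately 0: the two forms agree
```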
Now take the expected value of yᵢ − ȳ. One thing you should always remember: the response y is a random variable, but x is not; it is a controlled variable, so for a given i, xᵢ is just a constant. Hence E(yᵢ − ȳ) = β₁(xᵢ − x̄) + E(εᵢ − ε̄). Now, εᵢ is the random error, and we assumed εᵢ ~ N(0, σ²), so E(εᵢ) = 0 and similarly E(ε̄) = 0. Therefore E(yᵢ − ȳ) = β₁(xᵢ − x̄). Now, our aim is to prove that β̂₁ is an unbiased estimator of β₁. We have E(β̂₁) = E[Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²] = Σ(xᵢ − x̄) E(yᵢ − ȳ) / Σ(xᵢ − x̄)². Note that we proved E(yᵢ − ȳ) = β₁(xᵢ − x̄), so E(β̂₁) = β₁ Σ(xᵢ − x̄)² / Σ(xᵢ − x̄)² = β₁. So we have proved that β̂₁ is an unbiased estimator of β₁. Similarly, next we prove that β̂₀ is also unbiased, that is, E(β̂₀) = β₀. We have E(β̂₀) = E(ȳ − β̂₁x̄). So, what is ȳ?
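The unbiasedness result just proved can be illustrated by simulation: generate many data sets from a known model, fit each, and check that the estimates average out to the true coefficients. The true parameters and design points below are illustrative choices, not from the lecture:

```python
import random

# Monte Carlo check that beta1_hat and beta0_hat are unbiased.
# True parameters and the x design are illustrative assumptions.
random.seed(0)
beta0, beta1, sigma = -0.1, 0.7, 1.0
x = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

reps = 20000
b0_sum = b1_sum = 0.0
for _ in range(reps):
    # Simulate y_i = beta0 + beta1 * x_i + eps_i with eps_i ~ N(0, sigma^2)
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    b1_sum += b1
    b0_sum += b0

print(b1_sum / reps)  # close to beta1 = 0.7
print(b0_sum / reps)  # close to beta0 = -0.1
```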
We proved on the previous slide that ȳ = β₀ + β₁x̄ + ε̄, so E(β̂₀) = E(ȳ) − E(β̂₁)x̄. Now E(ȳ) = β₀ + β₁x̄, and we just proved that E(β̂₁) = β₁, so E(β̂₀) = β₀ + β₁x̄ − β₁x̄ = β₀. So we have proved that both β̂₁ and β̂₀ are unbiased.

Next we talk about the variances of β̂₁ and β̂₀. The variance of β̂₁ is σ²/Sxx and the variance of β̂₀ is σ²(1/n + x̄²/Sxx); we need to know how to derive these. So first the variance of β̂₁. We have Var(β̂₁) = Var[Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²], which can be written as Var[Σ(xᵢ − x̄)yᵢ / Σ(xᵢ − x̄)²]. And as I proved before, this estimator is a linear combination of the observations: Var(β̂₁) = Var(Σcᵢyᵢ), where cᵢ = (xᵢ − x̄) / Σ(xⱼ − x̄)². Now, since the yᵢ are independent, we know that Var(Σcᵢyᵢ) = Σcᵢ² Var(yᵢ), and Var(yᵢ) = σ², so this equals σ² Σcᵢ². Now, what is Σcᵢ²? We know what cᵢ is.
So Var(β̂₁) = σ² Σcᵢ² = σ² Σ(xᵢ − x̄)² / [Σ(xᵢ − x̄)²]², just replacing cᵢ² by its value, and this simplifies to σ² / Σ(xᵢ − x̄)², which is σ²/Sxx in our notation. So we have proved Var(β̂₁) = σ²/Sxx. Next we talk about the variance of β̂₀. We have Var(β̂₀) = Var(ȳ − β̂₁x̄), which can be written as Var(ȳ) + Var(β̂₁x̄) − 2x̄ Cov(ȳ, β̂₁). Now, Var(ȳ) = σ²/n; it is not difficult to prove. Let me prove it: Var(ȳ) = Var((1/n) Σyᵢ), and the yᵢ are independent (that is the assumption we made at the beginning: the εᵢ are independent, so the yᵢ are too). So this equals (1/n²) Σ Var(yᵢ) = (1/n²) · nσ² = σ²/n. We know Var(β̂₁), and x̄ is a constant, so Var(β̂₁x̄) = x̄² Var(β̂₁) = x̄²σ²/Sxx, as we just proved. Now, what about the covariance term? This covariance is going to be zero, so the term is 2x̄ · 0 = 0, but we need to prove that Cov(ȳ, β̂₁) = 0.
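The formula Var(β̂₁) = σ²/Sxx can also be illustrated by simulation: the empirical variance of the slope estimates over many simulated data sets should match σ²/Sxx. The design points and σ below are illustrative assumptions:

```python
import random

# Monte Carlo check that Var(beta1_hat) = sigma^2 / Sxx.
# Design points and parameters are illustrative assumptions.
random.seed(1)
beta0, beta1, sigma = 1.0, 2.0, 0.5
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)   # Sxx = 17.5 for this design

reps = 40000
b1s = []
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / n
    b1s.append(sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx)

mean_b1 = sum(b1s) / reps
var_b1 = sum((b - mean_b1) ** 2 for b in b1s) / reps

print(var_b1, sigma**2 / sxx)  # empirical and theoretical variance, close to each other
```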
The covariance between ȳ and β̂₁ is Cov[(1/n) Σyᵢ, Σ(xⱼ − x̄)yⱼ / Σ(xⱼ − x̄)²]. Now, the yᵢ are independent, so Cov(yᵢ, yⱼ) = 0 for i ≠ j, and only the terms Cov(yᵢ, yᵢ) = Var(yᵢ) = σ² survive. So the covariance equals Σ(xᵢ − x̄) σ² / [n Σ(xᵢ − x̄)²]. But Σ(xᵢ − x̄) = 0 always, so the numerator is zero, and the covariance is zero. So we have proved that Cov(ȳ, β̂₁) = 0. That means Var(β̂₀) = σ²/n + x̄² Var(β̂₁) = σ²/n + x̄²σ²/Sxx = σ²(1/n + x̄²/Sxx).

So we have proved that β̂₀ and β̂₁ are unbiased estimators of β₀ and β₁, and also that Var(β̂₀) = σ²(1/n + x̄²/Sxx) and Var(β̂₁) = σ²/Sxx. Now, notice that both variance formulas, for β̂₀ and for β̂₁, involve σ². But we do not know the value of σ², so σ² must be replaced by an estimator: we need to estimate σ².
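Both facts just derived, Cov(ȳ, β̂₁) = 0 and Var(β̂₀) = σ²(1/n + x̄²/Sxx), can be checked by the same kind of simulation. The true parameters and design below are illustrative assumptions:

```python
import random

# Monte Carlo check that Cov(y_bar, beta1_hat) is ~0 and that
# Var(beta0_hat) = sigma^2 * (1/n + x_bar^2 / Sxx).
# Parameters and design points are illustrative assumptions.
random.seed(2)
beta0, beta1, sigma = 0.5, 1.5, 1.0
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

reps = 40000
b0s, b1s, ybars = [], [], []
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0s.append(y_bar - b1 * x_bar)
    b1s.append(b1)
    ybars.append(y_bar)

m_b0 = sum(b0s) / reps
m_b1 = sum(b1s) / reps
m_yb = sum(ybars) / reps
var_b0 = sum((b - m_b0) ** 2 for b in b0s) / reps
cov = sum((yb - m_yb) * (b - m_b1) for yb, b in zip(ybars, b1s)) / reps

print(cov)                                          # approximately 0
print(var_b0, sigma**2 * (1 / n + x_bar**2 / sxx))  # empirical vs theoretical variance
```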
So next we talk about the estimation of σ². Well, the estimate of σ² is obtained from the residual sum of squares SS_res, and finally we will be proving that E[SS_res / (n − 2)] = σ². That means SS_res / (n − 2) is an unbiased estimator of σ²; and since we can compute the value of SS_res from a given set of observations and we know n, this formula lets us estimate the population variance σ². Well, we need to set this up. SS_res = Σeᵢ², as we know, and the ith residual is the difference between the ith observation and the fitted value ŷᵢ, so SS_res = Σ(yᵢ − ŷᵢ)² = Σ(yᵢ − β̂₀ − β̂₁xᵢ)². What we do here, trying to find a convenient form for SS_res, is use the fact that β̂₀ = ȳ − β̂₁x̄ and plug this value in: SS_res = Σ(yᵢ − ȳ − β̂₁xᵢ + β̂₁x̄)² = Σ[(yᵢ − ȳ) − β̂₁(xᵢ − x̄)]². Now this can be expanded as Σ(yᵢ − ȳ)² + β̂₁² Σ(xᵢ − x̄)² − 2β̂₁ Σ(xᵢ − x̄)(yᵢ − ȳ), which in our notation is Syy + β̂₁²Sxx − 2β̂₁Sxy. Now, we know that β̂₁ = Sxy/Sxx, so what I will do is replace Sxy by β̂₁Sxx.
So, replacing Sxy by β̂₁Sxx in the last term, SS_res = Syy + β̂₁²Sxx − 2β̂₁ · (β̂₁Sxx) = Syy + β̂₁²Sxx − 2β̂₁²Sxx = Syy − β̂₁²Sxx, which equals Syy − β̂₁Sxy. So this is the convenient form of SS_res, and we are going to use it in the next class: we need to find its expected value to prove that E[SS_res / (n − 2)] = σ². Well, so we will continue in the next class. Thank you.
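The algebraic identity just derived, SS_res = Syy − β̂₁Sxy, can be verified numerically against the direct definition Σ(yᵢ − ŷᵢ)². A minimal sketch with arbitrary illustrative data:

```python
# Numerical check of the convenient form SS_res = Syy - beta1_hat * Sxy,
# using arbitrary illustrative data.
x = [1.0, 2.0, 3.0, 5.0, 8.0]
y = [1.1, 2.3, 2.8, 5.2, 7.9]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Direct definition: sum of squared residuals
ss_res_direct = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
# Convenient form from the derivation
ss_res_short = syy - b1 * sxy

print(abs(ss_res_direct - ss_res_short))  # approximately 0: the two forms agree
```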