regression, where the multiple linear regression model was fitted and its significance and usefulness were tested using the global test; with the partial test we tested the significance of, say, x1 in the presence of x2, and the test statistic we used there was the t statistic. Well, the same testing can also be done using the extra sum of squares technique, so I am going to talk about how to test the significance of one regressor variable in the presence of the other regressor variables using the extra sum of squares technique. This technique is usually used to test for several parameters being 0, but here I am using it to test the simple hypothesis H0: β2 = 0 against the alternative hypothesis H1: β2 ≠ 0. We have already tested this hypothesis using the t statistic, but now we will use the extra sum of squares technique. This hypothesis can also be written as the null hypothesis H0: y = β0 + β1 x1 + ε against the alternative H1: y = β0 + β1 x1 + β2 x2 + ε. So the question is whether it is enough to consider the restricted model or necessary to consider the full model; the null hypothesis says it is okay to go for the restricted model. What we do in the extra sum of squares technique is compute SS_regression for both models: first for the full model, and also for the restricted model, that is, the SS_regression value when there is only one regressor in the model and the SS_regression value when there are two regressor variables in the model. For
these two values, I refer to my previous class. This is the ANOVA table for the full model from my last class: the SS_regression value is 122 and the SS_residual value is 68. So I copy these two values: for the full model, SS_regression = 122 and SS_residual = 68. Now I refer to my previous lecture for the restricted model, y = β0 + β1 x1 + ε, which we fitted using only one regressor; from its ANOVA table, SS_regression for the restricted model is 116. We know that SS_regression increases as the number of regressor variables increases. Now the difference, SS_regression for the full model minus SS_regression for the restricted model, is 122 − 116 = 6. This is called the extra sum of squares due to β2, that is, the extra regression sum of squares due to the regressor x2, because the first SS_regression involves both x1 and x2 while the second involves only x1, so the difference is the regression sum of squares due to x2. Now the F statistic for this extra sum of squares technique is (SS_regression for the full model minus SS_regression for the restricted model), divided by its degrees of freedom, which here is 1 because β2 is not a vector but a single regression coefficient corresponding to x2, all divided by (SS_residual for the full model divided by its degrees of freedom, which is 8; you can refer to my previous class). So this is
equal to 6 divided by MS_residual, which is 8.5, giving F ≈ 0.7. We know that this F statistic follows the F distribution with 1 and 8 degrees of freedom. So we compare the observed value F = 0.7 with the tabulated value F_{0.05}(1, 8), which is about 5.32. The observed value is less than the tabulated value, so the conclusion is that the hypothesis H0 is accepted: we accept that β2 = 0, meaning the regressor variable x2 is not significant in the presence of x1 in the model. So we got the same result, I mean we concluded the same thing, using the t test; this is just another way to do the same testing, using the extra sum of squares technique. I also want to mention that if you use the t statistic, its value is −0.83, and you can check that t² is almost equal to F; this is true in general, so whether you go for the t test or the extra sum of squares method to test this hypothesis, you will get the same result. Well, next, the content of today's lecture: we will talk about confidence intervals on regression coefficients, also the confidence interval on the mean response, and, once the model has been fitted, one important issue, the prediction of a new observation for given values of the regressor variables. So next we talk about confidence intervals on regression coefficients.
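Before turning to confidence intervals, the extra sum of squares test above can be sketched numerically. The SS values (122, 116, 68) and the residual degrees of freedom (8) are the numbers quoted from the earlier ANOVA tables, and F_{0.05}(1, 8) ≈ 5.32 is a standard F-table value:

```python
# Partial F test via the extra sum of squares, using the lecture's numbers:
#   full model      : SS_regression = 122, SS_residual = 68, residual df = 8
#   restricted model: SS_regression = 116
ss_reg_full = 122.0
ss_reg_restricted = 116.0
ss_res_full = 68.0
df_res_full = 8

extra_ss = ss_reg_full - ss_reg_restricted   # extra SS due to x2, with df = 1
ms_res = ss_res_full / df_res_full           # MS_residual of the full model = 8.5
f_obs = (extra_ss / 1) / ms_res              # observed F statistic

f_crit = 5.32                                # tabulated F_{0.05}(1, 8)
print(f_obs, f_obs < f_crit)                 # ~0.71, True -> accept H0: beta_2 = 0
```

Note that f_obs ≈ 0.706 agrees with t² = (−0.83)² ≈ 0.69 up to rounding, which is the general t²-equals-F relationship mentioned above.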
So, here the regression coefficient is β, which is the vector (β0, β1, …, β_{k−1})', and we want to find the confidence interval for βi for any i. To find the confidence interval, first we need the point estimator of β. We know that β̂ = (X'X)⁻¹ X'y is an unbiased estimator of β, so this is our point estimator, and we also know that Var(β̂) = σ² (X'X)⁻¹. From here we can say that β̂i, the estimator of the i-th regression coefficient, follows the normal distribution with mean βi and variance σ² (X'X)⁻¹_{ii}; we have used this notation several times, (X'X)⁻¹_{ii} being the i-th diagonal element of (X'X)⁻¹. From here I can write that (β̂i − βi) / √(MS_residual · (X'X)⁻¹_{ii}), where I am replacing σ² by MS_residual, follows the t distribution with n − k degrees of freedom, because we have k − 1 regressors plus the intercept. Then obviously the probability that the absolute value of (β̂i − βi) / √(MS_residual · (X'X)⁻¹_{ii}) is less than or equal to t_{α/2, n−k} equals 1 − α; if you choose α = 0.05, then this probability is 0.95. So from here we get the 100(1 − α) percent confidence interval for the parameter βi: βi lies between the lower bound β̂i − t_{α/2, n−k} √(MS_residual · (X'X)⁻¹_{ii}) and the upper bound β̂i + t_{α/2, n−k} √(MS_residual · (X'X)⁻¹_{ii}).
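This interval can be computed directly from the matrix formulas. Here is a minimal numpy sketch; the data set and the table value t_{0.025, 4} = 2.776 are made up for illustration, not the lecture's example:

```python
import numpy as np

# Made-up data: n = 7 observations, k = 3 parameters (intercept plus two
# regressors), so residual df = n - k = 4.
X = np.array([[1, 1, 2],
              [1, 2, 1],
              [1, 3, 4],
              [1, 4, 3],
              [1, 5, 6],
              [1, 6, 5],
              [1, 7, 8]], dtype=float)
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9, 15.2])

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                  # least squares estimate
residuals = y - X @ beta_hat
n, k = X.shape
ms_res = residuals @ residuals / (n - k)      # MS_residual

t_crit = 2.776                                # table value t_{0.025, n-k}
half_width = t_crit * np.sqrt(ms_res * np.diag(XtX_inv))
lower, upper = beta_hat - half_width, beta_hat + half_width
for i in range(k):
    print(f"beta_{i}: {beta_hat[i]:.3f} in [{lower[i]:.3f}, {upper[i]:.3f}]")
```

The diagonal of (X'X)⁻¹ supplies the (X'X)⁻¹_{ii} terms, so all k intervals are computed at once.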
So, this is the 95 percent confidence interval for the i-th regression coefficient when α = 0.05; similarly you can get confidence intervals for the other coefficients. Next we will talk about the confidence interval on the mean response at a particular point, say x0 = (1, x01, x02, …, x0,k−1); these are the values of the first regressor, the second regressor, and so on. At this point we are looking for the expected response value, and we want the confidence interval for this expected value, or mean response, at the point x0. The usual technique to find a confidence interval is first to look for a point estimator. What is this quantity? The mean response at x0 is nothing but x0 β, where x0 is a 1 × k vector and β is a k × 1 vector. Well, we know that an unbiased estimator of this expected response at the point x0 is x0 β̂, because β̂ is an unbiased estimator of β: you can prove that E(x0 β̂) = x0 E(β̂) = x0 β. Next we compute the variance of this unbiased estimator ŷ0 = x0 β̂: Var(ŷ0) = Var(x0 β̂) = x0 Var(β̂) x0', and since we know Var(β̂) = σ² (X'X)⁻¹, this gives Var(ŷ0) = σ² x0 (X'X)⁻¹ x0'.
So, from here, see, my unbiased estimator ŷ0 has this expectation and this variance, and I can say that (ŷ0 − x0 β) / √(MS_residual · x0 (X'X)⁻¹ x0'), where I am just replacing σ² by MS_residual, follows the t distribution with n − k degrees of freedom. Using this, I can now give the 100(1 − α) percent confidence interval on the mean response at the point x0: the upper bound of the interval is x0 β̂, which is nothing but ŷ0, plus t_{α/2, n−k} √(MS_residual · x0 (X'X)⁻¹ x0'), and the lower bound is obtained by just replacing the plus sign by minus. So this is the confidence interval for the expected response at the point x = x0. Well, next we will talk about the prediction of a new observation. This is a very important aspect, because once you have the fitted model, if it is a significant one, then the regression model can be used to predict new observations corresponding to particular values of the regressor variables. Here what we want is to predict the value of a new observation at the point, say, x0 = (1, x01, …, x0,k−1). For simple linear regression we also had the same sort of problem. Note the difference between the expected response and a new observation at the point x0: the expected response at the point x0 is nothing but x0 β.
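The mean-response interval derived above can be sketched in code. The data, the point x0, and the table value t_{0.025, 4} = 2.776 are made up for illustration:

```python
import numpy as np

# 95% confidence interval on the mean response at a point x0 (made-up data:
# n = 6, k = 2, so residual df = 4).
X = np.array([[1, 1], [1, 2], [1, 3], [1, 4], [1, 5], [1, 6]], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
n, k = X.shape
ms_res = (y - X @ beta_hat) @ (y - X @ beta_hat) / (n - k)

x0 = np.array([1.0, 3.5])                    # point of interest (1, x01)
y0_hat = x0 @ beta_hat                       # estimated mean response x0 beta_hat
se_mean = np.sqrt(ms_res * (x0 @ XtX_inv @ x0))
t_crit = 2.776                               # table value t_{0.025, n-k}
print(y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean)
```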
But here what we are looking for is to predict the future observation y0 at the point x = x0, that is, for the given values of the regressor variables. So we are trying to predict y0, which, according to the model, is y0 = x0 β + ε. In the previous case we tried to estimate the expected response; here we are trying to predict the value of y0 itself. That is the difference, but the starting point is the same: we start with the point estimator. A point estimator of the future observation y0 at the point x0 is again ŷ0 = x0 β̂. So you see, we are starting with the same point estimator that we used to estimate the expected response. But now we are predicting a random quantity rather than estimating a fixed parameter, because we have the ε term here. That is why we define a new random variable ψ = ŷ0 − y0, the same strategy as in the simple linear regression model. The expected value of this random variable is 0; this is not difficult to check, because E(ŷ0 − y0) = x0 β − x0 β = 0. And you can check that the variance is Var(ψ) = Var(ŷ0 − y0) = σ² (1 + x0 (X'X)⁻¹ x0'). Let me explain: the first term, σ², is the variance of y0, and the second term, σ² x0 (X'X)⁻¹ x0', is the variance of ŷ0; we know why that is so.
And these two are independent, because y0 is a new, independent future observation while ŷ0 is built from the previously given observations y1, y2, …, yn. So from here we can say that (ŷ0 − y0) / √(MS_residual · (1 + x0 (X'X)⁻¹ x0')), where I just replace σ² by MS_residual, follows the t distribution with n − k degrees of freedom. And from the distribution of this random variable we can get the prediction interval, also called the confidence interval for a future observation, for y0. Thus the 100(1 − α) percent prediction interval for y0 is x0 β̂, which is nothing but ŷ0, plus or minus t_{α/2, n−k} √(MS_residual · (1 + x0 (X'X)⁻¹ x0')). Well, so this is how we get the prediction interval for a future observation. Now let me give one example to illustrate this confidence interval or prediction interval. Consider the same example as in my last class; if you recall, there we had two regressors x1, x2 and the response y. The problem says: find the variance of the predicted value of y at the point x1 = 3 and x2 = 5. So here my x0 is, as I said, (1, x01, x02), which is nothing but (1, 3, 5). Now, the point estimator of y0, call it ŷ0, is nothing but x0 β̂, and what we want here is the variance of this point estimator.
So, the variance of ŷ0 is x0 Var(β̂) x0' = σ² x0 (X'X)⁻¹ x0'. Now, we cannot compute this variance exactly because σ² is not known, so we have to replace σ² by MS_residual, which is 8.5. Note that this variance is a scalar quantity. My x0 is (1, 3, 5), and (X'X)⁻¹ is the 3 × 3 matrix computed in my last lecture (with entries including 4.37, −0.4086, 0.849, 0.169, 0.08 and 0.0422; please refer to that lecture for the full matrix), and we multiply x0 (X'X)⁻¹ x0'. I am giving this example just to illustrate the theory I explained, and from here it is not difficult to check the dimensions: 1 × 3 times 3 × 3 times 3 × 1, so ultimately the value is a scalar, and the estimated variance of the predicted value of y at this point comes out to be 1.4. Once you have this variance, you can find the confidence interval or prediction interval very easily. So that is all regarding the confidence interval and prediction interval in multiple linear regression. Next, since I have some time, I want to solve one problem from simple linear regression; I hope this will help you understand more about degrees of freedom and how to calculate them. So consider this problem from simple linear regression: the model is y = β0 + β1 x + ε as usual, but here β0 is known. That is the only difference. So you really do not need to estimate β0, because its value is given; the only thing you need to do is estimate the value of the slope, β1. The first problem says: find the least squares estimator of β1.
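As a brief aside before that problem: the prediction-variance and prediction-interval computations above can be sketched in code. The data, the point x0, and the table value t_{0.025, 4} = 2.776 are made up for illustration; they are not the numbers from the lecture's own example:

```python
import numpy as np

# Estimated variance of a prediction, and the prediction interval, at x0.
# Made-up data: n = 6, k = 2, so residual df = 4.
X = np.array([[1, 1], [1, 2], [1, 3], [1, 4], [1, 5], [1, 6]], dtype=float)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
n, k = X.shape
ms_res = (y - X @ beta_hat) @ (y - X @ beta_hat) / (n - k)

x0 = np.array([1.0, 3.5])
y0_hat = x0 @ beta_hat
var_mean = ms_res * (x0 @ XtX_inv @ x0)              # estimated Var(y0_hat)
se_pred = np.sqrt(ms_res * (1 + x0 @ XtX_inv @ x0))  # extra "1 +" for a new obs

t_crit = 2.776                                       # table value t_{0.025, n-k}
print("estimated Var(y0_hat):", var_mean)
print("prediction interval:", (y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred))
```

The prediction interval is always wider than the mean-response interval at the same point, because of the additional σ² term for the new observation.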
So, you need to understand that to find the least squares estimator of β1, you minimize the SS_residual. Suppose your fitted model is ŷ = β0 + β̂1 x; I am not putting β̂0, because β0 is known. Then my i-th residual is ei = yi − ŷi = yi − β0 − β̂1 xi. Now SS_residual = Σ ei² for i = 1 to n, which is Σ (yi − β0 − β̂1 xi)². The least squares estimator of β1 is obtained by minimizing SS_residual, and here you have only one unknown parameter, so you just differentiate SS_residual with respect to β̂1 and set it equal to 0, which gives the normal equation Σ (yi − β0 − β̂1 xi) xi = 0, which is basically Σ ei xi = 0. You get only one normal equation because there is only one unknown parameter, and solving it gives β̂1 = Σ (yi − β0) xi / Σ xi². So this is the least squares estimator of β1. What I want to say is that here you do not need to differentiate with respect to β0, because β0 is known, so you do not need to estimate it. The next problem, problem (b), says: find the 100(1 − α) percent confidence interval for β1.
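The estimator just derived can be checked numerically. The data and the known intercept β0 = 2.0 below are made up for illustration:

```python
# Least squares slope when the intercept beta_0 is known:
#   beta_1_hat = sum((y_i - beta_0) * x_i) / sum(x_i^2)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 7.9, 11.2, 13.8, 17.1]    # roughly y = 2 + 3x plus noise (made up)
beta0 = 2.0                          # known intercept; it is NOT estimated

beta1_hat = sum((yi - beta0) * xi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
print(beta1_hat)                     # close to the true slope 3

# The single normal equation sum(e_i * x_i) = 0 holds at the minimum:
residuals = [yi - beta0 - beta1_hat * xi for xi, yi in zip(x, y)]
print(sum(ei * xi for ei, xi in zip(residuals, x)))  # ~0 up to rounding
```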
Note that it is a confidence interval for β1, not β̂1: find the 100(1 − α) percent confidence interval for β1. We start from the least squares estimator of β1, that is, β̂1 = Σ (yi − β0) xi / Σ xi²; this is a point estimator of β1. Now we check the expected value of β̂1. Substituting yi = β0 + β1 xi + εi, we get β̂1 = Σ (β1 xi + εi) xi / Σ xi², and taking expectations, the second term vanishes because E(εi) = 0, so E(β̂1) = β1 Σ xi² / Σ xi² = β1. So it is an unbiased estimator of β1, and you can also check that Var(β̂1) = σ² / Σ xi². From these two facts I can write that (β̂1 − β1) / √(MS_residual / Σ xi²), replacing σ² by MS_residual, follows the t distribution. Here I want to discuss the degrees of freedom a little: it is n − 1, because the degrees of freedom of SS_residual is n − 1. Why? SS_residual = Σ ei² for i = 1 to n, and you have the freedom of choosing n − 1 of the ei's; the last, n-th one has to be chosen in such a way that it satisfies the constraint Σ ei xi = 0. So here you have to note that there is only one constraint on the ei's; that is why you are losing only one degree of freedom, and thus the degrees of freedom of SS_residual is n − 1.
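Putting the pieces together, here is a sketch of the interval β̂1 ± t_{α/2, n−1} √(MS_residual / Σ xi²). The data, β0 = 2.0, and the table value t_{0.025, 4} = 2.776 (here n = 5, so n − 1 = 4) are made up for illustration:

```python
import math

# 95% CI for beta_1 when beta_0 is known, with n - 1 residual df.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.1, 7.9, 11.2, 13.8, 17.1]    # made-up data, roughly y = 2 + 3x
beta0 = 2.0                          # known intercept
n = len(x)

sxx = sum(xi * xi for xi in x)
beta1_hat = sum((yi - beta0) * xi for xi, yi in zip(x, y)) / sxx
ss_res = sum((yi - beta0 - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
ms_res = ss_res / (n - 1)            # only one constraint, so df = n - 1, not n - 2

t_crit = 2.776                       # table value t_{0.025, n-1}
half = t_crit * math.sqrt(ms_res / sxx)
print(beta1_hat - half, beta1_hat + half)
```

Note the single place where the known intercept changes the usual recipe: ms_res divides by n − 1 instead of n − 2.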
It is not n − 2 as in the usual simple linear regression model, where both β0 and β1 are unknown and the degrees of freedom is n − 2; here, since β0 is known, we differentiate with respect to β1 only and get one normal equation, which is nothing but Σ ei xi = 0. So there is only one constraint on the residuals, and that is why the degrees of freedom is n − 1. So now I can write the 100(1 − α) percent confidence interval for β1: it is β̂1 ± t_{α/2, n−1} √(MS_residual / Σ xi²). So, that is all. Thank you very much.