So, today we will be solving some problems on the simple linear regression model, the first topic we talked about. Here is one problem on the simple linear regression model: a study was made on the effect of temperature on the yield of a chemical process, and the following data were collected in coded form. Here x stands for the temperature and y is the yield of the chemical process, so y is the response variable and x is the regressor variable, and we want to fit a simple linear regression model. We are given 11 observations. The first question is quite straightforward: assuming a model y = beta0 + beta1 x + epsilon, what are the least-squares estimates of the regression coefficients beta0 and beta1, and what is the fitted equation? We solved similar problems while discussing the simple linear regression model. The second question is to construct the ANOVA table and test the hypothesis that beta1 = 0 at level of significance 0.05. The third asks for the confidence limits for beta1, and the fourth asks for the confidence limits for the true mean value of y when x = 3. Let me start with the first one. We are given observations (x_i, y_i) for i = 1 to 11, and we will fit a simple linear regression model using the least-squares technique: we want to fit y_i = beta0 + beta1 x_i + epsilon_i. We know that beta0 hat and beta1 hat are obtained by minimizing the least-squares function S = sum over i of (y_i - beta0 hat - beta1 hat x_i)^2, where y_i - beta0 hat - beta1 hat x_i is the i-th residual; minimizing this sum over i = 1 to 11 gives the least-squares estimates of the regression coefficients beta0 and beta1.
We know that beta1 hat = Sxy / Sxx (please refer to my first topic, simple linear regression). This can also be written as (sum x_i y_i - n x bar y bar) / (sum x_i^2 - n x bar^2), where n = 11 here. Given (x_i, y_i) for i = 1 to n, the best thing is to compute sum x_i, sum y_i, sum x_i^2, sum y_i^2 and the cross product sum x_i y_i; then you are done. For the given observations you can check that beta1 hat = 158/110 = 1.44, and beta0 hat = y bar - beta1 hat x bar = 102/11 = 9.27. So we are done with the first part: the fitted equation is y_i hat = 9.27 + 1.44 x_i. This is the fitted model for the given problem. The next part says: construct the ANOVA table and test the hypothesis H0: beta1 = 0 at level of significance 0.05. So now we construct the ANOVA table, with columns for source of variation, degrees of freedom, sum of squares, mean square, and the F statistic. The total sum of squares is SS_total = sum over i = 1 to 11 of (y_i - y bar)^2, and you can check that this equals 248.18. Now, what is its degree of freedom? It is 10, because the deviations y_i - y bar satisfy one constraint, namely sum of (y_i - y bar) = 0; there is one constraint, so one degree of freedom is lost. Next you can compute SS_regression = beta1 hat^2 Sxx; with beta1 hat = 158/110 and Sxx = 110, this equals 226.94. So the regression sum of squares is 226.94, and next we need its degree of freedom.
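The lecture's data table is not reproduced in the transcript. The coded dataset below is an assumption, chosen because it reproduces every summary statistic quoted here (x bar = 0, y bar = 102/11, Sxy = 158, Sxx = 110). With it, the least-squares computation can be sketched as:

```python
# Assumed coded dataset, reconstructed to match the quoted summary
# statistics (S_xy = 158, S_xx = 110, y_bar = 102/11); the actual
# table is not shown in the transcript.
x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# S_xy = sum(x_i y_i) - n x_bar y_bar,  S_xx = sum(x_i^2) - n x_bar^2
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
s_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2

beta1_hat = s_xy / s_xx                 # 158/110, about 1.44
beta0_hat = y_bar - beta1_hat * x_bar   # 102/11, about 9.27
```

Because the x values are centered (x bar = 0), beta0 hat reduces to y bar, which is why 102/11 appears directly as the intercept.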
See, the residual sum of squares can also be written as SS_residual = sum over i = 1 to 11 of e_i^2. This is another way to compute it: you have the fitted model, so you know y_i hat, and you have the original observations y_i, so you can compute e_i for i = 1 to 11. But you do not have the freedom of choosing all the e_i's independently, because there are two constraints on them. You can choose nine residuals freely, and the remaining two have to be chosen in such a way that those two restrictions are satisfied. So the residual degree of freedom is 9, while the regression degree of freedom is 1. The part of the variability that remains unexplained, SS_residual, is obtained as SS_total minus SS_regression, that is 248.18 - 226.94 = 21.24. The mean square residual is SS_residual divided by its degrees of freedom, 21.24/9 = 2.36, and the mean square regression is SS_regression divided by its degree of freedom, 226.94/1 = 226.94. So the F statistic, which is MS_regression divided by MS_residual, equals 96.17. Now we can test the hypothesis H0: beta1 = 0 against H1: beta1 not equal to 0 using this F statistic. The observed value is F = 96.17, with degrees of freedom (1, 9). You check the tabulated value F(0.05; 1, 9), which equals 5.12. The observed F is greater than the tabulated F, so the conclusion is that we reject H0; that is, beta1 = 0 is rejected, which means there is a linear relationship between y and x. The next problem asks for the confidence limits for beta1: we have a point estimate for beta1, and now we will find its confidence limits.
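The ANOVA quantities above can be checked numerically on the same assumed coded dataset (reconstructed to match the quoted statistics, since the transcript omits the table):

```python
# Assumed coded dataset consistent with the quoted summary statistics.
x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
beta1_hat = s_xy / s_xx

ss_total = sum((yi - y_bar) ** 2 for yi in y)  # df = n - 1 = 10
ss_reg = beta1_hat ** 2 * s_xx                 # df = 1
ss_res = ss_total - ss_reg                     # df = n - 2 = 9

ms_reg = ss_reg / 1
ms_res = ss_res / (n - 2)
f_stat = ms_reg / ms_res  # compare with F(0.05; 1, 9) = 5.12
```

The observed F of about 96.2 far exceeds 5.12, matching the conclusion that H0: beta1 = 0 is rejected.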
Before doing that, maybe you know it already, but I want to mention another way to test the hypothesis H0: beta1 = 0 against H1: beta1 not equal to 0. It can also be tested using the t statistic t = beta1 hat / sqrt(MS_residual / Sxx). I hope you know all this: under the null hypothesis that beta1 = 0, this statistic follows a t distribution with n - 2 degrees of freedom, here 9. You can check the value: beta1 hat is 1.44, MS_residual is 2.36 and Sxx is 110, so sqrt(MS_residual / Sxx) is about 0.146 and the t value comes out as 9.83. Now look at the tabulated value of t: t(0.05; 9) = 1.833. Of course, you will get the same result whether you use the F statistic or the t statistic for testing the hypothesis; in fact, F = t^2 under the null hypothesis. Here again the observed value is greater than the tabulated value, so we reject H0: beta1 = 0. So that is one way. Next we go to the third problem: what are the confidence limits for beta1 at the 0.05 level of significance? What we know is that beta1 hat is a linear combination of the y_i's, and the y_i's follow a normal distribution; a linear combination of normals is again normal. So beta1 hat follows a normal distribution with mean beta1 and variance sigma^2 / Sxx, and I can write (beta1 hat - beta1) / sqrt(sigma^2 / Sxx) ~ N(0, 1). But what happens is that the population variance sigma^2 is usually unknown.
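A quick numerical check of the t statistic and of the identity F = t^2, again on the assumed reconstructed dataset (the transcript's 9.83 comes from the rounded inputs 1.44 and 0.146; unrounded values give about 9.81):

```python
import math

# Same assumed coded dataset as before.
x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2
beta1_hat = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / s_xx

ss_res = sum((yi - y_bar) ** 2 for yi in y) - beta1_hat ** 2 * s_xx
ms_res = ss_res / (n - 2)

se = math.sqrt(ms_res / s_xx)  # standard error of beta1_hat, about 0.146
t_stat = beta1_hat / se        # about 9.81
f_stat = (beta1_hat ** 2 * s_xx) / ms_res  # equals t_stat squared
```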
So we need to estimate it, and we estimate it by MS_residual. Once you replace sigma^2 by MS_residual, the statistic follows a t distribution: (beta1 hat - beta1) / sqrt(MS_residual / Sxx) ~ t with n - 2 degrees of freedom. From here we can say that this quantity lies between -t(alpha/2; n-2) and t(alpha/2; n-2) with probability 1 - alpha, that is 0.95, and from here we get the 95 percent confidence limits for beta1. Rearranging, beta1 is at most beta1 hat + t(alpha/2; n-2) sqrt(MS_residual / Sxx) and at least beta1 hat - t(alpha/2; n-2) sqrt(MS_residual / Sxx), again with probability 1 - alpha; these are the lower and upper limits for beta1. Now we know everything: beta1 hat is 1.44, the tabulated value is t(0.025; 9) = 2.262 because alpha is 0.05, and the standard error of beta1 hat is 0.146. So the upper limit is 1.44 + 2.262 * 0.146 = 1.77 and the lower limit is 1.44 - 2.262 * 0.146 = 1.11. Till now, these problems we have already discussed in the module. The fourth problem asks: what are the confidence limits for the true mean value of y when x = 3?
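The confidence-limit arithmetic for beta1 can be sketched as follows (the critical value t(0.025; 9) = 2.262 is taken from tables; the dataset is the assumed reconstruction used earlier):

```python
import math

# Same assumed coded dataset.
x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2
beta1_hat = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / s_xx
ms_res = (sum((yi - y_bar) ** 2 for yi in y) - beta1_hat ** 2 * s_xx) / (n - 2)

t_crit = 2.262  # t(0.025; 9) from tables
half = t_crit * math.sqrt(ms_res / s_xx)
lower, upper = beta1_hat - half, beta1_hat + half  # about (1.11, 1.77)
```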
What does this mean? We have to find the confidence limits for the true mean, that is, the expected value of the response variable at x = 3. In the first module on the simple linear regression model we denoted this by E(y | x = 3); maybe we should not use this notation, because x is not a random variable, but both mean the same thing, that is what I want to say here. So we have to find a confidence interval for this mean value at x = 3. How to do that? Instead of x = 3, let me say we are looking for a 95 percent confidence interval for the mean value of y at x = x0; x0 is nothing but 3, which I will plug in at the end, so let me solve in general for x0. The mean is nothing but beta0 + beta1 x0, because we consider the model y = beta0 + beta1 x + epsilon and the expectation of epsilon is 0. First we find an unbiased estimator of this quantity at x = x0. An unbiased estimator of beta0 + beta1 x0 is beta0 hat + beta1 hat x0, because beta0 hat is an unbiased estimator of beta0 and beta1 hat is an unbiased estimator of beta1. So beta0 hat + beta1 hat x0 is a point estimator of the expected mean at x = x0, and we are looking for a confidence interval around it. This estimator follows a normal distribution with mean beta0 + beta1 x0, and you can check that its variance is sigma^2 (1/n + (x0 - x bar)^2 / Sxx).
So, of course, this estimator minus its mean, divided by the square root of this variance, follows N(0, 1), and if you replace sigma^2 by MS_residual, it follows a t distribution. Let me write that: (beta0 hat + beta1 hat x0 - (beta0 + beta1 x0)) / sqrt(MS_residual (1/n + (x0 - x bar)^2 / Sxx)) follows a t distribution with n - 2 degrees of freedom. Call this whole quantity A; then A lies between -t(alpha/2; n-2) and t(alpha/2; n-2) with probability 1 - alpha. From here we get a confidence interval for beta0 + beta1 x0, the mean response at the point x = x0. The upper limit is beta0 hat + beta1 hat x0 + t(alpha/2; n-2) sqrt(MS_residual (1/n + (x0 - x bar)^2 / Sxx)), and similarly the lower limit is beta0 hat + beta1 hat x0 - t(alpha/2; n-2) sqrt(MS_residual (1/n + (x0 - x bar)^2 / Sxx)). So you are done; you know everything you need. Let me do the upper limit: beta0 hat is 9.27, n = 11 so there are 9 degrees of freedom, alpha is 0.05 so alpha/2 is 0.025 and the t value is 2.262, MS_residual is 2.36, and with x0 = 3 and x bar = 0 the factor is 1/11 + 3^2 / Sxx, where Sxx is 110.
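Putting the pieces of the mean-response interval together numerically (same assumed reconstructed dataset; the small differences from the quoted 12.15 come from rounding intermediate values in the lecture):

```python
import math

# Same assumed coded dataset.
x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2
beta1_hat = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
ms_res = (sum((yi - y_bar) ** 2 for yi in y) - beta1_hat ** 2 * s_xx) / (n - 2)

x0 = 3
y_hat0 = beta0_hat + beta1_hat * x0  # point estimate of E(y) at x0
t_crit = 2.262                       # t(0.025; 9)
half = t_crit * math.sqrt(ms_res * (1 / n + (x0 - x_bar) ** 2 / s_xx))
lower, upper = y_hat0 - half, y_hat0 + half  # about (12.14, 15.03)
```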
So this gives the upper bound for beta0 + beta1 x0. The point estimate is 9.27 (that is beta0 hat) plus beta1 hat times x0, that is 1.44 times 3, and you can check that the upper limit comes out as 15.03 while the lower limit is 12.15. So we found the confidence limits for the mean response at the point x = 3. Now, this next one might be a little new; we did not try this before. It asks: what are the confidence limits, at the 0.05 level of significance, for the difference between the true mean value of y when x1 = 3 and the true mean value of y when x2 = -2? That is, the difference between E(y) at x1 = 3 and E(y) at x2 = -2. The problem asks not for the difference itself but for its confidence limits. If I call the first one z1 and the second one z2, we want the confidence limits for z1 - z2. We just saw what the unbiased estimator for each is: the unbiased estimator of z1 is z1 hat = beta0 hat + beta1 hat * 3, and the unbiased estimator of z2 is z2 hat = beta0 hat + beta1 hat * (-2). Thus the unbiased estimator of z1 - z2 is z1 hat - z2 hat = (beta0 hat + 3 beta1 hat) - (beta0 hat - 2 beta1 hat), which is nothing but 5 beta1 hat, and since beta1 hat = 1.44, this is 5 * 1.44 = 7.20. So we found a point estimate for z1 - z2, and it is 7.20. Now we have to find the confidence interval for z1 - z2.
What we have to do is find the distribution of z1 hat - z2 hat, and for that we need to compute its variance. The variance of z1 hat - z2 hat is the variance of 5 beta1 hat, because z1 hat - z2 hat = 5 beta1 hat, as we just saw. This equals 25 times the variance of beta1 hat, which is 25 sigma^2 / Sxx = 25 sigma^2 / 110. So z1 hat - z2 hat, which is an unbiased estimator of z1 - z2, follows a normal distribution with mean z1 - z2 and variance 25 sigma^2 / 110. Then, as before, this estimator minus its mean, divided by the square root of its variance, follows a standard normal, and if you replace sigma^2 by MS_residual it follows a t distribution: (z1 hat - z2 hat - (z1 - z2)) / sqrt(25 MS_residual / 110) ~ t with n - 2 degrees of freedom. From here I can write that the upper limit for z1 - z2 is z1 hat - z2 hat + t(alpha/2; 9) sqrt(25 * 2.36 / 110), and similarly the lower limit is z1 hat - z2 hat - t(alpha/2; 9) sqrt(25 * 2.36 / 110). We know the point estimate is 7.20, so finally the upper confidence limit for z1 - z2 is 8.86 and the lower limit is 5.54. This is how we find the confidence interval for the mean difference at two different points. Well, the first problem was quite easy, and this sort of problem you have already solved in the first module, or the first topic.
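The same numerical check for the mean-difference interval, again on the assumed reconstructed dataset (the quoted 5.54 and 8.86 come from using the rounded values 7.20 and 1.66):

```python
import math

# Same assumed coded dataset.
x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum(xi ** 2 for xi in x) - n * x_bar ** 2
beta1_hat = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / s_xx
ms_res = (sum((yi - y_bar) ** 2 for yi in y) - beta1_hat ** 2 * s_xx) / (n - 2)

# E(y) at x1 = 3 minus E(y) at x2 = -2 is (x1 - x2) beta1 = 5 beta1,
# so the point estimate is 5 beta1_hat with variance 25 sigma^2 / s_xx.
diff_hat = 5 * beta1_hat
t_crit = 2.262  # t(0.025; 9)
half = t_crit * math.sqrt(25 * ms_res / s_xx)
lower, upper = diff_hat - half, diff_hat + half  # about (5.53, 8.84)
```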
Now we will go to the second problem, and here I recommend that you do not look at the solution first: try it independently, and if you can solve it on your own, that means you have understood the material. Here is the problem: consider the simple linear regression model y = beta0 + beta1 x + epsilon, where the intercept is known. This is something new: for this linear model beta0 is already known. The first part is to find the least-squares estimator of the slope beta1 for this model. The second part is: what is the variance of the slope estimator beta1 hat found in part 1? So you find the least-squares estimator of beta1, say beta1 hat, and then find its variance. The final part asks you to find the confidence interval for beta1, and whether this interval is narrower than the one for the case where both slope and intercept are unknown. Try to solve it independently and then see my solution. Here is the solution to the first part. You are given the model y_i = beta0 + beta1 x_i + epsilon_i, where the epsilon_i satisfy all the usual assumptions: normal with mean 0 and variance sigma^2. The only difference is that beta0 is known, so we only have to find the least-squares estimator of beta1. How do we find that? We compute the least-squares function S = sum of e_i^2, where e_i^2 = (y_i - y_i hat)^2 = (y_i - beta0 - beta1 hat x_i)^2. See, here I did not put a hat on beta0, because it is a known parameter that we do not need to estimate; we have to estimate only beta1. This is the least-squares function, and we find beta1 hat so that it is minimized.
We differentiate this least-squares function with respect to beta1 and set it to 0, which gives sum of (y_i - beta0 - beta1 hat x_i) x_i = 0, and from here sum of (y_i - beta0) x_i = beta1 hat sum of x_i^2. This implies that the least-squares estimator is beta1 hat = sum of (y_i - beta0) x_i / sum of x_i^2. So we are done with the first part: this is the least-squares estimator of beta1 when beta0 is known. The second part is to find the variance of beta1 hat. We just found beta1 hat = sum of (y_i - beta0) x_i / sum of x_i^2, and the only random variables here are the y_i's. So the variance is Var(sum of (y_i - beta0) x_i) / (sum of x_i^2)^2. Since the y_i's are independent, this equals sum over i = 1 to n of x_i^2 Var(y_i), divided by (sum of x_i^2)^2, where Var(y_i - beta0) is nothing but Var(y_i) because beta0 is a constant. We know that Var(y_i) = sigma^2, so putting that in, the variance is sigma^2 sum of x_i^2 / (sum of x_i^2)^2 = sigma^2 / sum of x_i^2. That is the variance of beta1 hat. The third part was to find a confidence interval for beta1. We know the variance of beta1 hat; let me first check whether beta1 hat is unbiased, by finding its expectation.
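The known-intercept slope formula derived above can be sketched on a small hypothetical dataset (the numbers below are illustrative only, not from the lecture):

```python
# Known-intercept least squares:
#   beta1_hat = sum((y_i - beta0) * x_i) / sum(x_i^2)
beta0 = 2.0                     # intercept assumed known
x = [1, 2, 3, 4, 5]             # hypothetical regressor values
y = [3.9, 6.1, 8.0, 9.9, 12.1]  # roughly y = 2 + 2x plus small noise

beta1_hat = (sum((yi - beta0) * xi for xi, yi in zip(x, y))
             / sum(xi ** 2 for xi in x))
# beta1_hat recovers a slope close to the true value 2
```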
The expectation of beta1 hat is E(sum of (y_i - beta0) x_i) / sum of x_i^2. Let me make it clear: you can bring the expectation inside. From the model, y_i = beta0 + beta1 x_i + epsilon_i, so E(y_i - beta0) = E(beta1 x_i + epsilon_i) = beta1 x_i, because E(epsilon_i) = 0. So the numerator is beta1 sum of x_i^2, and dividing by sum of x_i^2 gives beta1. What we found is that E(beta1 hat) = beta1, so beta1 hat is unbiased, and we also know its variance. Therefore beta1 hat follows a normal distribution with mean beta1 and variance sigma^2 / sum of x_i^2. Then, by the usual technique, this estimator minus its mean, divided by the square root of the variance, follows a standard normal, and if you replace sigma^2 by MS_residual, then (beta1 hat - beta1) / sqrt(MS_residual / sum of x_i^2) follows a t distribution with n - 1 degrees of freedom here. This is the residual degree of freedom, and you should understand why: the model is y_i = beta0 + beta1 x_i + epsilon_i, and while minimizing the least-squares function S = sum of e_i^2 we differentiated with respect to beta1 only. So there is only one restriction on the e_i's, and you have the freedom of choosing n - 1 of them independently; the last one has to be chosen in such a way that the restriction is satisfied. That is why SS_residual here has n - 1 degrees of freedom, not n - 2.
So that is one point. From here you can check that (beta1 hat - beta1) / sqrt(MS_residual / sum of x_i^2) lies between -t(alpha/2; n-1) and t(alpha/2; n-1) with probability 1 - alpha. Finally, we have the interval for beta1: the upper bound is beta1 hat + t(alpha/2; n-1) sqrt(MS_residual / sum of x_i^2) and the lower bound is beta1 hat - t(alpha/2; n-1) sqrt(MS_residual / sum of x_i^2). Now, what happens in the usual case, when both beta0 and beta1 are unknown? There the confidence interval for beta1 has upper limit beta1 hat + t(alpha/2; n-2) sqrt(MS_residual / Sxx) and lower limit beta1 hat - t(alpha/2; n-2) sqrt(MS_residual / Sxx). The question was whether the known-intercept interval is narrower than the one for the case where both slope and intercept are unknown. How to check that? See, Sxx = sum of x_i^2 - n x bar^2, which implies that Sxx is smaller than sum of x_i^2; that means sqrt(MS_residual / sum of x_i^2) is less than or equal to sqrt(MS_residual / Sxx). And again, from the t table you can check that t(alpha/2; n-2) is larger than t(alpha/2; n-1), because the critical value decreases as the degrees of freedom increase.
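The narrowness argument can be verified numerically: for any data with x bar not equal to 0, Sxx < sum of x_i^2, and the t critical value with n - 1 degrees of freedom is smaller than with n - 2. A sketch with hypothetical x values and a nominal MS_residual held fixed for the comparison, as in the argument above (table values t(0.025; 4) = 2.776 and t(0.025; 3) = 3.182 for n = 5):

```python
import math

x = [1, 2, 3, 4, 5]  # hypothetical regressor values, x_bar != 0
n = len(x)
x_bar = sum(x) / n
sum_x2 = sum(xi ** 2 for xi in x)  # 55
s_xx = sum_x2 - n * x_bar ** 2     # 10, necessarily <= sum_x2

ms_res = 1.0  # nominal residual mean square, held fixed for the comparison
half_known = 2.776 * math.sqrt(ms_res / sum_x2)   # t(0.025; n-1), intercept known
half_unknown = 3.182 * math.sqrt(ms_res / s_xx)   # t(0.025; n-2), both unknown
# half_known < half_unknown: the known-intercept interval is narrower
```

In practice the two fits would have slightly different residual mean squares; holding MS_residual fixed isolates the two effects the lecture compares, the smaller denominator Sxx and the larger critical value.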
So both factors in the half-width are larger in the unknown-intercept case, and therefore, of course, the known-intercept interval is narrower. The final answer is yes: this interval is narrower than the interval for the case where both beta0 and beta1 are unknown. So now we have to stop. In the next class we will be talking about some more problems on regression. Thank you.