the fifth lecture on simple linear regression. Today we will talk about the confidence interval for the regression coefficient $\beta_1$, then about interval estimation of the mean response, and finally about prediction of new observations for a given value of the regressor variable, $x = x_0$.

Before I start on interval estimation for $\beta_1$, I want to recall the important quantity I talked about in the last class, the coefficient of determination. It is denoted by $R^2$ and defined by $R^2 = SS_{\text{reg}} / SS_T$. This $R^2$ measures the proportion of variability in the response variable that is explained by the model, that is, by the regressor variable. For example, in the Disney toy example $SS_{\text{reg}}$ was 4.9 and $SS_T$ is 6, so $R^2 = 4.9 / 6 = 0.82$. The meaning is that 82 percent of the total variability in the sales amount is explained by the amount of money spent on advertisement.

Well, we know that the range of $R^2$ is from 0 to 1. We discussed the case $R^2 = 1$. Let me now consider the case $R^2 = 0$. $R^2$ can also be written as $1 - SS_{\text{res}} / SS_T$, so it is equal to 0 when the ratio $SS_{\text{res}} / SS_T$ is equal to 1, that is, when $SS_{\text{res}} = SS_T$. So, what is $SS_{\text{res}}$? It is $SS_{\text{res}} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$, while $SS_T = \sum_{i=1}^{n} (y_i - \bar{y})^2$. So, these two quantities are equal when $\hat{y}_i = \bar{y}$ for every $i$.
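The two equivalent forms of $R^2$ can be checked numerically. The sketch below uses only the sums of squares quoted for the Disney toy example and assumes the usual decomposition $SS_T = SS_{\text{reg}} + SS_{\text{res}}$; the function names are illustrative, not from the lecture.

```python
# Sums of squares quoted in the lecture for the Disney toy example.
SS_REG = 4.9
SS_T = 6.0
# Assumption: the standard decomposition SS_T = SS_reg + SS_res.
SS_RES = SS_T - SS_REG


def r_squared_from_regression(ss_reg, ss_t):
    """R^2 as the share of total variability explained by the model."""
    return ss_reg / ss_t


def r_squared_from_residual(ss_res, ss_t):
    """Equivalent form: one minus the unexplained share."""
    return 1.0 - ss_res / ss_t


r2_a = r_squared_from_regression(SS_REG, SS_T)
r2_b = r_squared_from_residual(SS_RES, SS_T)
print(round(r2_a, 2), round(r2_b, 2))

# Boundary case: when SS_res equals SS_T, nothing is explained and R^2 is 0.
print(r_squared_from_residual(SS_T, SS_T))
```

Both forms give about 0.82 for the toy example, and the second form makes the boundary case $SS_{\text{res}} = SS_T$ easy to see.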
So, basically, if the fitted model is $\hat{y} = \bar{y}$, then $R^2$ is equal to 0. That means the fitted model does not depend on the regressor variable. Suppose we have a set of observations $(x_i, y_i)$ and the fitted model is $\hat{y} = \bar{y}$. The significance of this is that there is no linear relationship between the response variable and the regressor variable. In another way, you can think of it as the case $\hat{\beta}_1 = 0$: we have $R^2 = SS_{\text{reg}} / SS_T$ and we know that $SS_{\text{reg}} = \hat{\beta}_1^2 S_{xx}$, so $R^2 = \hat{\beta}_1^2 S_{xx} / SS_T$, and this is equal to 0 when $\hat{\beta}_1 = 0$.

Well, next we move to the confidence interval for $\beta_1$. Our model is $y = \beta_0 + \beta_1 x + \varepsilon$. Here $\beta_0$ is called the intercept and $\beta_1$ is the slope, and we know that the least squares estimator of $\beta_1$ is $\hat{\beta}_1 = S_{xy} / S_{xx}$. More precisely, this is called a point estimate of $\beta_1$. The concept of interval estimation is that, instead of giving a point estimate of some population parameter, we give an interval such that the population parameter lies in that interval with high probability, say 0.95 or 0.99. The technique is this: first you find a point estimator of the population parameter, and then you find the sampling distribution of that point estimator.
Well, here $\hat{\beta}_1 = S_{xy} / S_{xx}$, and we need to find the sampling distribution of $\hat{\beta}_1$. We know that $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$, that is, $E(\hat{\beta}_1) = \beta_1$, and we also know that $\operatorname{Var}(\hat{\beta}_1) = \sigma^2 / S_{xx}$. We also proved in a previous lecture that $\hat{\beta}_1$ is a linear combination of the random variables $y_i$, and we have assumed that the $y_i$ are independent and normally distributed. Any linear combination of independent normal variables also follows a normal distribution. So, $\hat{\beta}_1$ follows a normal distribution with mean $\beta_1$ and variance $\sigma^2 / S_{xx}$, and from here we can say that

$$\frac{\hat{\beta}_1 - \beta_1}{\sqrt{\sigma^2 / S_{xx}}} \sim N(0, 1).$$

But in most cases $\sigma^2$ is not known, and then we replace $\sigma^2$ by its unbiased estimator, $MS_{\text{res}}$. If we do that, the random variable

$$T = \frac{\hat{\beta}_1 - \beta_1}{\sqrt{MS_{\text{res}} / S_{xx}}}$$

follows a t distribution with $n - 2$ degrees of freedom.

Now, we need a confidence interval for $\beta_1$. Think of the density of this t distribution and take two points on it. The first point is $t_{\alpha/2,\, n-2}$, and its meaning is that the area to the right of it, $P(T > t_{\alpha/2,\, n-2})$, is $\alpha/2$. The second point is $-t_{\alpha/2,\, n-2}$, and by symmetry the area to the left of it is also $\alpha/2$. So, $T$ lies between these two points with probability $1 - \alpha$:

$$P\left(-t_{\alpha/2,\, n-2} \le \frac{\hat{\beta}_1 - \beta_1}{\sqrt{MS_{\text{res}} / S_{xx}}} \le t_{\alpha/2,\, n-2}\right) = 1 - \alpha.$$
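A rough way to convince oneself of this probability statement is a small simulation. The sketch below is only an illustration: the design points, the true parameter values and the error standard deviation are assumptions chosen for the demo, not data from the lecture, and the critical value 3.182 is the tabulated $t_{0.025,\,3}$.

```python
import math
import random


def t_statistic(x, y, beta1_true):
    """T = (b1 - beta1) / sqrt(MS_res / S_xx) for a simple linear fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ms_res = ss_res / (n - 2)
    return (b1 - beta1_true) / math.sqrt(ms_res / s_xx)


def coverage(reps=20000, seed=0):
    """Fraction of simulated samples with |T| <= t_{0.025, 3}."""
    rng = random.Random(seed)
    x = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical design, n = 5
    beta0, beta1, sigma = -0.1, 0.7, 0.6  # hypothetical true values
    t_crit = 3.182                        # tabulated t_{0.025, 3}
    hits = 0
    for _ in range(reps):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        if abs(t_statistic(x, y, beta1)) <= t_crit:
            hits += 1
    return hits / reps


print(coverage())
```

With normally distributed errors the printed fraction should come out close to 0.95, which is the $1 - \alpha$ in the statement above for $\alpha = 0.05$.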
So, to make this probability high we have to choose $\alpha$ accordingly. For example, if we want this probability to be 0.95, then we choose $\alpha = 0.05$. From here we get the confidence interval for $\beta_1$ by simple algebra. The $100(1 - \alpha)$ percent confidence interval for $\beta_1$, which is a 95 percent confidence interval if we choose $\alpha = 0.05$, is

$$\hat{\beta}_1 - t_{\alpha/2,\, n-2} \sqrt{\frac{MS_{\text{res}}}{S_{xx}}} \;\le\; \beta_1 \;\le\; \hat{\beta}_1 + t_{\alpha/2,\, n-2} \sqrt{\frac{MS_{\text{res}}}{S_{xx}}}.$$

In other words, we can say that the population parameter $\beta_1$, which is the slope of the simple linear regression model, will lie in this interval with probability 0.95.

Let me explain this with the Disney toy example. The upper bound for $\beta_1$ is $\hat{\beta}_1 + t_{\alpha/2,\, n-2} \sqrt{MS_{\text{res}} / S_{xx}}$ and the lower bound is $\hat{\beta}_1 - t_{\alpha/2,\, n-2} \sqrt{MS_{\text{res}} / S_{xx}}$. For the Disney toy example $\hat{\beta}_1 = 0.7$, and we have 5 data points, so $n - 2 = 3$. We choose $\alpha = 0.05$ to make the probability 0.95, and from the statistical table $t_{0.025,\, 3} = 3.182$. So, we only need to compute $\sqrt{MS_{\text{res}} / S_{xx}}$. Here $MS_{\text{res}} = 0.367$ and $S_{xx} = 10$, so $\sqrt{0.367 / 10} \approx 0.19$ and $3.182 \times 0.19 \approx 0.61$. It is now not difficult to check that $\beta_1$ will lie in the interval from 0.1 to 1.3 with probability 0.95.
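The arithmetic for the Disney toy example can be reproduced in a few lines. This is a minimal sketch using only the summary values quoted above; the function name is illustrative, and the critical value is typed in from the table rather than computed.

```python
import math


def slope_confidence_interval(beta1_hat, ms_res, s_xx, t_crit):
    """Return (lower, upper) for beta1: beta1_hat -/+ t * sqrt(MS_res / S_xx)."""
    standard_error = math.sqrt(ms_res / s_xx)
    half_width = t_crit * standard_error
    return beta1_hat - half_width, beta1_hat + half_width


# Values quoted in the lecture: beta1_hat = 0.7, MS_res = 0.367,
# S_xx = 10, and t_{0.025, 3} = 3.182 from the statistical table.
lower, upper = slope_confidence_interval(0.7, 0.367, 10.0, 3.182)
print(round(lower, 2), round(upper, 2))  # prints 0.09 1.31
```

Rounded to one decimal place this is the interval from 0.1 to 1.3 stated in the lecture.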
So, this is what interval estimation is. Instead of giving one estimate of a population parameter, here we give an interval, and the use of this interval is that we can say the population parameter will lie in it with high probability, here 0.95.

Next we move to interval estimation of the mean response, that is, the expected response $E(y \mid x = x_0)$ for given $x = x_0$. Once you have the fitted model, one important application of the regression model is to estimate the expected response for a given value of the regressor variable. Another important application is the prediction of a new observation corresponding to a given value of the regressor variable $x$. First we will talk about interval estimation of the expected response, or mean response, for a given value of the regressor variable.

The notation $E(y \mid x = x_0)$ looks like a conditional expectation, but basically what I mean by it is that I want to estimate the expected response value at the point $x_0$ of the regressor variable. Recall the simple linear regression model, $y = \beta_0 + \beta_1 x + \varepsilon$. Since the error term has expectation 0, the expected response at the point $x = x_0$ is $E(y \mid x_0) = \beta_0 + \beta_1 x_0$. So, we want an estimator of the quantity $\beta_0 + \beta_1 x_0$, and we are not looking only for a point estimate of this expected response, we are looking for an interval estimate of it at the point $x = x_0$.
Well, again we have to start from a point estimator of this expected response. An unbiased estimator of $E(y \mid x_0)$ is

$$\widehat{E(y \mid x_0)} = \hat{\beta}_0 + \hat{\beta}_1 x_0.$$

It is very easy to prove that this is an unbiased estimator, because $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased estimators of $\beta_0$ and $\beta_1$ respectively. Now, we need to find the sampling distribution of this random variable. To get that I need the variance of the estimator:

$$\operatorname{Var}(\hat{\beta}_0 + \hat{\beta}_1 x_0) = \operatorname{Var}\big(\bar{y} + \hat{\beta}_1 (x_0 - \bar{x})\big).$$

What I did here is that I replaced $\hat{\beta}_0$ by $\bar{y} - \hat{\beta}_1 \bar{x}$. Now, the variance of this is

$$\operatorname{Var}(\bar{y}) + \operatorname{Var}\big(\hat{\beta}_1 (x_0 - \bar{x})\big) + 2 \operatorname{Cov}\big(\bar{y},\, \hat{\beta}_1 (x_0 - \bar{x})\big),$$

and it is not difficult to prove that this covariance term is equal to 0. So, the variance of $\hat{\beta}_0 + \hat{\beta}_1 x_0$ is $\operatorname{Var}(\bar{y})$, which is $\sigma^2 / n$, plus the variance of the second quantity, which is $(x_0 - \bar{x})^2 \sigma^2 / S_{xx}$.
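The claim that the covariance between $\bar{y}$ and $\hat{\beta}_1$ vanishes can be sketched as follows. This is one standard argument, assuming the $y_i$ are independent with common variance $\sigma^2$; the weights $c_i$ are notation introduced here for the sketch.

```latex
\begin{aligned}
\hat{\beta}_1 &= \frac{S_{xy}}{S_{xx}} = \sum_{i=1}^{n} c_i y_i,
\qquad c_i = \frac{x_i - \bar{x}}{S_{xx}},
\qquad \sum_{i=1}^{n} c_i = 0, \\[4pt]
\operatorname{Cov}(\bar{y}, \hat{\beta}_1)
&= \operatorname{Cov}\!\Big(\frac{1}{n}\sum_{i=1}^{n} y_i,\; \sum_{j=1}^{n} c_j y_j\Big)
 = \frac{1}{n}\sum_{i=1}^{n} c_i \operatorname{Var}(y_i)
 = \frac{\sigma^2}{n}\sum_{i=1}^{n} c_i = 0 .
\end{aligned}
```

Only the terms with $i = j$ survive because of independence, and the remaining sum is zero because the deviations $x_i - \bar{x}$ sum to zero.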
Well, so finally the variance of this quantity is

$$\operatorname{Var}(\hat{\beta}_0 + \hat{\beta}_1 x_0) = \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right).$$

Again the same argument applies: $\hat{\beta}_0$ is a linear combination of the $y_i$, and $\hat{\beta}_1$ is also a linear combination of the $y_i$. Since the $y_i$ follow a normal distribution, we can say that $\hat{\beta}_0 + \hat{\beta}_1 x_0$, which is a linear combination of these random variables, also follows a normal distribution. So, the estimator $\widehat{E(y \mid x_0)}$ follows a normal distribution with mean $\beta_0 + \beta_1 x_0$ and variance $\sigma^2 \left( 1/n + (x_0 - \bar{x})^2 / S_{xx} \right)$.

Now, $\sigma^2$ is not known, so we replace $\sigma^2$ by $MS_{\text{res}}$. What we get finally is that

$$\frac{\widehat{E(y \mid x_0)} - E(y \mid x_0)}{\sqrt{MS_{\text{res}} \left( \dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{S_{xx}} \right)}}$$

follows a t distribution with $n - 2$ degrees of freedom. So, we want to find a confidence interval for this expected response at the point $x = x_0$.
Now, we have an estimator for this quantity and we have the sampling distribution of the estimator, and from here we get the confidence interval for the expected response. The $100(1 - \alpha)$ percent confidence interval on the expected response at the point $x = x_0$ is

$$\widehat{E(y \mid x_0)} - t_{\alpha/2,\, n-2} \sqrt{MS_{\text{res}} \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)} \;\le\; E(y \mid x_0) \;\le\; \widehat{E(y \mid x_0)} + t_{\alpha/2,\, n-2} \sqrt{MS_{\text{res}} \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}.$$

The width of this confidence interval is minimum at $x_0 = \bar{x}$, and it widens as the absolute value $|x_0 - \bar{x}|$ increases.

Well, this looks a bit abstract, so let me give one example. Consider again the Disney toy example, and estimate the mean sales amount when the advertisement cost is, say, 4 dollars, at the 0.05 level. Let me compute the upper bound. The upper bound is

$$\widehat{E(y \mid x_0)} + t_{\alpha/2,\, n-2} \sqrt{MS_{\text{res}} \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}.$$
The quantity $\widehat{E(y \mid x_0)}$ is nothing but $\hat{\beta}_0 + \hat{\beta}_1 x_0$. We know its value: $\hat{\beta}_0$ is $-0.1$, $\hat{\beta}_1$ is 0.7 and $x_0$ is 4, so this is equal to $-0.1 + 0.7 \times 4 = 2.7$. Since $n = 5$ in the Disney toy example, the critical value is $t_{0.025,\, 3}$, which is equal to 3.182. Now, we need to compute the term under the square root. We know $MS_{\text{res}}$ is 0.367, $n$ is 5, $x_0$ is given as 4, and if you look at the Disney toy data you can check that $\bar{x}$ is equal to 3. So, the bracket is $1/5 + (4 - 3)^2 / 10 = 0.3$, the term is $0.367 \times 0.3 \approx 0.11$, and its square root is about 0.33.

So, the upper bound for the expected response at the point 4 is $2.7 + 3.182 \times 0.33 = 3.75$, and the lower bound is obtained by just replacing the plus sign by minus, which gives $2.7 - 3.182 \times 0.33 = 1.65$. So, when the cost on advertisement is equal to 4, the expected response will lie in the interval from 1.65 to 3.75 with probability 0.95. If you go back to the original data, you will see that the actual response value corresponding to $x = 4$ is equal to 2.

So, this is how we give a confidence interval for a population parameter. Here the population parameter is $\beta_0 + \beta_1 x_0$, which is the expected response for a given value $x = x_0$, and we have given a 95 percent confidence interval for it.
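The same calculation can be sketched in code, which also makes it easy to see how the interval behaves as $x_0$ moves away from $\bar{x}$. Only the summary values quoted in the lecture are used; the constant and function names are illustrative assumptions.

```python
import math

# Summary values quoted in the lecture for the Disney toy example.
N = 5
X_BAR = 3.0
S_XX = 10.0
MS_RES = 0.367
BETA0_HAT = -0.1
BETA1_HAT = 0.7
T_CRIT = 3.182  # tabulated t_{0.025, 3}


def mean_response_interval(x0):
    """95% confidence interval for E(y | x0) = beta0 + beta1 * x0."""
    y_hat = BETA0_HAT + BETA1_HAT * x0
    standard_error = math.sqrt(MS_RES * (1.0 / N + (x0 - X_BAR) ** 2 / S_XX))
    half_width = T_CRIT * standard_error
    return y_hat - half_width, y_hat + half_width


lower, upper = mean_response_interval(4.0)
print(f"x0 = 4.0: ({lower:.2f}, {upper:.2f})")

# The width grows as |x0 - x_bar| grows and is smallest at x0 = x_bar.
for x0 in (3.0, 4.0, 5.0):
    lo, hi = mean_response_interval(x0)
    print(f"x0 = {x0}: width = {hi - lo:.2f}")
```

Carrying more digits gives roughly 1.64 to 3.76 at $x_0 = 4$, which agrees with the 1.65 to 3.75 of the hand calculation up to rounding of the standard error to 0.33.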
Well, another important application of the regression model is to predict a new observation. This one is a little different, because there is a slight difference between estimating the expected response and what I am going to do now, so let me explain. We want to predict a new observation, say $y_0$, corresponding to a specific value of the regressor, $x = x_0$. The difference between the previous problem and this one is that in the previous problem we wanted to estimate the expected response, and here we want to predict the observation itself at the point $x = x_0$.

So, $y_0$ is nothing but a new observation. Given the data, we have fitted the model, and now using that fitted model we want to predict the response value at a new point. We want to predict $y_0 = \beta_0 + \beta_1 x_0 + \varepsilon$, whereas in the previous problem we wanted to estimate the expected response at $x = x_0$, which is $\beta_0 + \beta_1 x_0$.

Now again, if $x = x_0$, then the point predictor of $y_0$ is $\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$, and we will start from this point estimator. We define a random variable $\psi$, and this is a bit tricky, by

$$\psi = y_0 - \hat{y}_0, \qquad \hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0.$$
Now, it is not difficult to check that the expected value of this new random variable $\psi$ is equal to 0. Its variance is $\operatorname{Var}(\psi) = \operatorname{Var}(y_0 - \hat{y}_0)$. Here $\hat{y}_0$ is the quantity $\hat{\beta}_0 + \hat{\beta}_1 x_0$, so the whole thing is a function of the given observations $y_1, y_2, \ldots, y_n$, but $y_0$ is a new observation, and it is independent of $y_1, y_2, \ldots, y_n$. That is why we can write

$$\operatorname{Var}(y_0 - \hat{y}_0) = \operatorname{Var}(y_0) + \operatorname{Var}(\hat{y}_0).$$

The variance of $\hat{y}_0$ we know, because we have just computed it, and the variance of $y_0$ is equal to $\sigma^2$. So, the variance of $\psi$ is

$$\operatorname{Var}(\psi) = \sigma^2 + \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right) = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right).$$

We know the sampling distribution of $\psi$: the quantity $(\psi - E(\psi)) / \sqrt{\operatorname{Var}(\psi)}$, where $E(\psi) = 0$, follows a standard normal distribution. But now, if you replace $\sigma^2$ in the variance of $\psi$ by $MS_{\text{res}}$, then this is going to follow a t distribution with $n - 2$ degrees of freedom.
So, from here we get the $100(1 - \alpha)$ percent interval for $y_0$, which we call the prediction interval:

$$\hat{y}_0 - t_{\alpha/2,\, n-2} \sqrt{MS_{\text{res}} \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)} \;\le\; y_0 \;\le\; \hat{y}_0 + t_{\alpha/2,\, n-2} \sqrt{MS_{\text{res}} \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)}.$$

The right-hand side is the upper bound for $y_0$, and the lower bound is the same expression with the plus replaced by minus. I hope you can follow this, because the quantity under the square root is nothing but the variance of $\psi$, only with $\sigma^2$ replaced by $MS_{\text{res}}$.

So, if you put $\alpha$ equal to 0.05, then the future observation at $x = x_0$ will lie in this interval with probability $1 - \alpha$, that is, 0.95. Here, of course, the interval is narrowest when $x_0 = \bar{x}$, and this interval is always wider than the interval given for the expected response at the point $x = x_0$.

So, this is all for today. This is perhaps my last class on simple linear regression, and in the next lecture we will be talking about multiple linear regression. Thank you.
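As a numerical companion to the prediction interval, the sketch below evaluates it for the Disney toy example at $x_0 = 4$ and compares it with the interval for the mean response. The lecture does not work this case out, so the resulting numbers are an illustration computed from the summary values quoted earlier; the names and the `new_observation` flag are assumptions made for the sketch.

```python
import math

# Summary values quoted in the lecture for the Disney toy example.
N = 5
X_BAR = 3.0
S_XX = 10.0
MS_RES = 0.367
BETA0_HAT = -0.1
BETA1_HAT = 0.7
T_CRIT = 3.182  # tabulated t_{0.025, 3}


def interval_at(x0, new_observation):
    """95% interval at x0.

    new_observation=False: confidence interval for the mean response.
    new_observation=True:  prediction interval for a new observation y0,
                           which adds the extra "1 +" term to the variance.
    """
    y_hat = BETA0_HAT + BETA1_HAT * x0
    spread = 1.0 / N + (x0 - X_BAR) ** 2 / S_XX
    if new_observation:
        spread += 1.0
    half_width = T_CRIT * math.sqrt(MS_RES * spread)
    return y_hat - half_width, y_hat + half_width


ci = interval_at(4.0, new_observation=False)
pi = interval_at(4.0, new_observation=True)
print(f"mean response: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"new observation: ({pi[0]:.2f}, {pi[1]:.2f})")
```

Under these inputs the prediction interval comes out at roughly 0.50 to 4.90, noticeably wider than the interval for the mean response at the same point, which is the comparison made at the end of the lecture.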