So, this is my second lecture on regression models with autocorrelated errors, and here is the content of this topic. We already talked about the sources and effects of autocorrelation in the regression model, and in the previous class we started discussing how to detect the presence of autocorrelation; today we will also talk about parameter estimation in the presence of autocorrelation in the model. Let me repeat the objective of this topic. In the simple linear regression model, or in the multiple linear regression model, we make several assumptions on the error terms: the expectation of epsilon is 0, the variance of epsilon is sigma square, and the error terms are uncorrelated; we also assume normality of the error terms, that is, epsilon_i follows a normal distribution with parameters 0 and sigma square. These are the assumptions we make while fitting a simple or multiple linear regression model using the least squares technique. But the problem is that when the data are collected sequentially in time, the assumption of independence of the error terms may not be true. That means, while you are collecting the observations sequentially in time, the observations might not be independent, which implies that the error terms epsilon_i are not independent and there exists some sort of autocorrelation between the errors.

So, let me write down the formal definition of autocorrelation, which you know already. The errors are autocorrelated, or as we sometimes say serially correlated, if the correlation between errors s steps apart is always the same: that means the correlation between epsilon_u and epsilon_{u+s} is not 0, and we denote it by rho_s, for s = 1, 2, and so on. This is what we mean by autocorrelation in the error terms. We talked about the sources and the effects of autocorrelation in the previous class, and we were discussing how to detect its presence: given time series data we suspect that autocorrelation exists, but we have to test whether it really exists or not, that means whether the error terms are correlated or not. For that we talked about a statistical test called the Durbin-Watson test, and we could not finish it in the last class, so we will go through it again today. Here is the slide from my previous class on the Durbin-Watson test.

We want to fit a multiple linear regression model by the least squares technique to the given observations, which could be time series data, that means observations taken sequentially in time. What we usually assume is that the error term epsilon follows a normal distribution with mean 0 and variance sigma square; that means we assume all the correlations rho_s, the correlation between errors s steps apart, are 0. Now, we want to test whether this assumption is justified. So, here is the hypothesis: we test the null hypothesis that rho_s = 0 against the alternative hypothesis that rho_s = rho^s; basically we want to test whether rho is 0 against rho greater than 0, rho less than 0, or rho not equal to 0.
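To make this first-order autoregressive structure concrete, here is a minimal Python sketch (the sample size and the value of rho are illustrative choices of mine, not from the lecture data) that simulates AR(1) errors and checks that the sample correlation between errors s steps apart is close to rho^s:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 5000, 0.6          # illustrative choices, not from the lecture

# simulate AR(1) errors: eps_t = rho * eps_{t-1} + z_t, with z_t ~ N(0, 1)
z = rng.normal(0.0, 1.0, n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + z[t]

# the sample correlation s steps apart should be close to rho**s
for s in (1, 2, 3):
    r = np.corrcoef(eps[:-s], eps[s:])[0, 1]
    print(f"s = {s}: sample corr = {r:.3f}, rho**s = {rho**s:.3f}")
```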
So, rho greater than 0 means positive autocorrelation, rho less than 0 means negative autocorrelation, and rho not equal to 0 means autocorrelation exists. Now, why this particular form rho to the power s? As I explained, it comes from the assumption that the errors are first-order autoregressive. That means you can regress epsilon_u on epsilon_{u-1}: epsilon_u = rho epsilon_{u-1} + z_u, where z_u is again an error term which follows normal(0, sigma_z square), z_u is independent of epsilon_{u-1}, epsilon_{u-2} and all the previous terms, and it is also independent of z_{u-1}, z_{u-2}, and so on. I explained how to derive this form in the previous class; please refer to my previous lecture for that. It comes from the first-order autoregressive assumption.

Now, to test this hypothesis, that the correlation between errors s steps apart is 0 against it being rho^s, we consider the Durbin-Watson test statistic. It involves the differences of successive residuals, as you can see here; but how do we get them? First you fit the regression model using the ordinary least squares technique, assuming all the assumptions on the error terms are true, and then compute the residuals. Once you fit this model you get y hat = X beta hat, the fitted model, and then you can compute e = y minus y hat, where y is the observed response and y hat is the estimated response. So you have all the residuals, and then you form the test statistic. It is known that the distribution of d lies between 0 and 4 and is symmetric about 2. This is the test statistic suggested by Durbin and Watson.

Now, let me talk about the critical region, that is, how to decide whether to accept or reject the null hypothesis based on the d value. Here is the first case: a one-sided test against the alternative rho greater than 0. We started with the hypotheses H0: rho_s = 0 against H1: rho_s = rho^s; now we test rho = 0 against rho greater than 0. If rho is greater than 0, then rho^s is also greater than 0 for every s, so testing the original hypothesis is the same as testing this one. If the null hypothesis is true then rho_s is 0; if the alternative is true, that means rho is greater than 0, then the original alternative says that rho_s is greater than 0, that means the data have positive autocorrelation. So, you have the Durbin-Watson test statistic value d: if d is less than d_L you reject H0, if d is greater than d_U you accept H0, and if d is between d_L and d_U the test is inconclusive. Now, let me talk a little bit about these d_L (d lower) and d_U (d upper) values. There is a table for the Durbin-Watson test from which you get the d_L and d_U values for different n, depending on how many observations there are, and for different alpha, the level of significance.
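Since the test statistic is just a ratio of sums built from the residuals, here is a minimal sketch of how it could be computed in Python after an OLS fit; the data below are made up purely to exercise the function:

```python
import numpy as np

def durbin_watson(e: np.ndarray) -> float:
    """Durbin-Watson statistic: d = sum_{t=2}^n (e_t - e_{t-1})^2 / sum_{t=1}^n e_t^2."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# residuals from an OLS fit y = X beta, using hypothetical made-up data
rng = np.random.default_rng(1)
x = np.arange(20, dtype=float)
y = 3.0 + 2.0 * x + rng.normal(0, 1, 20)
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta_hat

print(f"d = {durbin_watson(e):.3f}")   # values near 2 suggest no autocorrelation
```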
So, for example, for n equal to 20, as in our previous example on soft drink concentrate sales, you will get the d_L value for the different choices of alpha. So, there exists a table for these d_L and d_U values. If the observed d value is between d_L and d_U, the test is inconclusive. I talked about why we reject the null hypothesis H0 when the d value is small. You know what d is: d equals the sum over t from 2 to n of (e_t minus e_{t-1}) squared, divided by the sum over t from 1 to n of e_t squared. If the d value is small we reject the null hypothesis, that means we accept the alternative hypothesis, that means we say there exists positive autocorrelation. So, when d is small there exists positive autocorrelation.

Now, recall the scatter plot of e_i against e_{i-1}. If the plot shows e_i increasing with e_{i-1}, we say there exists positive autocorrelation: the i-th residual depends on the (i-1)-th residual, there is correlation between successive residuals. If the points are centered about the line e_i = e_{i-1}, that means the successive error terms are of similar magnitude, they are almost the same. So positive autocorrelation indicates that successive error terms are of similar magnitude, and hence the differences of successive residuals will be small. That is why a small value of d indicates the existence of positive autocorrelation. I hope this is clear.

Now, let me go to the second case: a one-sided test against the alternative rho less than 0. The meaning of this is that originally we started with the hypothesis H0: rho_s = 0 against H1: rho_s = rho^s; now, if rho is negative then rho_s is going to be negative, that means the alternative hypothesis says that there exists negative autocorrelation, and this can be tested using the same test statistic d. Here, if 4 minus d is less than d_L you reject H0; if 4 minus d is greater than d_U you accept H0; and if 4 minus d is between d_L and d_U the test is inconclusive. A similar argument applies: testing the original hypothesis is the same as testing H0: rho = 0 against H1: rho less than 0.

And the final case, case 3: here we test H0: rho = 0 against the two-sided alternative H1: rho not equal to 0. Here, if d is less than d_L or 4 minus d is less than d_L, we reject H0; so a small value of d, or of 4 minus d, indicates that autocorrelation exists. If d is greater than d_U and 4 minus d is greater than d_U, you accept H0; so a d value near 2 indicates that there is no autocorrelation in the errors. Otherwise the test is inconclusive.

So, now let me consider an example. This is the example I took in the previous class also; it is called soft drink concentrate sales. Here we have one regressor variable, the annual advertising expenditure, and y_t is the annual sales.
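The three decision rules can be collected into one small helper, sketched below. The d_L and d_U values must come from a Durbin-Watson table, so they are simply passed in as arguments here; this is a sketch of the decision logic only, not a table lookup:

```python
def dw_decision(d: float, dL: float, dU: float, alternative: str = "positive") -> str:
    """Decision rule for the Durbin-Watson test; dL, dU are taken from a DW table."""
    if alternative == "two-sided":         # H1: rho != 0
        if d < dL or 4.0 - d < dL:
            return "reject H0"
        if d > dU and 4.0 - d > dU:
            return "accept H0"
        return "inconclusive"
    # one-sided cases: use d for H1: rho > 0, and 4 - d for H1: rho < 0
    stat = d if alternative == "positive" else 4.0 - d
    if stat < dL:
        return "reject H0"
    if stat > dU:
        return "accept H0"
    return "inconclusive"

print(dw_decision(1.08, 1.20, 1.41))       # values from the lecture example -> reject H0
```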
So, x_t is the regressor variable and y_t is the response variable, and we have the data (x_t, y_t) sequentially in time. Say we have the data for 20 years, from 1960 to 1979, so t indexes the year; such data are called time series data. Initially, ignoring whether the basic assumptions are true or not, we fit a straight line model between y and x using the ordinary least squares technique; you just fit the model between y and x. Once you have the fitted model you can compute the residuals: e_t is nothing but the observed response value minus the estimated response value at time t. Once you have these residuals, since it is time series data we suspect that autocorrelation might be present, so now we can go for the Durbin-Watson test to check whether autocorrelation exists or not.

So, we have fitted the model, say y_t hat = 1608.508 + 20.091 x_t, and then we can compute the residuals. Now we use the Durbin-Watson test to test H0: rho = 0 against, say, H1: rho greater than 0. Why are we doing this? Because the response variable y_t and the regressor variable x_t are time series data, they are taken over time, so we suspect that autocorrelation may be present. If it is not time series data, we do not go for the autocorrelation test. So, this is the hypothesis we want to test, and we have computed the residuals here. Then you compute the Durbin-Watson test statistic d, which equals the sum over t from 2 to 20 of (e_t minus e_{t-1}) squared, divided by the sum over t from 1 to 20 of e_t squared. You can check that this comes out to 1.08, and now you have to find the d_L value from the table for n equal to 20, because we have 20 observations. You can check from the table that d_L = 1.20 and d_U = 1.41 for n = 20 and level of significance alpha = 0.05. So, what you see is that the observed value d = 1.08 is less than d_L = 1.20, and a small d value indicates that there exists positive autocorrelation. Since this is the case, we reject H0 (H0 says there is no autocorrelation) and conclude that the errors are positively autocorrelated. So, this is one example to illustrate the Durbin-Watson test.

Now, given time series data we suspect that autocorrelation may exist. So what we do is fit a simple straight line model using the ordinary least squares technique, and once we have the fitted model we find the residuals; using those residuals we compute the Durbin-Watson test statistic and see whether autocorrelation is present in the data or not. Suppose the result of the Durbin-Watson test is that autocorrelation is present in the given time series data.
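Putting the whole procedure together, the sketch below fits a straight line by OLS and runs the one-sided check for positive autocorrelation. The data array is a made-up stand-in, since the actual soft drink concentrate sales table is not reproduced here; the coefficients and the d_L, d_U values are the ones quoted above for n = 20, alpha = 0.05:

```python
import numpy as np

# made-up stand-in data; the real soft drink concentrate sales table is not shown here
rng = np.random.default_rng(2)
n = 20
x = rng.uniform(10, 100, n)                 # annual advertising expenditure
e = np.zeros(n)                             # AR(1) errors, so the test has something to find
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + rng.normal(0, 5)
y = 1608.508 + 20.091 * x + e               # annual sales, using the lecture's coefficients

# step 1: ordinary least squares fit
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# step 2: Durbin-Watson statistic and one-sided decision (H1: rho > 0)
d = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
dL, dU = 1.20, 1.41                         # table values for n = 20, alpha = 0.05
if d < dL:
    verdict = "reject H0: positive autocorrelation"
elif d > dU:
    verdict = "accept H0: no autocorrelation"
else:
    verdict = "inconclusive"
print(f"d = {d:.2f} -> {verdict}")
```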
So, then the next issue is how to estimate the regression coefficients in the presence of autocorrelation. We will talk about that now: the parameter estimation method in the presence of autocorrelation in the errors. This is called the Cochrane-Orcutt method for estimating the regression coefficients. Consider the simple linear regression model with first-order autoregressive errors. That means we are considering the simple linear regression model y_t = beta_0 + beta_1 x_t + epsilon_t, but here the epsilon_t's are not independent; they are first-order autoregressive errors. That means epsilon_t can be regressed on epsilon_{t-1}: epsilon_t = rho epsilon_{t-1} + z_t, where z_t is normally distributed with mean 0 and constant variance sigma_z squared (written sigma_z squared just to distinguish it from sigma squared), and the z_t are independent. Here rho is called the autoregressive parameter, or the autocorrelation parameter.

Now, how do we fit this model? You cannot apply the ordinary least squares technique here, because the assumption on epsilon_t, that it follows normal(0, sigma squared) with independence, is not true. So what we do is transform the response variable y_t to y_t prime, which equals y_t minus rho y_{t-1}. Let me check what this y_t prime is: y_t prime = y_t minus rho y_{t-1}, and I know that y_t = beta_0 + beta_1 x_t + epsilon_t, while y_{t-1} = beta_0 + beta_1 x_{t-1} + epsilon_{t-1}. So this can be written as beta_0 (1 minus rho) plus beta_1 (x_t minus rho x_{t-1}) plus (epsilon_t minus rho epsilon_{t-1}). Now I can write this as beta_0 prime + beta_1 x_t prime + z_t, because, if you recall, we assumed the errors are first-order autoregressive, that means epsilon_t minus rho epsilon_{t-1} = z_t, where the z_t are independent with mean 0 and variance sigma_z squared. So we have transformed the error term epsilon_t to z_t, and the transformed errors z_t are independent. So now we can apply the ordinary least squares technique to the transformed data.

But the problem here is that y_t prime and x_t prime, the transformed time series, cannot be computed directly, because y_t prime = y_t minus rho y_{t-1} and x_t prime = x_t minus rho x_{t-1} are both functions of the unknown parameter rho; we do not know the value of rho. So we cannot make this transformation right now; let us see how to estimate this unknown parameter rho. This rho is called the autocorrelation parameter or the autoregressive parameter; recall that epsilon_t = rho epsilon_{t-1} + z_t. Now, one way to estimate rho is as follows: we are given only the data x_t and y_t, nothing else, and it is known that they are time series data.
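To keep the algebra in one place, the transformation just derived can be written compactly in display form (same quantities as in the derivation above, nothing new):

```latex
\begin{aligned}
y_t' &= y_t - \rho\, y_{t-1}, \qquad x_t' = x_t - \rho\, x_{t-1},\\
y_t' &= \underbrace{\beta_0 (1-\rho)}_{\beta_0'} \;+\; \beta_1 x_t' \;+\; \underbrace{(\varepsilon_t - \rho\,\varepsilon_{t-1})}_{z_t},
\qquad z_t \stackrel{\text{iid}}{\sim} N\!\left(0, \sigma_z^2\right).
\end{aligned}
```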
So, what you have to do is fit a simple linear regression model to (x_t, y_t) and compute the residuals; we know that the residual e_t is sort of the observed value of epsilon_t, so we can regress e_t on e_{t-1} and from there estimate the value of rho. Let me explain that part now. First, you fit y_t = beta_0 + beta_1 x_t + epsilon_t using the ordinary least squares technique; using ordinary least squares means we are assuming the usual assumptions are true, or rather ignoring all these assumptions for the moment. So you just fit a simple linear regression model between x_t and y_t and obtain the residuals e_t. Once you have the residuals, you regress e_t on e_{t-1}, that is, you fit a model e_t = rho e_{t-1} + z_t; we do not know epsilon_t, and the t-th residual is sort of an estimate of the t-th error term, so we are regressing e_t on e_{t-1}.

We know all these residuals, so how do we get an estimate of rho? That can be obtained by minimizing the least squares criterion S(rho), which equals the sum over t from 2 to n of (e_t minus rho e_{t-1}) squared; we estimate rho in such a way that this is minimum. This essentially says: differentiate S(rho) with respect to rho and set it equal to 0, which implies that the sum over t from 2 to n of (e_t minus rho e_{t-1}) e_{t-1} equals 0. This gives the least squares estimate of rho, call it rho hat, which equals the sum over t from 2 to n of e_t e_{t-1}, divided by the sum over t from 2 to n of e_{t-1} squared; the denominator can also be written as the sum over t from 1 to n of e_t squared.

So this is how we estimate rho, and now we use this estimate to transform the data: we obtain y_t prime = y_t minus rho hat y_{t-1} and x_t prime = x_t minus rho hat x_{t-1}. Then you apply ordinary least squares to the transformed data. You can do this because y_t prime = beta_0 prime + beta_1 x_t prime + z_t, where z_t satisfies all the conditions of the Gauss-Markov theorem: it follows normal(0, sigma_z squared) and the z_t are independent. That is why you can apply the ordinary least squares technique to the transformed data. Once you are done with ordinary least squares, you have the fitted model y_t prime hat = beta_0 prime hat + beta_1 hat x_t prime, where these parameters are obtained using ordinary least squares; then you compute the residuals for this new model. Now you apply the Durbin-Watson test to the residuals obtained from the reparametrized model. Why? You have applied ordinary least squares to the transformed data (y_t prime, x_t prime), which are also time series data, so again you need to check whether autocorrelation still exists in the transformed data.
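Here is a minimal end-to-end sketch of the Cochrane-Orcutt step just described; the function names are my own and the data are made up, but the algebra follows the lecture: OLS fit, estimate rho hat from the residuals, transform the data, refit, and check Durbin-Watson again.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: returns coefficient estimates and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def cochrane_orcutt_step(x, y):
    """One Cochrane-Orcutt iteration for y_t = b0 + b1 x_t with AR(1) errors."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    _, e = ols(X, y)                                        # step 1: OLS residuals
    rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)  # step 2: regress e_t on e_{t-1}
    y_p = y[1:] - rho_hat * y[:-1]                          # step 3: transform the data
    x_p = x[1:] - rho_hat * x[:-1]
    Xp = np.column_stack([np.ones(n - 1), x_p])
    (b0_p, b1), e_p = ols(Xp, y_p)                          # step 4: OLS on transformed data
    b0 = b0_p / (1.0 - rho_hat)                             # recover beta_0 from beta_0' = beta_0 (1 - rho)
    d = np.sum(np.diff(e_p) ** 2) / np.sum(e_p ** 2)        # Durbin-Watson on the new residuals
    return rho_hat, b0, b1, d

# made-up data with AR(1) errors to exercise the function
rng = np.random.default_rng(3)
n, x = 100, rng.uniform(0, 10, 100)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.6 * eps[t - 1] + rng.normal(0, 1)
y = 2.0 + 1.5 * x + eps
print(cochrane_orcutt_step(x, y))
```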
So, if your Durbin-Watson test indicates no autocorrelation in the errors, then no additional analysis is needed; but if the Durbin-Watson test indicates there is still autocorrelation in the errors, then further analysis is required. That means you need to check, using the Durbin-Watson test, whether the transformed data still have autocorrelation. If you see there is no autocorrelation in the errors for the transformed data, you stop; no additional analysis is required. But if the Durbin-Watson test indicates that autocorrelation still exists in the errors for the transformed time series data, then you have to repeat the same procedure once more; you may go for two iterations at most, and there you have to stop.

So, here we talked about data which are collected sequentially over time; they are called time series data, and in time series data we generally suspect that the observations are not independent, which is essentially the same as saying that the errors are not independent, they are correlated. So we need to test whether the errors are autocorrelated or not, and for that we have learnt the Durbin-Watson test, along with residuals and residual plots and all these things. And if the Durbin-Watson test results indicate that autocorrelation exists in the data, we have learnt a technique for estimating the parameters in the presence of autocorrelation in the model. So, that is all for today. Thank you.