Today we will start a new topic called regression models with autocorrelated errors. The contents of this module are the source and effect of autocorrelation, how to detect the presence of autocorrelation, and, when autocorrelation is present in the model, what it means and how to estimate the parameters of the model.

Let me first state the objective of this module. Given a set of data (x_i, y_i), while fitting a simple linear regression model y = β0 + β1 x + ε, we make several assumptions on the error term: we assume E(ε) = 0 and Var(ε) = σ², we assume the errors are uncorrelated, and we also make a normality assumption on the errors for testing hypotheses and constructing confidence intervals for the parameters. Now, if the data (x_i, y_i) are collected sequentially over time, the assumption of independent errors is not guaranteed. Data collected over time are called time series data, and we have to know how to deal with the situation where the errors are correlated.

Let me write down the setup clearly. I am talking about the simple linear regression model y_i = β0 + β1 x_i + ε_i, with data (x_i, y_i), i = 1, ..., n. The basic assumptions are: E(ε_i) = 0; Var(ε_i) = σ², a constant variance; and the errors are uncorrelated, so Cov(ε_i, ε_j) = 0 for i ≠ j (since the expectations are zero). We also assume that ε_i ~ N(0, σ²) and that the errors are independent. Now, when the data, which I will write as (x_t, y_t) instead of (x_i, y_i), are collected sequentially in time, the usual assumption of independence of the errors is not guaranteed; such data are called time series data.

To make clear what I mean by data collected sequentially in time, here is one example, the soft drink concentrate sales data. The regressor x is the expenditure on advertising, in thousands of dollars, and the response y is the sales amount. We have data on the amount of money spent on advertising and the sales amount for 20 years, so (x_t, y_t) are collected over 20 years: this is a time series data set. When data are collected sequentially in time and the usual independence assumption fails, we say the errors are autocorrelated, or serially correlated. Serially correlated means that the correlation between errors s steps apart, that is, between ε_t and ε_{t+s}, is the same for all t.
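To make the setup concrete, here is a minimal Python sketch (not from the lecture, assuming numpy and statsmodels are available) of fitting such a simple linear regression to time-ordered data. The numbers are synthetic stand-ins for the soft drink example, since the actual data values are not reproduced in the lecture.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Synthetic stand-in for the soft drink example: 20 yearly observations,
# x_t = advertising expenditure (thousand dollars), y_t = sales amount.
t = np.arange(20)
x = 50 + 5 * t + rng.normal(0, 2, size=20)   # expenditure grows over the years
eps = rng.normal(0, 10, size=20)             # for now, independent N(0, sigma^2) errors
y = 200 + 8 * x + eps                        # y_t = beta0 + beta1 * x_t + eps_t

X = sm.add_constant(x)                       # design matrix with an intercept column
fit = sm.OLS(y, X).fit()                     # ordinary least squares fit
print(fit.params)                            # estimates of beta0 and beta1
e = fit.resid                                # residuals e_t, kept in time order
```

Keeping the residuals in time order matters here: all of the diagnostics discussed below compare each residual with its neighbours in time.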
We usually denote this common correlation by ρ_s, for s = 1, 2, 3, .... The correlation between residuals 1, 2, or 3 steps apart is called the lag-1, lag-2, or lag-3 serial correlation; so when s = 1, the correlation between errors one step apart is called the lag-1 autocorrelation, or lag-1 serial correlation.

What is the source of autocorrelation? The primary cause of autocorrelation in regression problems involving time series data is the failure to include one or more important regressors in the model. To see what this means, consider the soft drink concentrate example again. There we are trying to regress the sales amount on the advertising expenditure, but the population grows over time, and this growth influences the sales amount; the population size is therefore another important variable with an influence on sales. If you do not include the population size, or its increase over time, in the model, then you can expect autocorrelation in the errors. So the primary cause of autocorrelation in regression with time series data is the omission of important regressor variables from the model.

Having understood why autocorrelation arises, let me now talk about its effects. If autocorrelation is present in the model, it means that Cov(ε_i, ε_j) ≠ 0 for some i ≠ j. What is the effect of this when fitting a multiple linear regression model? If you fit y = Xβ + ε, the least squares estimate is β̂ = (X'X)^{-1} X'y. Under the basic assumptions that the ε_i follow a normal distribution with mean 0 and variance σ² and are independent, the conditions of the Gauss-Markov theorem are satisfied, and β̂ = (X'X)^{-1} X'y is the best linear unbiased estimator. But here the errors are correlated, so that condition does not hold. Because of this violation of the assumption of uncorrelated errors, β̂ is still unbiased, but it no longer has minimum variance; the least squares estimate β̂ is not the best linear unbiased estimator. Of course, if Var(ε) = σ²V, where the covariance matrix cannot be written as σ²I, we know how to obtain the best linear unbiased estimator using the generalized least squares technique.
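As a minimal sketch of the generalized least squares remedy just mentioned (assuming statsmodels; the AR(1) form of V and the value of ρ are illustrative assumptions, since in practice V is unknown and has to be estimated):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 30
x = np.linspace(1, 10, n)
y = 5 + 2 * x + rng.normal(0, 1, size=n)     # toy data just to have something to fit
X = sm.add_constant(x)

# Suppose Var(eps) = sigma^2 * V with V != I.  For illustration take the AR(1)
# structure V[i, j] = rho^|i - j| with an assumed rho (in practice rho is unknown).
rho = 0.6
V = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

ols = sm.OLS(y, X).fit()                     # unbiased, but not minimum variance here
gls = sm.GLS(y, X, sigma=V).fit()            # BLUE when Var(eps) is proportional to V
print(ols.params, gls.params)
```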
The second effect arises when the errors are positively autocorrelated (I will explain shortly what I mean by this). Under the usual assumptions, MS_Res is an unbiased estimator of σ²; but when the errors are positively autocorrelated, MS_Res may seriously underestimate σ². What is the consequence of this? Suppose you fit the simple linear regression model and the fitted model is ŷ = β̂0 + β̂1 x. We know that Var(β̂1) = MS_Res / S_xx, so the standard error of β̂1 is the square root of MS_Res / S_xx. Since MS_Res is too small, the standard error will be too small. One consequence is that the confidence interval for β1 (you may refer to the first module), namely from β̂1 − t_{α/2, n−2} se(β̂1) to β̂1 + t_{α/2, n−2} se(β̂1), becomes too short: you get a narrow interval for the parameter that may not actually cover the true β1. Also, in the regression model we test hypotheses such as the significance of β1, that is, the significance of the linear term: H0: β1 = 0 against H1: β1 ≠ 0, with test statistic t = β̂1 / se(β̂1). When there is positive autocorrelation in the errors, MS_Res underestimates σ², the standard error of β̂1 is too small, and hence t is too large; so β1 may appear significant when it really is not, purely because of the positive autocorrelation in the data. These are the effects of autocorrelation.

Now let me talk about how to detect autocorrelation. The first technique is the residual plot, which is useful for detecting autocorrelation. Given the data set (x_t, y_t), you fit a simple linear regression model, compute the residuals e_t from the fitted model, and plot e_t against t. Suppose the plot looks like this: up to some time point all the residuals lie below the line e = 0 (they are negative), over the next stretch all the residuals are positive, and then they are negative again in the following segment. When residuals of identical sign occur in clusters like this, it indicates positive autocorrelation; I will explain why shortly. Before that, let me describe some more residual plots, namely plotting e_t against e_{t−1}.
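A minimal sketch of this residual-versus-time plot (assuming matplotlib and statsmodels; the data are simulated with positively autocorrelated errors so that the clusters of same-signed residuals are visible):

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 60
x = np.linspace(0, 10, n)

# Simulate AR(1) errors with rho > 0 so runs of same-signed residuals appear.
rho = 0.8
eps = np.zeros(n)
for u in range(1, n):
    eps[u] = rho * eps[u - 1] + rng.normal(0, 1)
y = 2 + 3 * x + eps

e = sm.OLS(y, sm.add_constant(x)).fit().resid   # residuals e_t from the OLS fit

plt.scatter(range(n), e)                        # plot e_t against t
plt.axhline(0, color="black")                   # the line e = 0
plt.xlabel("t")
plt.ylabel("residual e_t")
plt.show()
```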
By plotting e_i against e_{i−1} we are trying to find the relation between e_i and e_{i−1}, that is, the lag-1 correlation. If the scatter shows a lower-left to upper-right pattern, this indicates positive lag-1 autocorrelation. If the scatter of e_i against e_{i−1} shows an upper-left to lower-right pattern, this indicates negative lag-1 autocorrelation. These plots are for detecting lag-1 autocorrelation; similarly, for lag 2 you plot e_i against e_{i−2} and see how they are related. If the scatter of e_i against e_{i−1} shows no pattern, this indicates that the errors are uncorrelated. So this is the graphical technique for identifying autocorrelation: for lag-1 autocorrelation you are essentially looking at the relation between each residual and the previous one. If you regress e_i on e_{i−1} you get roughly a straight-line relation of the form e_i ≈ ρ e_{i−1}; in the first pattern ρ is positive and e_i increases as e_{i−1} increases, with successive residuals very similar in magnitude, and that is what positive lag-1 autocorrelation means. The opposite pattern indicates negative lag-1 autocorrelation, and a patternless scatter says there is no correlation between e_i and e_{i−1}, that is, the errors are uncorrelated. A sketch of this lag-1 plot is given below.

Now we will talk about a statistical test for the presence of autocorrelation, called the Durbin-Watson test. Suppose we wish to fit the model y_u = β0 + β1 x_{1u} + β2 x_{2u} + ... + β_k x_{ku} + ε_u, u = 1, ..., n, by the least squares technique; here y_u is the response for the u-th observation and x_{1u}, ..., x_{ku} are the regressors, so this is a multiple linear regression model. We usually assume that the ε_u are i.i.d. N(0, σ²); that is, we are assuming that the lag-s autocorrelation is zero, meaning the correlation between errors s steps apart is zero. If you want to use the least squares technique to fit this model, you have to make this assumption. Now we want to check whether this assumption is justified for the given data. For that, we will test the hypothesis H0: ρ_s = 0 against the alternative H1: ρ_s = ρ^s, where ρ ≠ 0 and |ρ| < 1. Why we consider this particular alternative, and where it comes from, is something I will discuss in a moment. We will describe a test procedure based on the Durbin-Watson statistic; if the null hypothesis that ρ_s = 0 is accepted, there is no autocorrelation in the errors. The alternative hypothesis states that ρ_s = ρ^s, and now let me try to justify this choice of alternative.
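Before turning to that justification, here is a minimal sketch of the lag-1 residual plot described above (assuming numpy and matplotlib; the residuals are stand-ins simulated from a positively autocorrelated sequence):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)

# Stand-in residuals: an AR(1) sequence with positive rho, so the lag-1 scatter
# should show the lower-left to upper-right pattern.
rho, n = 0.8, 60
e = np.zeros(n)
for u in range(1, n):
    e[u] = rho * e[u - 1] + rng.normal(0, 1)

plt.scatter(e[:-1], e[1:])          # e_i (vertical axis) against e_{i-1} (horizontal axis)
plt.xlabel("e_{i-1}")
plt.ylabel("e_i")
plt.show()

# Numerical counterpart of the plot: the lag-1 sample correlation of the residuals.
r1 = np.corrcoef(e[:-1], e[1:])[0, 1]
print(r1)                           # clearly positive here -> positive lag-1 autocorrelation
```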
This alternative hypothesis comes from the assumption that the errors follow the model ε_u = ρ ε_{u−1} + z_u, that is, the errors are first-order autoregressive: there is a linear relationship between ε_u and ε_{u−1}. Here z_u ~ N(0, σ²), and z_u is independent of ε_{u−1}, ε_{u−2}, ... and of z_{u−1}, z_{u−2}, .... If the errors are first-order autoregressive, I can write ε_u in terms of the z's. Starting from ε_u = ρ ε_{u−1} + z_u and replacing ε_{u−1} by ρ ε_{u−2} + z_{u−1}, we get ε_u = ρ² ε_{u−2} + ρ z_{u−1} + z_u. Replacing ε_{u−2} by ρ ε_{u−3} + z_{u−2} gives ε_u = ρ³ ε_{u−3} + ρ² z_{u−2} + ρ z_{u−1} + z_u. Continuing to substitute in this way, you can check that ε_u = Σ_k ρ^k z_{u−k}, the sum running over k = 0, 1, 2, .... So ε_u is expressed in terms of the z's.

Then E(ε_u) = 0, because E(z) = 0. What about the variance of ε_u? Since the z's are independent and Var(z_{u−k}) = σ² for every k, the variance of the sum is (1 + ρ² + ρ⁴ + ...) σ², which equals σ² / (1 − ρ²). Similarly, you can check the covariance of ε_u and ε_{u+s}: if the errors are first-order autoregressive, Cov(ε_u, ε_{u+s}) = ρ^s σ² / (1 − ρ²). Given this covariance, it is clear that the correlation between ε_u and ε_{u+s} is ρ^s. So when the errors are first-order autoregressive, ε_u follows a normal distribution with mean 0 and variance σ² / (1 − ρ²), and under the null hypothesis H0: ρ = 0 we get ε_u ~ N(0, σ²); the correlation also becomes 0 when ρ = 0, so the errors are independent. Thus, under the null hypothesis, the ε_u are N(0, σ²) and independent. So now we understand the significance of this alternative hypothesis: we are testing H0: ρ_s = 0 against H1: ρ_s = ρ^s.
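A quick numerical check of these AR(1) facts (a simulation sketch, not part of the lecture): for a long simulated first-order autoregressive sequence, the sample variance should be close to σ²/(1 − ρ²) and the lag-s sample correlation close to ρ^s.

```python
import numpy as np

rng = np.random.default_rng(4)
rho, sigma, n = 0.7, 1.0, 200_000

# Simulate eps_u = rho * eps_{u-1} + z_u with z_u ~ N(0, sigma^2), independent.
z = rng.normal(0, sigma, size=n)
eps = np.zeros(n)
for u in range(1, n):
    eps[u] = rho * eps[u - 1] + z[u]
eps = eps[1000:]                                  # drop a burn-in so the start-up effect is gone

print(eps.var(), sigma**2 / (1 - rho**2))         # Var(eps_u) ~ sigma^2 / (1 - rho^2)
for s in (1, 2, 3):
    r_s = np.corrcoef(eps[:-s], eps[s:])[0, 1]
    print(s, r_s, rho**s)                         # Corr(eps_u, eps_{u+s}) ~ rho^s
```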
We have checked that this alternative hypothesis arises under the assumption that the errors are first-order autoregressive, in which case ρ_s = ρ^s is the correlation between ε_u and ε_{u+s}. Now, to test H0 against H1, we first fit the model y = Xβ + ε by the least squares technique, as if all the basic assumptions were true, and compute the residuals e_u; once you have the residuals, you can check whether there is autocorrelation in the error term, whether or not the basic assumptions actually hold. We then form the Durbin-Watson test statistic d = Σ_{u=2}^{n} (e_u − e_{u−1})² / Σ_{u=1}^{n} e_u². So given a set of data, you fit the model, find the residuals, compute the Durbin-Watson statistic, and based on this statistic we test whether autocorrelation exists in the data. The statistic d lies between 0 and 4, and its distribution is symmetric about 2.

Now that we know the test statistic, let us talk about the critical regions, that is, when to reject and when to accept. Case 1 is the one-sided test against the alternative that ρ > 0; that is, for lag 1 we are testing H0: ρ = 0 against H1: ρ > 0. You compute the test statistic d, and if d < d_L (I will say what d_L is), you reject H0. There is a table for the Durbin-Watson test, and reading it requires the number of observations you have (together with the number of regressors and the level α). For the soft drink concentrate data we had 20 observations, so from the table you read the d_L and d_U values corresponding to n = 20. If d < d_L we reject H0; if d > d_U we accept H0 (I will explain why this is the critical region); and if d_L < d < d_U the test is inconclusive.
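Here is a minimal sketch of the Durbin-Watson statistic and of the one-sided decision rule just described. The d_L and d_U values below are placeholders only; in practice they must be read from the published Durbin-Watson table for the given n, number of regressors, and α. (statsmodels also provides statsmodels.stats.stattools.durbin_watson for computing d from residuals.)

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{u=2}^{n} (e_u - e_{u-1})^2 / sum_{u=1}^{n} e_u^2."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def one_sided_test_positive(d, d_L, d_U):
    """One-sided Durbin-Watson test of H0: rho = 0 against H1: rho > 0."""
    if d < d_L:
        return "reject H0: evidence of positive autocorrelation"
    if d > d_U:
        return "do not reject H0"
    return "inconclusive"

# Example usage with arbitrary residuals; the bounds 1.2 and 1.4 are placeholders,
# not actual table values.
e = [0.5, 0.8, 0.6, -0.2, -0.7, -0.4, 0.3, 0.9, 1.1, 0.4]
d = durbin_watson(e)
print(round(d, 3), one_sided_test_positive(d, d_L=1.2, d_U=1.4))
```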
Why does a small value of d indicate autocorrelation? Given the data, you fit the model by ordinary least squares, obtain the residuals e_i, and compute the Durbin-Watson statistic. A small value of d leads to rejecting H0: ρ = 0 and accepting H1: ρ > 0, and accepting ρ > 0 means the errors have positive autocorrelation. Let me explain why this is true. Recall the graph in which we plot e_i against e_{i−1}. Positive autocorrelation means that successive error terms are of similar magnitude, so the differences in residuals e_i − e_{i−1} are small. In that plot, for each point the e_{i−1} value and the e_i value are almost of the same magnitude, which is why all the points are centered about the line e_i = e_{i−1}, the line x = y. Points centered about this line means e_i and e_{i−1} are very similar, and since they are similar, the differences are small. Once the differences are small, recall that the Durbin-Watson statistic d involves exactly those differences: if the differences are small, d is going to be small, and a small d implies the existence of positive lag-1 autocorrelation, which is why we reject the null hypothesis and accept the alternative. I hope this makes clear why this is the critical region. As I told you, there is a table of d_L and d_U values for different n and different α; I will discuss it in the next class, along with the other cases. We need to stop now; we will continue with the Durbin-Watson test, with some examples to illustrate it, in the next class. Thank you.