In this session we will talk about regression analysis, which is among the most commonly used techniques of statistical analysis, applied to various aspects of data analysis in many fields, and we will look at two variants of regression analysis. The outline is as follows: first we will define what regression is; there are two kinds, multiple regression versus simple regression; we will talk about the random error and the regression coefficients; the least squares estimates of the regression coefficients; the expected value of the least squares estimate of a regression coefficient and its variance; the estimate of the variance of the random error; and at the end there will be a slide on the most commonly used notation. That slide will be useful in the future for reference.

Ok, so let us start. As I said, suppose we have a response variable y and some independent variables x1, x2, ..., xr, and you know that there is some relationship between y and x1, x2, ..., xr, or you at least suspect that there might be one. The simplest relationship that can exist between them is a linear relationship, which can be expressed as y = β0 + β1x1 + β2x2 + ... + βr xr + ε. This ε is a very important point. In reality, when we get the data, we can never be sure that the relationship will be exactly like this. If you do not consider ε, this relationship is a purely mathematical relationship. When you add the quantity ε, which is called a random error, the relationship becomes random, or statistical, in nature.

Now if r = 1, this relationship is called simple linear regression: when r = 1, the relationship y = β0 + β1x1 + ε is called simple linear regression. In general, with r independent variables, it is called multiple regression. Notation-wise, β0, β1, ..., βr are called regression coefficients, and they are generally unknown; they need to be estimated from the data. Then you have the random error. Generally, it is assumed that the random error ε has an expected value of 0. What this means is that on the average y actually equals the linear part; otherwise there is a plus or minus error in it. This plus or minus error is represented by ε, and saying that on the average the error vanishes is the same as saying that its expected value is 0. In other words, if you recall our previous sessions, we can say that E[y | x1, x2, ..., xr] = β0 + β1x1 + ... + βr xr, because the expected value of ε is 0, so there is no ε term left.

So the regression coefficients β0, β1, ..., βr need to be estimated given the values of x1, x2, ..., xr. This means that the independent variables take given, fixed values for us, not random values in this particular case, and y is going to be a random variable because of the randomness of ε. First, we will discuss in detail the estimation procedure for β0, β1 and σ², which is the variance of ε, the variance of the error, through the case of simple regression, that is y = β0 + β1x1 + ε, assuming that E[ε] = 0 and Var(ε) = σ². So what estimator could we use?
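For reference, the model just described written out in display form (this restates the lecture's equations; nothing new is added):

```latex
% The linear model with random error epsilon:
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_r x_r + \varepsilon,
\qquad E[\varepsilon] = 0, \quad \operatorname{Var}(\varepsilon) = \sigma^2 .
% Since E[epsilon] = 0, taking expectations removes the error term:
E[y \mid x_1, \ldots, x_r] = \beta_0 + \beta_1 x_1 + \cdots + \beta_r x_r .
```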
The most commonly used estimator is called the least squares estimator. What are we really trying to do? Let us take a picture: this is the x axis and this is the y axis. Suppose we have a few data points scattered like this and we have to fit a line through them. You have already done this exercise in algebra, and we fit this line in such a way that the distance between the line and the actual values is minimized. This is called the least squares estimator. So we would like to find estimates of β0 and β1 in this relationship in such a way that they minimize the squared error between y and its estimator. To differentiate between the actual values β0 and β1 and their estimators, we call the estimators a and b. So what we are trying to do is minimize the sum of squares SS = Σ (yi − a − b xi)². I guess you already know why we take the square: if we do not square, the positive and negative differences yi − a − b xi cancel each other, and their plain sum can be made 0 simply by passing the line through the mean. So the idea is that we square the distance in order to remove the sign of the difference between yi and a + b xi. And now we try to find the a and b which minimize the sum of squares SS; it is called SS because it is a sum of squares. The easiest way of doing this is by taking partial derivatives. So we take ∂SS/∂a, which is −2 Σ (yi − a − b xi), and equate it to 0; similarly we take ∂SS/∂b and equate it to 0. After a very simple bit of algebra you will find that a turns out to be ȳ − b x̄, and b, which looks a little complicated, turns out to be b = (Σ xi yi − n x̄ ȳ) / (Σ xi² − n x̄²). If you look at it very carefully, this comes very close to the correlation coefficient. How to derive it I leave to you; I think it is a good exercise to work through the simplification and get to this equation.
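As a quick check of these closed-form expressions, here is a minimal sketch in Python; the toy data and variable names are made up for illustration, not taken from the lecture:

```python
import numpy as np

# Toy data for illustration; any paired observations (x_i, y_i) would do.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# Least squares estimates from the formulas derived above:
# b = (sum x_i y_i - n x̄ ȳ) / (sum x_i^2 - n x̄^2),   a = ȳ - b x̄
b = (np.sum(x * y) - n * x_bar * y_bar) / (np.sum(x**2) - n * x_bar**2)
a = y_bar - b * x_bar

print(a, b)
# Cross-check against numpy's degree-1 polynomial fit (a straight line);
# np.polyfit returns [slope, intercept], which should match [b, a].
print(np.polyfit(x, y, 1))
```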
Now we come to the distribution of a and b. Remember that both of these quantities are computed using the values yi and ȳ; the xi and x̄ are given values, so they are not random variables. It is yi which is a random variable, and therefore a and b are now random variables, and we must know what their distributions are. As we had done in the past while working out estimation theory and hypothesis testing, we need to know the distribution of the random quantity which we are going to use as an estimator. So how do we find the distributions of a and b? First we make an assumption on the distribution of y. Remember that yi is defined as β0 + β1 xi + εi. So far our assumptions were only that the expected value of ε is 0 and the variance of ε is σ². Just to remind you, this σ² does not depend on the data index i: the relationship holds for i = 1 to n, but σ² is not dependent on i. Up to this point no distributional assumption has been made; no distribution has been assumed for ε so far. That assumption is being made now: we say, suppose εi is distributed as a normal distribution with mean 0 and common variance σ², for all i = 1, 2, ..., n. Why normal? Because it is a very well known fact that, by and large, errors are distributed normally. It is a very old story: it was Galileo who made so many observations of the stars and found that every time he made an observation there was a minute error, and some 200 years later it was Gauss who found that this error behaves as a perfect bell-shaped curve, which was called the Gaussian distribution and has since become known as the normal distribution. That is a side story, but assuming any error to be normally distributed is a natural step. So here we assume ε is normal with mean 0 and variance σ², and it implies that our yi, for i = 1, 2, ..., n, is also distributed normally, with mean β0 + β1 xi and variance σ².

Now we can find the expected value of b. Please recall the previous slide: the estimator a involves b, therefore first we must find the expected value of b and use it in finding the expected value of a. Remember that the xi are all constant, given values to us, so it is only yi which is a random variable. Writing b = Σ (xi − x̄) yi / (Σ xi² − n x̄²) and taking expectations, we get E[b] = Σ (xi − x̄) E[yi] / (Σ xi² − n x̄²), with E[yi] = β0 + β1 xi. Shall we do the simplification? Let us quickly do it. The numerator becomes Σ (xi β0 − x̄ β0) + Σ (β1 xi² − β1 xi x̄). Bringing the summation inside, Σ xi β0 = n x̄ β0 and Σ x̄ β0 = n x̄ β0, so the β0 terms cancel; and since Σ xi x̄ = n x̄², the β1 terms give β1 (Σ xi² − n x̄²), which is the same as β1 Σ (xi − x̄)². Therefore E[b] = β1 (Σ xi² − n x̄²) / (Σ xi² − n x̄²) = β1. And once you put this into the formula for a, it is very simple, because E[ȳ] is nothing but β0 + β1 x̄, and then you subtract β1 x̄ again, so E[a] = (β0 + β1 x̄) − β1 x̄ = β0. So for the distributions of a and b we find that E[b] = β1 and E[a] = β0.

Go back and think a little: because ε is assumed to be normal with mean 0 and variance σ², yi becomes normal with expected value β0 + β1 xi and variance σ². And you can see that the estimator b is a function of the yi with certain constants, and a, the estimator of β0, is also a function of the yi only, the rest being constants. You will find that these two random variables, a and b, are therefore also normally distributed, so all we need to know is their expected values and their variances. So next we are going to find the variance of b and the variance of a.

For the variance of b, you have to follow the same kind of calculation: Var(b) = Var( Σ (xi − x̄) yi / (Σ xi² − n x̄²) ), and when you take the variance, constants come out squared. What formula have we used here? If you recall, if a random variable X has variance σ², then the random variable cX has variance c²σ². So the denominator, which is only a multiplier (remember xi is a given value, a constant, not a random variable), comes out as its square. And another formula: if Var(X) = σ1² and Var(Y) = σ2², and X and Y are independent, or to be very precise uncorrelated, then Var(X + Y) = Var(X) + Var(Y). Using these formulas we can simplify this to Var(b) = Σ (xi − x̄)² Var(yi) / (Σ xi² − n x̄²)², and since Var(yi) = σ², it comes down to Var(b) = σ² / Sxx. This is a notation we would like to introduce here: Sxx = Σ xi² − n x̄², which is the same as Σ (xi − x̄)², with i running from 1 to n. The variance of a can be derived in a similar way, and it is found to be Var(a) = σ² Σ xi² / (n Sxx).
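These moment results can be sanity-checked by simulation. A minimal Monte Carlo sketch, assuming arbitrary true values for β0, β1, σ and a fixed design, all chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# True parameters and fixed design points; arbitrary illustrative choices.
beta0, beta1, sigma = 2.0, 0.5, 1.0
x = np.linspace(0.0, 10.0, 20)
n = len(x)
x_bar = x.mean()
Sxx = np.sum((x - x_bar) ** 2)   # Sxx = sum (x_i - x̄)^2, as defined above

b_draws, a_draws = [], []
for _ in range(20000):
    # y_i is normal with mean beta0 + beta1*x_i and common variance sigma^2.
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)
    b = np.sum((x - x_bar) * y) / Sxx    # least squares slope
    a = y.mean() - b * x_bar             # least squares intercept
    b_draws.append(b)
    a_draws.append(a)

# Empirical moments should match E[b]=beta1, Var(b)=sigma^2/Sxx,
# and E[a]=beta0, Var(a)=sigma^2 * sum(x_i^2) / (n * Sxx).
print(np.mean(b_draws), np.var(b_draws), sigma**2 / Sxx)
print(np.mean(a_draws), np.var(a_draws), sigma**2 * np.sum(x**2) / (n * Sxx))
```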
Now, one thing is important: there are actually three unknown parameters, β0, β1 and σ². We estimated β0 by a and we estimated β1 by b; what about σ²? This is the question we want to answer, and we are going to do that now. If yi is an observed value and a + b xi is the estimated value, then we define the residual ri as ri = yi − a − b xi, and the sum of squares of residuals as SSR = Σ ri². You can make out that yi is a normal random variable, and a + b xi is also a normal random variable, therefore the difference is also a normal random variable with mean 0, and therefore SSR/σ² will follow a chi-square distribution. The degrees of freedom will be n − 2, because we had n data points and we have already estimated two parameters, β0 and β1; therefore the degrees of freedom come to n − 2. So SSR/σ² follows chi-square with n − 2 degrees of freedom, and the expected value of SSR/(n − 2) is σ².

Finally, we introduce some notation here. Sxy is Σ (xi − x̄)(yi − ȳ); similarly, if I say Sxx, it means Σ (xi − x̄)²; and Syy is Σ (yi − ȳ)². In that case b, the estimate of β1, is Sxy/Sxx; a is estimated as ȳ − b x̄; and SSR, the sum of squares of residuals, is (Sxx Syy − Sxy²)/Sxx.

Now the distribution of the least squares parameters, under the assumption that the errors are distributed normally with variance σ²: the estimate of β0, which is a, is normally distributed with expected value β0 and variance σ² Σ xi² / (n Sxx); the estimate of β1, which is b, is normally distributed with expected value β1 and variance σ²/Sxx; and the sum of squares of residuals divided by σ² is distributed as chi-square with n − 2 degrees of freedom, and therefore you can write that E[SSR/(n − 2)] = σ². These are the two tables worth remembering and worth understanding; this is the crux of today's lecture.

So, in summary: we defined the concept of regression; in the case of the simple regression equation we estimated the regression coefficients through the least squares estimate, arrived at their expected values and variances, estimated the error variance, and introduced the commonly used notation for the least squares estimates of the regression coefficients and their distributions.
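To make this notation concrete, here is a short sketch computing Sxy, Sxx, Syy, the estimates, and the residual identity on toy data (the numbers are illustrative only, not from the lecture):

```python
import numpy as np

# Toy data, illustrative only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

# The notation introduced on the final slide:
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
Sxx = np.sum((x - x.mean()) ** 2)
Syy = np.sum((y - y.mean()) ** 2)

b = Sxy / Sxx                  # estimate of beta1
a = y.mean() - b * x.mean()    # estimate of beta0

# SSR via the identity (Sxx*Syy - Sxy^2)/Sxx, and directly from the
# residuals r_i = y_i - a - b*x_i; the two should agree.
SSR_identity = (Sxx * Syy - Sxy ** 2) / Sxx
SSR_direct = np.sum((y - a - b * x) ** 2)
print(SSR_identity, SSR_direct)

sigma2_hat = SSR_direct / (n - 2)   # unbiased: E[SSR/(n-2)] = sigma^2
print(sigma2_hat)
```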