So, today we will start a new module called transformations and weighting to correct model inadequacy. Here is the content of this module: it consists of variance stabilizing transformations, transformations to linearize the model, analytical methods to select a transformation and, finally, generalized and weighted least squares.
So, before I start this module, I want to talk about its objective. For this, first recall the simple linear regression model $y = \beta_0 + \beta_1 x + \varepsilon$, where $\varepsilon$ is the error term, and similarly the multiple linear regression model $Y = X\beta + \varepsilon$. While fitting the simple or the multiple linear regression model, we make some assumptions. The first one is that the error terms $\varepsilon_i$ have expected value 0 and variance $\sigma^2$, and that they are uncorrelated. We also assume that the error terms follow a normal distribution, that is, the $\varepsilon_i$ are independent and identically distributed $N(0, \sigma^2)$ random variables. This normality assumption is particularly required to test hypotheses on the regression coefficients and also to find confidence intervals for the regression coefficients.
Now, in the previous module, called model adequacy checking, we studied different techniques to check whether these basic assumptions are satisfied or not. The purpose of this module is to see how we can handle the situation if some of the assumptions are violated. So, here we will first recall the residual plot, which was particularly important for checking the basic assumptions, and then we will study how to handle the situation if some of the basic assumptions are not satisfied. So, first let me recall how we check the basic assumptions.
The residual plot is a very important tool to check whether the assumptions are correct or not. So, suppose you are given a set of $n$ observations $(x_i, y_i)$, where $y_i$ is the response variable and $x_i$ is the regressor variable. Then you know how to fit a simple linear regression model $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, and once you have this fitted model, you can find the residuals. The $i$th residual is $e_i = y_i - \hat{y}_i$, where $y_i$ is the observed response and $\hat{y}_i$ is the estimated response, so $e_i$ is the difference. More specifically, this is called the regular residual.
Now, what is the residual plot? The residual plot is the plot of the residuals $e_i$ against the fitted responses $\hat{y}_i$. So, here is the scatter plot of the residuals against the fitted response, and this is the line $e = 0$. If you see that the residuals are centered about the line $e = 0$ in a band of roughly constant width, then the model is a satisfactory model, and this sort of plot suggests that the assumption $\mathrm{Var}(\varepsilon) = \sigma^2$ is satisfied.
Now, look at this scatter plot here, and forget about these two lines. This pattern is called the outward open funnel: the spread of the residuals increases as $\hat{y}_i$ increases. If this occurs, then we conclude that the constant variance assumption is not correct, so we cannot assume that $\mathrm{Var}(\varepsilon_i) = \sigma^2$ for all $i$. What actually happens here is that the variance of $\varepsilon_i$ increases as $y$ increases. Now, instead of this outward open funnel, it could be an inward open funnel, which means the spread of the residuals $e_i$ decreases as $\hat{y}_i$ increases.
So, in that case also the constant variance assumption is not satisfied; in the case of the inward open funnel we say that the variance of $\varepsilon_i$ decreases as $y$ increases. The other situation is called the double bow. So, here you see the scatter plot of the residuals, and this sort of scatter plot occurs when the response variable $y$ is a proportion, that is, $y$ is between 0 and 1. This sort of scatter plot also indicates that the constant variance assumption is violated. And here is the fourth scatter plot. This sort of pattern is called non-linear, and it indicates that the relationship between the response variable $y$ and the regressor variable $x$ is not linear.
So, now if you see, using the residual plot or some other technique you learned in model adequacy checking, that the model assumptions are violated, and specifically that the constant variance assumption is violated, then we consider some transformation, either on the response variable or on the regressor variable, to make the variance constant. So, the usual approach to deal with inequality of variance is to apply a suitable transformation to the response variable or the regressor variable.
So, first we will talk about variance stabilizing transformations. What we assume is that $\mathrm{Var}(\varepsilon) = \sigma^2$; this is what is called the constant variance assumption. Now, if the constant variance assumption is violated, the cause is often that the response variable does not follow a normal distribution. So, let me take an example. Example one: suppose the response variable $y$ follows a Poisson distribution with parameter $\lambda$. Then we know that $E(y) = \lambda$ and also $\mathrm{Var}(y) = \lambda$.
So, here you see that the variance of the response variable is a function of its expectation, because $\lambda$ is nothing but $E(y)$. In this case, what we have to do is take some transformation of the response variable to make the variance constant. If you take the transformation $y' = \sqrt{y}$ and then regress $y'$ on $x$, it is not difficult to check that the variance of $\sqrt{y}$ is approximately independent of the mean.
So, the second example. Example two: suppose the response variable $y$ is a proportion between 0 and 1. When $y$ is a proportion between 0 and 1, we have seen that the residual plot follows the double bow pattern, and here we take the transformation $y' = \sin^{-1}(\sqrt{y})$ to make the variance of the response variable constant.
So, if you see that the constant variance assumption is violated, then most likely the response variable is not from the normal distribution; it follows some other distribution where the variance is a function of the mean. We have observed that if the response variable $y$ follows a Poisson distribution, then the transformation $y' = \sqrt{y}$ makes the variance approximately constant, and similarly, if $y$ is a proportion between 0 and 1, then we take the transformation $y' = \sin^{-1}(\sqrt{y})$. The question is how you decide which transformation to take. So, in this part on variance stabilizing transformations, we will talk about how to decide which transformation to take to make the variance constant.
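The two examples above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the seed, the sample size, the values of $\lambda$ and $p$, and the number of trials behind each proportion are made-up choices for illustration and do not come from the lecture. It simulates Poisson counts and binomial proportions and compares the variance before and after the transformation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (arbitrary) so the run is repeatable


def poisson_variances(lam, size=200_000):
    """Return (Var(Y), Var(sqrt(Y))) estimated from simulated Poisson(lam) draws."""
    y = rng.poisson(lam, size=size)
    return y.var(), np.sqrt(y).var()


def proportion_variances(p, n_trials=50, size=200_000):
    """Return (Var(Y), Var(arcsin(sqrt(Y)))) for Y = Binomial(n_trials, p) / n_trials."""
    y = rng.binomial(n_trials, p, size=size) / n_trials
    return y.var(), np.arcsin(np.sqrt(y)).var()


# Poisson case: Var(Y) grows with lambda, Var(sqrt(Y)) should barely move.
for lam in (5, 20, 80):
    raw, stabilized = poisson_variances(lam)
    print(f"Poisson lambda={lam:>3}: Var(Y)={raw:8.3f}  Var(sqrt Y)={stabilized:.3f}")

# Proportion case: Var(Y) depends on p, Var(arcsin(sqrt(Y))) should barely move.
for p in (0.2, 0.5, 0.8):
    raw, stabilized = proportion_variances(p)
    print(f"Proportion p={p}: Var(Y)={raw:.5f}  Var(asin sqrt Y)={stabilized:.5f}")
```

In a run of this kind, the raw variance should change a lot across the settings while the transformed variance stays roughly the same (about 1/4 for the Poisson case when $\lambda$ is not too small). The stabilization is approximate, so small differences across settings are expected.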
So, $y$ is the response variable, and $y$ has mean $\mu$ and variance $\sigma^2$, and the situation is that this variance is a function of $\mu$, say $\sigma^2 = g(\mu)$. That means the variance of the response variable $y$ is not constant; it depends on the mean. In the case of the Poisson distribution the variance was equal to the mean, so there $g$ is the identity function, $g(\mu) = \mu$. Depending on this $g$, we will try to find a transformation of $y$, call it $f(y)$, such that the variance of $f(y)$ is constant, that is, it does not depend on $\mu$. So, how do we find this transformation $f$? Let us call the transformed variable $U = f(Y)$.
Now, let me talk about the Taylor series. The Taylor series of a real or complex function $f(x)$ that is infinitely differentiable in a neighborhood of a real or complex number $a$ is $f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots$, where $f''(a)$ is the second derivative at $a$. So, the truncated Taylor series is a polynomial approximation of the function $f$ in a neighborhood of the point $a$.
So, here we are looking for a transformation $f$ of $y$, and we do not know at this moment what this function is. Let me just write, using the Taylor series, $f(y) \approx f(\mu) + \frac{f'(\mu)}{1!}(y - \mu)$. So, I am considering the Taylor series of $f(y)$ up to the first-order term, in the neighborhood of $\mu$, and I am ignoring the higher-order terms. Now, we want to make the variance of $U$ constant, so we will calculate the variance of $U$.
So, the variance of $U$, which is the variance of $f(Y)$, is approximately $[f'(\mu)]^2 \, \mathrm{Var}(Y)$. I hope you understand why: $f(\mu)$ and $f'(\mu)$ are constants, and the variance of $Y - \mu$ is nothing but the variance of $Y$. This variance of $Y$ is a function of $\mu$, so replacing $\mathrm{Var}(Y)$ by $g(\mu)$ we get $\mathrm{Var}(U) \approx [f'(\mu)]^2 g(\mu)$. Now, if we choose the function $f$ such that $[f'(\mu)]^2 = 1 / g(\mu)$, then the variance is equal to 1. So, if you can choose a function $f$ such that this is true, then you are done. This is equivalent to $f'(\mu) = [g(\mu)]^{-1/2}$. Since $g(\mu)$ is given, if you can find a function $f$ with this derivative, then the variance of the transformed random variable is $\mathrm{Var}(U) = \mathrm{Var}(f(Y)) \approx [f'(\mu)]^2 g(\mu) = 1$, which is constant.
So, let me give one example to illustrate this idea. Suppose the variance of $y$ is proportional to a power of the mean, $\sigma^2 = k \mu^q$. That means $g(\mu) = k \mu^q$. What we want is $f'(\mu) = [g(\mu)]^{-1/2}$, so $f'(\mu)$ is proportional to $\mu^{-q/2}$. Writing this in terms of $y$, we want a transformation $f$ such that $f'(y) \propto y^{-q/2}$.
So, now you can check that $f(y) \propto y^{1 - q/2}$ if $q \neq 2$, because if you take the derivative of this you get back $y^{-q/2}$ up to a constant, and $f(y) = \log y$ if $q = 2$. So, here is the transformation. What you see is that if the response variable has variance $\sigma^2$ which is a function of $\mu$, and that function is proportional to $\mu^q$, then the transformation you have to consider is this one.
Now, I will talk about several commonly used transformations, listing the relationship between $\sigma^2$ and $E(y)$ and the corresponding transformation. Suppose $\sigma^2$ is proportional to a constant. Then you do not need any transformation, because the constant variance assumption is satisfied. Suppose $\sigma^2$ is proportional to $E(y)$. That is the case where $y$ follows a Poisson distribution; the variance is proportional to $\mu$, which means $q = 1$. So, you put $q = 1$ here, and the transformation you have to take is $f(y) = y^{1/2}$, that is, the square root of $y$. Now, if $\sigma^2$ is proportional to $[E(y)]^2$, then $q = 2$ and $f(y) = \log y$; this is the case when, for example, $y$ follows an exponential distribution. If $\sigma^2$ is proportional to $[E(y)]^3$, then $q = 3$ and you can check that the transformation is $f(y) = y^{-1/2}$. Similarly, if $\sigma^2$ is proportional to $[E(y)]^4$, then the transformation you have to take on $y$ to make the variance constant is $f(y) = 1/y$.
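The rule $f(y) \propto y^{1 - q/2}$ (or $\log y$ when $q = 2$) can be written as one small function. This is a minimal sketch, assuming NumPy; the function name `stabilizing_transform`, the seed and the chosen means are hypothetical and only for illustration. As a check of the $q = 2$ case, it simulates exponential responses, whose variance equals the squared mean, and compares the variance before and after taking the logarithm.

```python
import numpy as np


def stabilizing_transform(y, q):
    """Apply f(y) = y**(1 - q/2), or log(y) when q == 2.

    Intended for a positive response whose variance is proportional to mean**q.
    """
    y = np.asarray(y, dtype=float)
    if q == 2:
        return np.log(y)
    return y ** (1.0 - q / 2.0)


rng = np.random.default_rng(1)  # arbitrary seed for repeatability

# Exponential response: Var(Y) = mean**2, so q = 2 and the rule gives log(y).
for mean in (1.0, 10.0, 100.0):
    y = rng.exponential(scale=mean, size=200_000)
    print(f"mean={mean:6.1f}  Var(Y)={y.var():10.2f}  "
          f"Var(log Y)={stabilizing_transform(y, q=2).var():.3f}")
```

With $q = 1, 3, 4$ the same function returns $\sqrt{y}$, $y^{-1/2}$ and $1/y$, matching the cases listed above; with $q = 0$ it leaves $y$ unchanged.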
I mean, if the function $g(\mu)$ is of this power form, you can find the transformation very easily. But suppose $\sigma^2$ is proportional to $E(y)[1 - E(y)]$; this is the case when $y$ is a proportion between 0 and 1. In this case $f(y) = \sin^{-1}(\sqrt{y})$. So, this is all about variance stabilizing transformations. The basic message of this technique is that if the constant variance assumption is violated, then most probably the response variable follows some distribution other than the normal distribution, like the Poisson distribution, or it might be a proportion. And here we learned the technique to transform the response variable to get constant variance when that assumption is violated.
So, next we will talk about transformations to linearize the model. What happens here is that, given a set of data with one response variable and one or several regressor variables, we assume a linear relationship between the response variable and the regressor variables. The assumption of a linear relationship is just a starting point; occasionally this assumption might not be correct. So, if the relationship between the response variable and the regressor variable is not linear, how do we detect that? The best technique to get some idea about the relationship is the scatter plot of the response variable against the regressor variable, or one can also go for the residual plot. So, we will talk about several non-linear relationships here, and there are some non-linear relationships between the regressor variable and the response variable which can be linearized easily by using a suitable transformation.
So, if the relationship between the response variable and the regressor variable is non-linear, the non-linearity may be detected by a scatter plot or a residual plot. Let me give one example. Example one: if the scatter plot of $y$ against $x$ suggests an exponential relationship between $x$ and $y$, then the appropriate model would be $y = \beta_0 e^{\beta_1 x}$. So, what it says is that, given a set of observations on the regressor variable and the response variable, you first draw the scatter plot, and if the scatter plot indicates that the relationship between $y$ and $x$ is exponential, then the appropriate model is $y = \beta_0 e^{\beta_1 x}$. This scatter plot shows the exponential relationship between $y$ and $x$ when $\beta_1$ is greater than 0. It could also look like this other plot, which also suggests an exponential relationship between the response variable and the regressor variable, but here $\beta_1$ is negative.
And if the relationship between $y$ and $x$ is exponential, then this model is in fact linearizable, because it is equivalent to the model $\log y = \log \beta_0 + \beta_1 x$. So, the transformation you are taking here is $y' = \log y$, that is all, and the final model is $y' = \beta_0' + \beta_1 x$, where $\beta_0' = \log \beta_0$. So, if you see that the relationship between the response variable and the regressor variable is exponential, then the appropriate model is the exponential one, and you can transform it to a linear model. Given the data $(x, y)$, you transform it to $(x, \log y)$ and then you fit a linear model using this transformed data.
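The steps just described (transform $y$ to $\log y$, fit a straight line, then recover $\beta_0$) can be sketched as follows. This is a minimal illustration assuming NumPy; the data are simulated, and the true parameter values, the noise level and the helper name `fit_exponential` are made up rather than taken from the lecture. The simulated errors are multiplicative, which is the case in which taking logarithms gives an ordinary linear model with additive error.

```python
import numpy as np


def fit_exponential(x, y):
    """Fit y = beta0 * exp(beta1 * x) by least squares on (x, log y).

    Requires y > 0. Returns (beta0_hat, beta1_hat).
    """
    # np.polyfit with deg=1 returns [slope, intercept] of the straight-line fit.
    slope, intercept = np.polyfit(x, np.log(y), deg=1)
    return np.exp(intercept), slope  # beta0 = exp(beta0'), beta1 = slope


# Simulated data (assumed values): beta0 = 2.0, beta1 = 0.5, multiplicative noise.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 5.0, 60)
y = 2.0 * np.exp(0.5 * x) * np.exp(rng.normal(0.0, 0.05, size=x.size))

beta0_hat, beta1_hat = fit_exponential(x, y)
print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```

The estimates should land close to the values used in the simulation. A negative `beta1_hat` would correspond to the decaying curve mentioned above.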
So, a function that can be linearized by using a suitable transformation is called a linearizable function. There are some functions which can be linearized very easily by a suitable transformation, and those are the linearizable functions.
Next, let me talk about some more examples. So, you are given the response variable and the regressor variable, and you draw the scatter plot. If you see that the scatter plot is centered about a straight line, then of course you go for a linear fit. But if you see that the scatter plot is centered about this curve, or about this other curve, then the relationship between the response variable and the regressor variable is a power relationship, not a linear one. The relationship is $y = \beta_0 x^{\beta_1}$. This curve is the case when $\beta_1$ is greater than 1, this one is the case when $\beta_1$ is equal to 1, and this one is the case when $\beta_1$ is less than 1 but greater than 0.
So, as I told you before, there are some non-linear relationships which can easily be transformed into linear form; these are the linearizable functions. This one is also a linearizable function, because you can easily make it linear by taking logarithms: $\log y = \log \beta_0 + \beta_1 \log x$. So, the transformations you are choosing here are $y' = \log y$ and $x' = \log x$, and the final model is $y' = \beta_0' + \beta_1 x'$, where $\beta_0' = \log \beta_0$. So, given the data $(x, y)$, you transform it to $(\log x, \log y)$, and if you draw the scatter plot for this transformed data, perhaps you will get a scatter plot centered about a straight line, and you can go for a linear fit.
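The log-log transformation for the power relationship can be sketched in the same way. This is a minimal illustration assuming NumPy; the simulated data, the true values 3.0 and 1.5, the noise level and the helper name `fit_power_law` are assumptions made for the example, not part of the lecture.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = beta0 * x**beta1 by least squares on (log x, log y).

    Requires x > 0 and y > 0. Returns (beta0_hat, beta1_hat).
    """
    # Straight-line fit of log y on log x: slope is beta1, intercept is log(beta0).
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return np.exp(intercept), slope


# Simulated data (assumed values): beta0 = 3.0, beta1 = 1.5, multiplicative noise.
rng = np.random.default_rng(3)
x = np.linspace(1.0, 10.0, 50)
y = 3.0 * x ** 1.5 * np.exp(rng.normal(0.0, 0.05, size=x.size))

beta0_hat, beta1_hat = fit_power_law(x, y)
print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```

Plotting $\log y$ against $\log x$ before fitting is a useful check: if the power model is reasonable, the transformed points should fall about a straight line.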
So, consider this scatter plot of $y$ against $x$. If you see a scatter plot similar to this, centered about this curve, then the relationship between $y$ and $x$ is $y = \beta_0 + \beta_1 \log x$; this curve is for $\beta_1$ greater than 0, and this other one is the same model for $\beta_1$ less than 0. It is very easy to see that this is a linearizable function, because you just take the transformation $x' = \log x$ and then the model becomes $y = \beta_0 + \beta_1 x'$. So, given the data $(x, y)$, you transform it to $(\log x, y)$ and fit this linear model.
I will give one more example and then stop here. Suppose your scatter plot is centered about this curve or this curve. Then the relationship between $y$ and $x$ is again a linearizable function; here $y = \frac{x}{\beta_0 x - \beta_1}$, where this curve is for $\beta_1$ greater than 0 and this one is for $\beta_1$ less than 0. You can transform this to a linear function by looking at $1/y$: $\frac{1}{y} = \frac{\beta_0 x - \beta_1}{x} = \beta_0 - \beta_1 \frac{1}{x}$. So, the transformation you are taking here is $y' = 1/y$ and $x' = 1/x$, and the final model is $y' = \beta_0 - \beta_1 x'$. So, what you have to do is this: given the data $(x, y)$, if you see that the scatter plot is similar to this one, then you transform the data to $(1/x, 1/y)$ and fit a straight-line model.
So, this is what we mean by a linearizable function. If you see that the relationship between the response variable and the regressor variable is not linear, and it is similar to one of these forms, you can take some easy transformation of the variables, and then the problem is equivalent to fitting a linear model between the transformed response variable and the transformed regressor variable. So, thank you for your attention.